From guido@CNRI.Reston.VA.US  Wed Dec  1 17:32:08 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 12:32:08 -0500
Subject: [Python-Dev] Another 1.6 wish
In-Reply-To: Your message of "Fri, 19 Nov 1999 14:59:11 CST."
 <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com>
References: <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com>
Message-ID: <199912011732.MAA10419@eric.cnri.reston.va.us>

> My first Python-Dev post.  :-)

Welcome!

> >We had some discussion a while back about enabling thread support by
> >default, if the underlying OS supports it obviously.  

I agree with this.  MacOS seems to be the only OS without threads
these days.

> What's the consensus about Python microthreads -- a likely candidate
> for incorporation in 1.6 (or later)?

What are microthreads?  If you think about threads implemented in the
Python VM instead of in the OS, forget it.

> Also, we have a couple minor convenience functions for Python in an 
> MSDEV environment, an exposure of OutputDebugString for writing to 
> the DevStudio log window and a means of tripping DevStudio C/C++ layer
> breakpoints from Python code (currently experimental).  The msvcrt 
> module seems like a likely candidate for these, would these be 
> welcome additions?

Sure -- send patches.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From petrilli@amber.org  Wed Dec  1 17:39:00 1999
From: petrilli@amber.org (Christopher Petrilli)
Date: Wed, 1 Dec 1999 12:39:00 -0500
Subject: [Python-Dev] Another 1.6 wish
In-Reply-To: <199912011732.MAA10419@eric.cnri.reston.va.us>; from guido@CNRI.Reston.VA.US on Wed, Dec 01, 1999 at 12:32:08PM -0500
References: <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com> <199912011732.MAA10419@eric.cnri.reston.va.us>
Message-ID: <19991201123900.A7419@trump.amber.org>

Guido van Rossum [guido@CNRI.Reston.VA.US] wrote:
> > >We had some discussion a while back about enabling thread support by
> > >default, if the underlying OS supports it obviously.  
> 
> I agree with this.  MacOS seems to be the only OS without threads
> these days.

I believe the new GUISI package has pthread-API compatible threads
implemented, which talk to the underlying ThreadManager.  With MacOSX
being impending before 1.6 (i.e. early 2000), I'd say this is a good
way to go.  Threads are VERY useful for a lot of problem domains.

Chris
-- 
| Christopher Petrilli
| petrilli@amber.org


From guido@CNRI.Reston.VA.US  Wed Dec  1 17:54:53 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 12:54:53 -0500
Subject: [Python-Dev] Another 1.6 wish
In-Reply-To: Your message of "Wed, 01 Dec 1999 12:39:00 EST."
 <19991201123900.A7419@trump.amber.org>
References: <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com> <199912011732.MAA10419@eric.cnri.reston.va.us>
 <19991201123900.A7419@trump.amber.org>
Message-ID: <199912011754.MAA10465@eric.cnri.reston.va.us>

> > I agree with this.  MacOS seems to be the only OS without threads
> > these days.
> 
> I believe the new GUISI package has pthread-API compatible threads
> implemented, which talk to the underlying ThreadManager.  With MacOSX
> being impending before 1.6 (i.e. early 2000), I'd say this is a good
> way to go.  Threads are VERY useful for a lot of problem domains.

What's GUISI?  The son of GUSI?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@CNRI.Reston.VA.US  Wed Dec  1 17:55:19 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 12:55:19 -0500
Subject: [Python-Dev] Another 1.6 wish
In-Reply-To: Your message of "Wed, 01 Dec 1999 12:32:08 EST."
 <199912011732.MAA10419@eric.cnri.reston.va.us>
References: <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com>
 <199912011732.MAA10419@eric.cnri.reston.va.us>
Message-ID: <199912011755.MAA10476@eric.cnri.reston.va.us>

> > Also, we have a couple minor convenience functions for Python in an 
> > MSDEV environment, an exposure of OutputDebugString for writing to 
> > the DevStudio log window and a means of tripping DevStudio C/C++ layer
> > breakpoints from Python code (currently experimental).  The msvcrt 
> > module seems like a likely candidate for these, would these be 
> > welcome additions?
> 
> Sure -- send patches.

I hadn't seen Mark Hammond's response -- I take it back.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@CNRI.Reston.VA.US  Wed Dec  1 18:15:26 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 13:15:26 -0500
Subject: [Python-Dev] Another 1.6 wish
In-Reply-To: Your message of "Sat, 20 Nov 1999 11:04:28 +1100."
 <005f01bf32ea$d0b82b90$0501a8c0@bobcat>
References: <005f01bf32ea$d0b82b90$0501a8c0@bobcat>
Message-ID: <199912011815.NAA10506@eric.cnri.reston.va.us>

> This is really a pointer to the fact that some or all of the win32api
> should be moved into the core - registry access is the thing people
> most want, but there are plenty of other useful things that people
> regularly use...
> 
> Guido objects to the coding style, but hopefully that won't be a big
> issue.  IMO, the coding style isn't "bad" - it is just more an "MS"
> flavour than a "Python" flavour - presumably people reading the code
> will have some experience with Windows, so it won't look completely
> foreign to them.  The good thing about taking it "as-is" is that it
> has been fairly well bashed on over a few years, so is really quite
> stable.  The final "coding style" issue is that there are no "doc
> strings" - all documentation is embedded in C comments, and extracted
> using a tool called "autoduck" (similar to "autodoc").  However, I'm
> sure we can arrange something there, too.

That's a good summary of the status quo.  I would appreciate it if
win32all could become part of the core.  However the coding style
issues need to be addressed (I also believe that it needs to be
compiled in C++ mode).  One concern that Mark doesn't mention is that
there are some safety issues -- you can abuse some of the calls to
cause segfaults, whether intentional or by mistake, and that's not a
good thing.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@CNRI.Reston.VA.US  Wed Dec  1 18:55:40 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 13:55:40 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Your message of "Wed, 24 Nov 1999 09:43:57 EST."
 <383BF9AD.E183FB98@interet.com>
References: <Pine.LNX.4.10.9911231549120.10639-100000@nebula.lyra.org>
 <383BF9AD.E183FB98@interet.com>
Message-ID: <199912011855.NAA10662@eric.cnri.reston.va.us>

> I would like to argue that on Windows, import of dynamic libraries is
> broken.  If a file something.pyd is imported, then sys.path is searched
> to find the module.  If a file something.dll is imported, the same thing
> happens.  But Windows defines its own search order for *.dll files which
> Python ignores.  I would suggest that this is wrong for files named
> *.dll, but OK for files named *.pyd.

I think you misunderstand some of the issues.

Python cannot import every .dll file.  Only .dll files that conform to
the convention for Python extension modules can be imported.  (The
convention is that it must export an init<module> function.)

On most other platforms, shared libraries must have a specific
extension (e.g. .so on most Unix).  Python allows you to drop such a
file into any directory where it looks for modules, and it will then
direct the dynamic load support to load that specific file.

This seems logical -- Python extensions must live in directories that
Python searches (Python must do its own search because the search
order is significant).

On Windows, Python uses the same strategy.  The only modification is
that it is allowed to give the file a different extension, namely
.pyd, to indicate that this really is a Python extension and not a
regular DLL.  This was mostly introduced because it is apparently
common to have an existing DLL "foo.dll" and write a Python wrapper
for it that is also called "foo".  Clearly, two files foo.dll are too
confusing, so we let you name the wrapper foo.pyd.  But because the
file format is essentially that of a DLL, we don't *require* this
renaming; some ways of creating DLLs in the first place may make it
difficult to do.

> A SysAdmin should be able to install and maintain *.dll as she has
> been trained to do.  This makes maintaining Python installations
> simpler and more un-surprising.

I don't see that a SysAdmin needs to do much DLL management.  This is
up to installer scripts.  Anyway how hard can it be for a SysAdmin to
leave DLLs in specific directories alone?

> I have no solution to the backward compatibility problem.  But the
> code is only a couple lines.  A LoadLibrary() call does its own
> path searching.

But at what point should this LoadLibrary() call be called?  The
import statement contains no clue that a DLL is requested -- the
sys.path search reveals that.

I claim that there is nothing wrong with the current strategy.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bwarsaw@cnri.reston.va.us (Barry A. Warsaw)  Wed Dec  1 19:01:12 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw)
Date: Wed, 1 Dec 1999 14:01:12 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS log messages with diffs
References: <199911161700.MAA02716@eric.cnri.reston.va.us>
 <14389.31511.706588.20840@anthem.cnri.reston.va.us>
Message-ID: <14405.28792.184298.298597@anthem.cnri.reston.va.us>

>>>>> "BAW" == Barry A Warsaw <bwarsaw@cnri.reston.va.us> writes:

    BAW> There was a suggestion to start augmenting the checkin emails
    BAW> to include the diffs of the checkin.  This would let you keep
    BAW> a current snapshot of the tree without having to do a direct
    BAW> `cvs update'.

The voting has stopped, with the "yeah" vote slightly ahead of the
"nay" vote.  We'll go with context diffs, and we'll be implementing
Greg Stein's approach with the xml-checkins list: truncating diffs to
H number of lines at the top and T number of lines at the bottom, so
as not to overwhelm incoming email.
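
A minimal sketch of the head/tail truncation described above, assuming
the diff is already available as a list of lines.  The function name,
the default values for H and T, and the wording of the marker line are
all hypothetical, not taken from the actual checkin script:

```python
def truncate_diff(lines, head=3, tail=2):
    """Keep the first `head` and last `tail` lines of a diff.

    Everything in between is replaced by a single marker line that
    says how many lines were dropped.  Short diffs pass through
    unchanged.
    """
    if len(lines) <= head + tail:
        return list(lines)
    omitted = len(lines) - head - tail
    marker = "[...%d lines suppressed...]" % omitted
    # lines[-0:] would return the whole list, so guard the tail == 0 case.
    kept_tail = list(lines[-tail:]) if tail else []
    return list(lines[:head]) + [marker] + kept_tail
```

With real checkin mail, H and T would presumably be much larger than
the defaults used here.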

I'll try to get this going sometime today (no promises).  You'll
likely see a number of tests coming through python-checkins in the
meantime.  I'll send a message out when it's done.

-Barry


From da@ski.org  Wed Dec  1 19:34:56 1999
From: da@ski.org (David Ascher)
Date: Wed, 1 Dec 1999 11:34:56 -0800 (Pacific Standard Time)
Subject: [Python-Dev] Re: [C++-SIG] Python calling C++ issues
In-Reply-To: <14405.25141.297349.76968@gargle.gargle.HOWL>
Message-ID: <Pine.WNT.4.04.9912011132130.222-100000@rigoletto.ski.org>

On Wed, 1 Dec 1999, Geoffrey Furnish wrote:

[...]

> Well, like I said above, I haven't analyzed your posts for technical
> details, so I can't say whether you made avoidable mistakes.  But I
> definitely do agree with you that it is roughly 100 times harder than
> it needs to be, to use Python from C++.  The charter of this sig is to 
> fix that, by developing the additional software that would allow
> Python's compiled interface to be exploited from C++ "with ease".
> 
> The first and most basic issue, is compiling Python so it initializes
> C++ global objects correctly.  There is a patch on the sig's www site
> to help with that.

Any opinions from this esteemed body re: integrating said patch in the
main tree?

--david


From jim@interet.com  Wed Dec  1 19:47:14 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Wed, 01 Dec 1999 14:47:14 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9911231549120.10639-100000@nebula.lyra.org>
 <383BF9AD.E183FB98@interet.com> <199912011855.NAA10662@eric.cnri.reston.va.us>
Message-ID: <38457B42.85552AC@interet.com>

Guido van Rossum wrote:
> 
> > I would like to argue that on Windows, import of dynamic libraries is
> > broken.  If a file something.pyd is imported, then sys.path is searched
> > to find the module.  If a file something.dll is imported, the same thing
> > happens.  But Windows defines its own search order for *.dll files which
> > Python ignores.  I would suggest that this is wrong for files named
> > *.dll, but OK for files named *.pyd.
> 
> I think you misunderstand some of the issues.
> 
> Python cannot import every .dll file.  Only .dll files that conform to
> the convention for Python extension modules can be imported.  (The
> convention is that it must export an init<module> function.)

Of course I meant that the test is LoadLibrary(module) followed
by GetProcAddress(h, "init" + module).  Both must succeed.

> This seems logical -- Python extensions must live in directories that
> Python searches (Python must do its own search because the search
> order is significant).

The PYTHONPATH search path is what I am trying to get away
from.  If I eliminate PYTHONPATH I still can not use the
Windows DLL search path (which is superior) because DLLs
are searched on PYTHONPATH too; thus my post.  I don't believe
it is important for Python module.dll to be located on PYTHONPATH.

> > A SysAdmin should be able to install and maintain *.dll as she has
> > been trained to do.  This makes maintaining Python installations
> > simpler and more un-surprising.
> 
> I don't see that a SysAdmin needs to do much DLL management.  This is
> up to installer scripts.  Anyway how hard can it be for a SysAdmin to
> leave DLLs in specific directories alone?

The problem is maintaining PYTHONPATH plus having DLL's on a
non-standard search path.  Yes, PythonDev[:] and professional
SysAdmins can do it.  But it is not as simple as it could be.
Someone has to write the install scripts.  And what if something
doesn't work?  Think of Python being used as a teaching language
for the 8th grade.  Think of the 8th grade teacher trying to get
all this right.  The only thing that works is simplicity.

> But at what point should this LoadLibrary() call be called?  The
> import statement contains no clue that a DLL is requested -- the
> sys.path search reveals that.

Just after built-in and frozen modules.

> I claim that there is nothing wrong with the current strategy.

Thank you for thoughtfully considering and commenting at length
on this issue.  Let's ignore it for the moment.  The other
problems with PYTHONPATH are more pressing.  But if those
issues are solved, this one will stick out.

JimA


From da@ski.org  Wed Dec  1 19:59:44 1999
From: da@ski.org (David Ascher)
Date: Wed, 1 Dec 1999 11:59:44 -0800 (Pacific Standard Time)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <38457B42.85552AC@interet.com>
Message-ID: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org>

On Wed, 1 Dec 1999, James C. Ahlstrom wrote:

> > This seems logical -- Python extensions must live in directories that
> > Python searches (Python must do its own search because the search
> > order is significant).
> 
> The PYTHONPATH search path is what I am trying to get away
> from.  If I eliminate PYTHONPATH I still can not use the
> Windows DLL search path (which is superior) because DLLs
> are searched on PYTHONPATH too; thus my post.  I don't believe
> it is important for Python module.dll to be located on PYTHONPATH.

Why is the DLL search path superior?  

In my experience, the DLL search path (PATH for short) is problematic
because it requires either using the System control panel or modifying
autoexec.bat, both of which can have massive systemic effects completely
unrelated to Python if a mistake is made during the modification.

On UNIX, the equivalent to Windows' PATH is typically LD_LIBRARY_PATH,
although I think there are significant variations in how that works across
platforms.  Most beginning unix users have no idea how to modify their
LD_LIBRARY_PATH, as they typically don't understand the configuration
mechanisms on Unix (system vs. user-specific, login vs. shell-specific,
different shell configuration languages, etc.).

I know it's not what you had in mind, but have you tried doing something
like:

  import sys, os, string
  sys.path.extend(string.split(os.environ['PATH'], ';'))

--david


From gmcm@hypernet.com  Wed Dec  1 20:19:13 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Wed, 1 Dec 1999 15:19:13 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org>
References: <38457B42.85552AC@interet.com>
Message-ID: <1268042932-41354568@hypernet.com>

David Ascher wrote:
> On Wed, 1 Dec 1999, James C. Ahlstrom wrote:
> 
> > > This seems logical -- Python extensions must live in
> > > directories that Python searches (Python must do its own
> > > search because the search order is significant).
> > 
> > The PYTHONPATH search path is what I am trying to get away
> > from.  If I eliminate PYTHONPATH I still can not use the
> > Windows DLL search path (which is superior) because DLLs are
> > searched on PYTHONPATH too; thus my post.  I don't believe it
> > is important for Python module.dll to be located on PYTHONPATH.
> 
> Why is the DLL search path superior?  
> 
> In my experience, the DLL search path (PATH for short) 

Make that:
 [ os.path.dirname(sys.executable),
   os.getcwd(),
   win32api.GetSystemDirectory(),
   os.path.join(win32api.GetSystemDirectory(), '../SYSTEM'), 
   win32api.GetWindowsDirectory()
 ] + string.split(os.environ['PATH'], ';')

> is
> problematic because it requires either using the System control
> panel or modifying autoexec.bat, both of which can have massive
> systemic effects completely unrelated to Python if a mistake is
> made during the modification.

Hear, hear!

[snip]


- Gordon


From jim@interet.com  Wed Dec  1 20:36:04 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Wed, 01 Dec 1999 15:36:04 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org>
Message-ID: <384586B4.48905B32@interet.com>

David Ascher wrote:

> Why is the DLL search path superior?
> 
> In my experience, the DLL search path (PATH for short) is problematic
> because it requires either using the System control panel or modifying
> autoexec.bat, both of which can have massive systemic effects completely
> unrelated to Python if a mistake is made during the modification.

I agree that altering PATH is problematic.  So is altering PYTHONPATH
and for exactly the same reason.  That is why I think PYTHONPATH is
a bad idea.

The reason the DLL search path is superior is that it is not just PATH.
It defines a path which includes the install directory of the
application plus the system directories, and this path is discovered
at runtime.  So it is not necessary to set a global PYTHONPATH, nor
make registry entries, nor do anything at all.  It Just Works.

The Windows DLL search path is:

1) The directory of the executable program.  That means you can just
   throw all your DLL's in with the *.exe's, and it all Just Works.

2) The current directory.  Also useful.

3) The Windows system directory (call GetSystemDirectory() to get this).
4) The Windows directory (call GetWindowsDirectory() to get this).

   These two directories are used for system files.  Think of /sbin,
   /bin.  Windows apps usually throw some of their DLL's here,
   especially if they are of general interest.

5) The directories in PATH.  This is relatively useless, and AFAIK it
   is seldom used in a real installation.  It is a left-over from DOS.
   That is also why it appears last.
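
The five steps above can be sketched as a pure function.  This is only
an illustration of the ordering as listed in this message: the function
names are hypothetical, the directories are passed in as arguments
rather than queried from Windows, and ntpath is used so the sketch
behaves the same on any platform:

```python
import ntpath


def dll_search_order(exe_dir, cwd, system_dir, windows_dir, path_value):
    """Return directories in the order listed above (steps 1 to 5)."""
    order = [exe_dir, cwd, system_dir, windows_dir]
    # Step 5: the PATH entries come last; empty entries are skipped.
    order.extend(d for d in path_value.split(";") if d)
    return order


def find_dll(name, order, exists):
    """Return the first candidate path for `name` that exists, else None.

    `exists` is any callable taking a path and returning a bool, for
    example os.path.exists on a real Windows machine.
    """
    for directory in order:
        candidate = ntpath.join(directory, name)
        if exists(candidate):
            return candidate
    return None
```

Passing the existence check in as a parameter keeps the ordering logic
separate from the file system, so it can be tried out without a
Windows installation.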

> On UNIX, the equivalent to Windows' PATH is typically LD_LIBRARY_PATH,
> although I think there are significant variations in how that works across
> platforms.  Most beginning unix users have no idea how to modify their
> LD_LIBRARY_PATH, as they typically don't understand the configuration
> mechanisms on Unix (system vs. user-specific, login vs. shell-specific,
> different shell configuration languages, etc.).

I agree.

> 
> I know it's not what you had in mind, but have you tried doing something
> like:
> 
>   import sys, os, string
>   sys.path.extend(string.split(os.environ['PATH'], ';'))

Adding PATH (or anything else) to PYTHONPATH is making it worse.  Have
you tried "import sys; print sys.path" on Windows?  It is junk.

JimA


From jim@interet.com  Wed Dec  1 20:44:00 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Wed, 01 Dec 1999 15:44:00 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <38457B42.85552AC@interet.com> <1268042932-41354568@hypernet.com>
Message-ID: <38458890.BCB36FE2@interet.com>

Gordon McMillan wrote:

> Make that:
>  [ os.path.dirname(sys.executable),
>    os.getcwd(),
>    win32api.GetSystemDirectory(),
>    os.path.join(win32api.GetSystemDirectory(), '../SYSTEM'),
>    win32api.GetWindowsDirectory()
>  ] + string.split(os.environ['PATH'], ';')

Very nice!  "../SYSTEM" needed on NT I guess.

JimA


From fredrik@pythonware.com  Wed Dec  1 20:56:16 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 1 Dec 1999 21:56:16 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org> <384586B4.48905B32@interet.com>
Message-ID: <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com>

James C. Ahlstrom <jim@interet.com> wrote:
> Adding PATH (or anything else) to PYTHONPATH is making it worse.  Have
> you tried "import sys; print sys.path" on Windows?  It is junk.

not on my machine.

it would help if you stopped assuming that everyone
has the same problems as you have.  we've
distributed several python apps on windows, and
frankly, I don't understand what you're talking
about.

</F>


From jim@interet.com  Wed Dec  1 21:26:37 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Wed, 01 Dec 1999 16:26:37 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org> <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com>
Message-ID: <3845928D.C0462322@interet.com>

Fredrik Lundh wrote:

> > you tried "import sys; print sys.path" on Windows?  It is junk.
> 
> not on my machine.

On my Windows machine I get:

['', '.', 'N:/prd/winlease/vest', '.\\DLLs', '.\\lib',
  '.\\lib\\plat-win', '.\\lib\\lib-tk', 'f:\\bin']

PYTHONPATH is N:/prd/winlease/vest.
os.path.dirname(sys.executable) is F:/bin.
The others are junk.  What do you get?  Did
you change sys.path from the default?

> it would help if you stopped assuming that everyone
> has the same problems as you have.  we've
> distributed several python apps on windows, and
> frankly, I don't understand what you're talking
> about.

We distribute our app by freezing all *.py files
into a DLL, and we don't set PYTHONPATH on the
target machine.  The files are located with the
executable file and are found there.  This works
fine and we don't have a problem with it.

It would help me a lot if you could describe how you
distribute your app.  Do you set PYTHONPATH on the
target machine?

JimA


From da@ski.org  Wed Dec  1 21:41:31 1999
From: da@ski.org (David Ascher)
Date: Wed, 1 Dec 1999 13:41:31 -0800 (Pacific Standard Time)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <384586B4.48905B32@interet.com>
Message-ID: <Pine.WNT.4.04.9912011251250.254-100000@rigoletto.ski.org>

On Wed, 1 Dec 1999, James C. Ahlstrom wrote:

> > In my experience, the DLL search path (PATH for short) is problematic
> > because it requires either using the System control panel or modifying
> > autoexec.bat, both of which can have massive systemic effects completely
> > unrelated to Python if a mistake is made during the modification.
> 
> I agree that altering PATH is problematic.  So is altering PYTHONPATH
> and for exactly the same reason.  That is why I think PYTHONPATH is
> a bad idea.

I see.  Thanks for the explanation. I didn't know the complete story of
the "Windows DLL search path".  BTW, I think a huge difference b/w
PYTHONPATH and PATH is the system-wide nature of PATH, vs. the
Python-restricted nature of PYTHONPATH.

--david


From mhammond@skippinet.com.au  Wed Dec  1 22:29:38 1999
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Thu, 2 Dec 1999 09:29:38 +1100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <Pine.WNT.4.04.9912011251250.254-100000@rigoletto.ski.org>
Message-ID: <009c01bf3c4b$8f119090$0501a8c0@bobcat>

> I see.  Thanks for the explanation. I didn't know the
> complete story of
> the "Windows DLL search path".  BTW, I think a huge difference b/w
> PYTHONPATH and PATH is the system-wide nature of PATH, vs. the
> Python-restricted nature of PYTHONPATH.

And more to the point - and the critical distinction - is that
PYTHONPATH is actually specific to the Python _app_, not just Python
on the machine.

Sure - the standard Python installation puts a "default" PYTHONPATH
suitable for general purpose development - but any distributed
application _can_ define their own PYTHONPATH that is independent of
any other Python systems or applications.  People have been doing this
for years, including MS :-)

Sorry Jim, but count this as another vote against it - which isn't to
argue that the current system is perfect, simply (IMO) better than the
Windows path and DLL search order.

Mark.


From guido@CNRI.Reston.VA.US  Wed Dec  1 23:00:21 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 18:00:21 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Your message of "Wed, 01 Dec 1999 16:26:37 EST."
 <3845928D.C0462322@interet.com>
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org> <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com>
 <3845928D.C0462322@interet.com>
Message-ID: <199912012300.SAA10861@eric.cnri.reston.va.us>

> Fredrik Lundh wrote:
> 
> > > you tried "import sys; print sys.path" on Windows?  It is junk.
> > 
> > not on my machine.
> 
> On my Windows machine I get:
> 
> ['', '.', 'N:/prd/winlease/vest', '.\\DLLs', '.\\lib',
>   '.\\lib\\plat-win', '.\\lib\\lib-tk', 'f:\\bin']
> 
> PYTHONPATH is N:/prd/winlease/vest.
> os.path.dirname(sys.executable) is F:/bin.
> The others are junk.  What do you get?  Did
> you change sys.path from the default?

You must not have used the standard Python installer; if you had used
it you wouldn't have had this problem (and perhaps we wouldn't have
had this discussion).

The problem is that you apparently have installed python.exe in
f:\bin.  "Modern" Python versions execute some code at startup that
comes up with a suitable value for sys.path; the Windows version of
this code is in PC/getpathp.c -- I recommend that you study it.  This
code tries to find the Python install directory by looking for a
"landmark" file relative to the executable path, and then adds a bunch
of directory entries to the path relative to the install directory.
If it fails, it defaults to "." for the install directory.  The
entries '.\\DLLs', '.\\lib', '.\\lib\\plat-win', '.\\lib\\lib-tk' are
all a result of this failing.
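
A rough sketch of the landmark idea, not a transcription of
PC/getpathp.c.  The landmark file name, the walk up through parent
directories, and both function names are assumptions made for
illustration; only the fallback to "." and the four relative entries
come from the description above:

```python
import ntpath

# Hypothetical landmark; the real name is whatever getpathp.c defines.
LANDMARK = ntpath.join("lib", "string.py")


def find_install_dir(executable, exists, landmark=LANDMARK):
    """Guess the install directory from the executable's location.

    Starts in the executable's directory and walks up through its
    parents (an assumption) until the landmark file is found.
    Returns "." when the search fails.
    """
    directory = ntpath.dirname(executable)
    while directory:
        if exists(ntpath.join(directory, landmark)):
            return directory
        parent = ntpath.dirname(directory)
        if parent == directory:      # reached the drive root
            break
        directory = parent
    return "."


def default_sys_path(install_dir):
    """Build the directory entries relative to the install directory."""
    subdirs = ("DLLs", "lib",
               ntpath.join("lib", "plat-win"),
               ntpath.join("lib", "lib-tk"))
    return [ntpath.join(install_dir, sub) for sub in subdirs]
```

With python.exe copied to f:\bin and no landmark nearby, the search
fails, the install directory becomes ".", and default_sys_path(".")
yields the '.\\DLLs', '.\\lib', ... entries shown in the quoted
sys.path.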

As long as this works, there is no need for the user (or anyone) to
ever set the PYTHONPATH variable -- that variable is only needed to
add directories in front of sys.path for stuff that getpathp.c doesn't
know about (e.g. PIL, Numeric, etc.).  With packagized versions of
those modules, even that won't be necessary, because the packages will
be dropped in the Python install directory (typically C:\Program
Files\Python).

I believe that most of your desire to get rid of PYTHONPATH comes from
your insistence to bypass the default installer.  There's probably a
way to install your app in such a way that the getpathp.c algorithm
actually succeeds?  There's also a separate env variable, PYTHONHOME,
which overrides the Python install directory; if getpathp.c sees that
it is set, it will bypass the search relative to the executable's
path.

I take blame for not documenting all this well enough.  However I wish
you stopped criticizing the design -- I think the design is quite
solid.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@CNRI.Reston.VA.US  Wed Dec  1 23:09:43 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 18:09:43 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Your message of "Wed, 01 Dec 1999 14:47:14 EST."
 <38457B42.85552AC@interet.com>
References: <Pine.LNX.4.10.9911231549120.10639-100000@nebula.lyra.org> <383BF9AD.E183FB98@interet.com> <199912011855.NAA10662@eric.cnri.reston.va.us>
 <38457B42.85552AC@interet.com>
Message-ID: <199912012309.SAA10873@eric.cnri.reston.va.us>

> > This seems logical -- Python extensions must live in directories that
> > Python searches (Python must do its own search because the search
> > order is significant).
> 
> The PYTHONPATH search path is what I am trying to get away
> from.  If I eliminate PYTHONPATH I still can not use the
> Windows DLL search path (which is superior) because DLLs
> are searched on PYTHONPATH too; thus my post.  I don't believe
> it is important for Python module.dll to be located on PYTHONPATH.

But I do.

First of all, I'm not sure whether you're talking here about sys.path
or PYTHONPATH.  As I explained in a previous post, you should normally
not have to set PYTHONPATH at all.  Let's assume you really meant
sys.path.

Let's assume sys.path is [A, B].  Let's assume there's a foo.py and a
foo.dll.  If foo.py lives in A and foo.dll lives in B, then import foo
should load foo.py.  If it's the other way around, it should load
foo.dll.  If we were to use the default DLL search path, there's no
way that we can get this behavior: either you have to look for a DLL
first, which means there's no way for foo.py to override foo.dll, or
you have to look for a DLL last, and then there's no way for a foo.dll
to override foo.py.  It is desirable that both overrides are possible:
we want to be able to have foo.dll override foo.py, because perhaps
foo.py should only be used when for some reason foo.dll can't be
loaded (say foo.py does the same thing only slower); but we also want
to be able to have foo.py override foo.dll (by simply placing it in a
directory that's earlier on the path) e.g. in a situation where the
dll version does something undesirable and we want to create a safe
substitute.  (Deleting files is not always an option.)
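
The override behavior in this example can be sketched as follows.  This
is a simplified model of a path-ordered search, not the interpreter's
import code; the function name and the suffix order within a single
directory are assumptions:

```python
import os

# Order in which suffixes are tried inside one directory (assumed).
SUFFIXES = (".pyd", ".dll", ".py")


def find_module(name, path, exists=os.path.exists):
    """Return the first file for `name` found along `path`, else None.

    Directories are searched in order, so whichever directory comes
    first wins, regardless of whether it holds a .py or a .dll.
    """
    for directory in path:
        for suffix in SUFFIXES:
            candidate = os.path.join(directory, name + suffix)
            if exists(candidate):
                return candidate
    return None
```

With foo.py in A and foo.dll in B, a path of [A, B] finds foo.py and a
path of [B, A] finds foo.dll.  A separate DLL search done before or
after this loop could only ever give one of those two outcomes.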

> The problem is maintaining PYTHONPATH plus having DLL's on a
> non-standard search path.

I've commented already that PYTHONPATH maintenance is probably a red
herring due to your non-standard install.  I'm not sure what the
problem is with having a DLL on a non-std path?

> Yes, PythonDev[:] and professional
> SysAdmins can do it.  But it is not as simple as it could be.
> Someone has to write the install scripts.

The distutil-sig (a.k.a. Greg Ward :-) is taking care of this as we
speak.

> And what if something
> doesn't work?  Think of Python being used as a teaching language
> for the 8th grade.  Think of the 8th grade teacher trying to get
> all this right.  The only thing that works is simplicity.

We will provide an installer that Just Works [tm].

> > But at what point should this LoadLibrary() call be called?  The
> > import statement contains no clue that a DLL is requested -- the
> > sys.path search reveals that.
> 
> Just after built-in and frozen modules.

See my long comment above.

> > I claim that there is nothing wrong with the current strategy.
> 
> Thank you for thoughtfully considering and commenting at length
> on this issue.  Let's ignore it for the moment.  The other
> problems with PYTHONPATH are more pressing.  But if those
> issues are solved, this one will stick out.

And those other issues should be resolved in a different way than what
you have been proposing.  See other post.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@CNRI.Reston.VA.US  Wed Dec  1 23:11:28 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 18:11:28 -0500
Subject: [Python-Dev] Re: [C++-SIG] Python calling C++ issues
In-Reply-To: Your message of "Wed, 01 Dec 1999 11:34:56 PST."
 <Pine.WNT.4.04.9912011132130.222-100000@rigoletto.ski.org>
References: <Pine.WNT.4.04.9912011132130.222-100000@rigoletto.ski.org>
Message-ID: <199912012311.SAA10888@eric.cnri.reston.va.us>

> > The first and most basic issue, is compiling Python so it initializes
> > C++ global objects correctly.  There is a patch on the sig's www site
> > to help with that.
> 
> Any opinions from this esteemed body re: integrating said patch in the
> main tree?

I presume you meant me :-)

I'll give it a try tonight.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jeremy@cnri.reston.va.us  Wed Dec  1 23:24:06 1999
From: jeremy@cnri.reston.va.us (Jeremy Hylton)
Date: Wed, 1 Dec 1999 18:24:06 -0500 (EST)
Subject: [Python-Dev] copies of python-dev messages between 11/24 and 12/01
Message-ID: <14405.44566.832799.96438@goon.cnri.reston.va.us>

It looks like there has been some mail glitch that resulted in no
digests being sent between 11/26 and 12/01 and no messages being
archived between 11/24 and 12/01.  Does anyone keep a personal archive
that has those messages?  I'd like to read them.

Jeremy


From guido@CNRI.Reston.VA.US  Wed Dec  1 23:28:14 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 18:28:14 -0500
Subject: [Python-Dev] copies of python-dev messages between 11/24 and 12/01
In-Reply-To: Your message of "Wed, 01 Dec 1999 18:24:06 EST."
 <14405.44566.832799.96438@goon.cnri.reston.va.us>
References: <14405.44566.832799.96438@goon.cnri.reston.va.us>
Message-ID: <199912012328.SAA12879@eric.cnri.reston.va.us>

> It looks like there has been some mail glitch that resulted in no
> digests being sent between 11/26 and 12/01 and no messages being
> archived between 11/24 and 12/01.  Does anyone keep a personal archive
> that has those messages?  I'd like to read them.

I do :-)

I'll provide Jeremy with an archive.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bwarsaw@cnri.reston.va.us  Thu Dec  2 04:24:03 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw)
Date: Wed, 1 Dec 1999 23:24:03 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS log messages with diffs
References: <199911161700.MAA02716@eric.cnri.reston.va.us>
 <14389.31511.706588.20840@anthem.cnri.reston.va.us>
Message-ID: <14405.62563.345566.500106@anthem.cnri.reston.va.us>

Okay folks, I think I've got the diff thing working now.  The trick
(for you CVS heads) was that you can't do a `cvs diff' while you're
executing a loginfo script.  Lock contention (repeat after me: "I Love
CVS!").  Anyway, let's see how you all like it.

Note that based on a suggestion by Greg Stein, seconded by GvR, I do
not send out the entire diff of every file (which could potentially be
huge).  I send out 20 lines from the head of the diff and 20 lines
from the tail, and suppress everything in between.  Those numbers can
be easily tweaked, and I'm not sure what the ideal is.  Let's see what
the emails look like when stuff starts getting checked in.

Enjoy,
-Barry


From jack@oratrix.nl  Thu Dec  2 11:00:45 1999
From: jack@oratrix.nl (Jack Jansen)
Date: Thu, 02 Dec 1999 12:00:45 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Message by Guido van Rossum <guido@CNRI.Reston.VA.US> ,
 Wed, 01 Dec 1999 18:09:43 -0500 , <199912012309.SAA10873@eric.cnri.reston.va.us>
Message-ID: <19991202110045.96F33370CF2@snelboot.oratrix.nl>

On the Mac I've introduced "magic cookies" into sys.path, which allow you to 
do interesting searches (like searching for a DLL or PYC-resource in the 
application itself) at known places in the import process.

There isn't a cookie for "search along the standard MacOS dll search path" 
(which is somewhat similar to the Windows dll search path) because I haven't 
seen a reason for it, but there's nothing to stop it. And if you'd insert that 
cookie it would be perfectly clear (at least, it should be) that only dll 
modules will be found in that step, not .py modules.

Actually I'm so happy with the magic cookie scheme that I've advocated at 
various times in the past that something similar also be used for determining 
where builtin modules and frozen modules appear in sys.path...
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From guido@CNRI.Reston.VA.US  Thu Dec  2 11:59:34 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 02 Dec 1999 06:59:34 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Your message of "Thu, 02 Dec 1999 12:00:45 +0100."
 <19991202110045.96F33370CF2@snelboot.oratrix.nl>
References: <19991202110045.96F33370CF2@snelboot.oratrix.nl>
Message-ID: <199912021159.GAA13732@eric.cnri.reston.va.us>

> On the Mac I've introduced "magic cookies" into sys.path, which
> allow you to do interesting searches (like searching for a DLL or
> PYC-resource in the application itself) at known places in the
> import process.

> There isn't a cookie for "search along the standard MacOS dll search
> path" (which is somewhat similar to the Windows dll search path)
> because I haven't seen a reason for it, but there's nothing to stop
> it. And if you'd insert that cookie it would be perfectly clear (at
> least, it should be) that only dll modules will be found in that
> step, not .py modules.

> Actually I'm so happy with the magic cookie scheme that I've
> advocated at various times in the past that something similar also
> be used for determining where builtin modules and frozen modules
> appear in sys.path...

I see the magic cookies as a poor man's (but more compatible!) version
of a chain of importers as advocated by Greg Stein and other imputil
fans.  I like the idea, except that I think that the chain should be
manipulatable more easily than the current imputil implementation.
(I'll have more comments on Greg's comments later, when I've actually
read them through.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gstein@lyra.org  Thu Dec  2 12:09:40 1999
From: gstein@lyra.org (Greg Stein)
Date: Thu, 2 Dec 1999 04:09:40 -0800 (PST)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <199912021159.GAA13732@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912020404500.18236-100000@nebula.lyra.org>

On Thu, 2 Dec 1999, Guido van Rossum wrote:
>...
> I see the magic cookies as a poor man's (but more compatible!) version
> of a chain of importers as advocated by Greg Stein and other imputil
> fans.  I like the idea, except that I think that the chain should be
> manipulatable more easily than the current imputil implementation.
> (I'll have more comments on Greg's comments later, when I've actually
> read them through.)

Anything in sys.path that is not a string pointing to a directory is not
very compatible. My current proposal keeps the existing semantics for
sys.path (the proposal adds functionality thru other mechanisms, rather
than changing/interfering with existing ones).

I look forward to your comments! I'll definitely provide new solutions
where you find problems :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From fredrik@pythonware.com  Thu Dec  2 12:53:03 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 2 Dec 1999 13:53:03 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <19991202110045.96F33370CF2@snelboot.oratrix.nl>  <199912021159.GAA13732@eric.cnri.reston.va.us>
Message-ID: <005e01bf3cc4$2e338c50$f29b12c2@secret.pythonware.com>

Guido van Rossum <guido@CNRI.Reston.VA.US> wrote:
> > Actually I'm so happy with the magic cookie scheme that I've
> > advocated at various times in the past that something similar also
> > be used for determining where builtin modules and frozen modules
> > appear in sys.path...
> 
> I see the magic cookies as a poor man's (but more compatible!) version
> of a chain of importers as advocated by Greg Stein and other imputil
> fans.  I like the idea, except that I think that the chain should be
> manipulatable more easily than the current imputil implementation.

I know this has been asked before, but cannot recall
any of the arguments against it: how about replacing
Jack's magic cookies with importer objects?

(in other words, if a path item is a string, import as
usual.  otherwise, ask the importer for a code object
or maybe better, a module object).
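
As a rough illustration of the idea in the parenthetical above, the search
loop could dispatch on the type of each path item.  This is only a sketch in
present-day Python: TableImporter, its get_module() method, and
from_directory() are hypothetical stand-ins, not an existing API.

```python
class TableImporter:
    """Hypothetical importer object that serves modules from a dict."""

    def __init__(self, table):
        self.table = table

    def get_module(self, name):
        # Return the module (here just a string) or None if not handled.
        return self.table.get(name)


def from_directory(directory, name):
    # Stand-in for the usual directory search; pretends only /lib has "foo".
    if (directory, name) == ("/lib", "foo"):
        return "foo from /lib"
    return None


def search(name, path_items, import_from_directory):
    """Walk the path; strings import as usual, other items are asked directly."""
    for item in path_items:
        if isinstance(item, str):
            result = import_from_directory(item, name)
        else:
            result = item.get_module(name)
        if result is not None:
            return result
    return None


path_items = ["/lib", TableImporter({"bar": "bar from importer"})]
```

With this path, search("foo", path_items, from_directory) is answered by the
directory entry and search("bar", ...) by the importer object.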

</F>


From jack@oratrix.nl  Thu Dec  2 13:23:31 1999
From: jack@oratrix.nl (Jack Jansen)
Date: Thu, 02 Dec 1999 14:23:31 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Message by "Fredrik Lundh" <fredrik@pythonware.com> ,
 Thu, 2 Dec 1999 13:53:03 +0100 , <005e01bf3cc4$2e338c50$f29b12c2@secret.pythonware.com>
Message-ID: <19991202132331.E3F8D370CF2@snelboot.oratrix.nl>

> > I see the magic cookies as a poor man's (but more compatible!) version
> > of a chain of importers as advocated by Greg Stein and other imputil
> > fans. [...]
> 
> I know this has been asked before, but cannot recall
> any of the arguments against it: how about replacing
> Jack's magic cookies with importer objects?

For the record: I definitely agree with both comments here. The only thing 
that would need solving (but maybe it already is? Greg?) is the external 
representation of an importer, as I'd definitely want to be able to name them 
in PYTHONPATH (or the mac equivalent).
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From jim@interet.com  Thu Dec  2 14:19:31 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Thu, 02 Dec 1999 09:19:31 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <009c01bf3c4b$8f119090$0501a8c0@bobcat>
Message-ID: <38467FF3.D938EE4@interet.com>

Mark Hammond wrote:

> Sure - the standard Python installation puts a "default" PYTHONPATH
> suitable for general purpose development - but any distributed
> application _can_ define their own PYTHONPATH that is independent of
> any other Python systems or applications.  People have been doing this
> for years, including MS :-)

How is this done?
 
> Sorry Jim, but count this as another vote against it - which isn't to
> argue that the current system is perfect, simply (IMO) better than the
> Windows path and DLL search order.

Sigh.....

JimA


From jim@interet.com  Thu Dec  2 15:49:10 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Thu, 02 Dec 1999 10:49:10 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org> <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com>
 <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us>
Message-ID: <384694F6.E5D74221@interet.com>

Guido van Rossum wrote:

> You must not have used the standard Python installer; if you had used
> it you wouldn't have had this problem (and perhaps we wouldn't have
> had this discussion).

Correct, I did not use the standard Python installer.  I compiled
Python from the source distribution.  There are good reasons for this
in my case.

First, my real issue is how to DISTRIBUTE Python programs, not to get
Python working on my own machine.  We have 12 machines on a network.
It is not acceptable to run a Python installation script on every one
of them just to run a simple Python program.  OK, I guess I could do 12,
but what about a larger company?  And we ship to hundreds of customers.
I can distribute simple C or C++ programs without a hassle, why not
Python?  It is not acceptable to ask our customers to run a separate
Python installer.  We have our own Wise installer to install our
software.  Every commercial vendor has Wise, Install Shield or other
installer in place.  No commercial vendor is going to abandon Wise
et al. and move to The Official Python Installer because it will not
have the features of Wise (such as binary patches across the network),
and because what it does won't be documented, and because it is Just
Different.

Second, I can not run ANY installer on my development machine, Python or
otherwise.  This is a general Windows problem not specific to Python.
Right now our help system is broken on every office machine except the
one where the help system installer was run (where we develop help).
If I run a Python installer, it may Just Work here.  So testing is
fine, but when I distribute the program to customers where the install
program has not been run it fails.  The installer made registry entries,
installed files, etc.  And what did it do??  No one knows.  And how do I
install at a customer site if I don't have documentation on what the
Help installer or Python installer did??  No one knows.  Who fixes it if
something goes wrong??  Hours on the phone to Help System customer
support.  Does it work on Windows 2000??  No one knows.

> f:\bin.  "Modern" Python versions execute some code at startup that
> comes up with a suitable value for sys.path; the Windows version of
> this code is in PC/getpathp.c -- I recommend that you study it.  This

> [ Highly useful discussion of startup...]

Thank you, I will study this.

> know about (e.g. PIL, Numeric, etc.).  With packagized versions of
> those modules, even that won't be necessary, because the packages will
> be dropped in the Python install directory (typically C:\Program
> Files\Python).

Yes, this is essential.  Packages must be easily installed.  I was
hoping for single file package archive files.

> I believe that most of your desire to get rid of PYTHONPATH comes from
> your insistence to bypass the default installer.

Correct, I refuse to execute the default installer.  And I am
a patient person who loves Python, so I will read getpathp.c
to see what is happening.  But other commercial developers, students,
teachers, SysAdmins etc. are not so patient.  In the interest of
promoting Python, there should be documentation on the official
way to easily install Python programs.

> There's probably a
> way to install your app in such a way that the getpathp.c algorithm
> actually succeeds?  There's also a separate env variable, PYTHONHOME,

Perhaps, and if there is it should be prominently documented in the
How to Distribute Your App section of the manual.  I
am worried about supporting versioning, but I will think about it.

> I take blame for not documenting all this well enough.  However I wish
> you stopped criticizing the design -- I think the design is quite
> solid.

Thank you for the explanation.  I will study the design again.  I
always wondered what PYTHONHOME did.

JimA


From guido@CNRI.Reston.VA.US  Thu Dec  2 16:03:09 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 02 Dec 1999 11:03:09 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Your message of "Thu, 02 Dec 1999 10:49:10 EST."
 <384694F6.E5D74221@interet.com>
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org> <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us>
 <384694F6.E5D74221@interet.com>
Message-ID: <199912021603.LAA14455@eric.cnri.reston.va.us>

> Perhaps, and if there is it should be prominently documented in the
> How to Distribute Your App section of the manual.  I
> am worried about supporting versioning, but I will think about it.

Join the distutil-SIG, they are discussing just this.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Thu Dec  2 15:48:40 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 02 Dec 1999 16:48:40 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <19991202110045.96F33370CF2@snelboot.oratrix.nl>  <199912021159.GAA13732@eric.cnri.reston.va.us> <005e01bf3cc4$2e338c50$f29b12c2@secret.pythonware.com>
Message-ID: <384694D8.DCA3D75E@lemburg.com>

Fredrik Lundh wrote:
> 
> Guido van Rossum <guido@CNRI.Reston.VA.US> wrote:
> > > Actually I'm so happy with the magic cookie scheme that I've
> > > advocated at various times in the past that something similar also
> > > be used for determining where builtin modules and frozen modules
> > > appear in sys.path...
> >
> > I see the magic cookies as a poor man's (but more compatible!) version
> > of a chain of importers as advocated by Greg Stein and other imputil
> > fans.  I like the idea, except that I think that the chain should be
> > manipulatable more easily than the current imputil implementation.
> 
> I know this has been asked before, but cannot recall
> any of the arguments against it: how about replacing
> Jack's magic cookies with importer objects?
> 
> (in other words, if a path item is a string, import as
> usual.  otherwise, ask the importer for a code object
> or maybe better, a module object).

Plus, for backward compatibility, make sure that str(importerobj)
returns something which resembles a non-existing directory.

Note that the builtin importer skips non-string entries
in sys.path, so the above will only be needed for existing
import hooks.

Still, I would like to rephrase my 0.02EUR which I already
posted twice... why not start to think about what these
importers would do first ? If there are only a handful of
wishes we could just add them to the builtin machinery and
be done with it...

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    29 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido@CNRI.Reston.VA.US  Thu Dec  2 16:28:28 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 02 Dec 1999 11:28:28 -0500
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: Your message of "Fri, 19 Nov 1999 22:43:32 EST."
 <1269053086-27079185@hypernet.com>
References: <1269053086-27079185@hypernet.com>
Message-ID: <199912021628.LAA14506@eric.cnri.reston.va.us>

> No success whatsoever in either direction across Samba. In 
> fact the mtime of my Linux home directory as seen from NT is 
> Jan 1, 1980.

That's only the case for an NT mount point (something of the form
\\host\name; I notice that os.stat() only believes it exists if you
append a backslash: \\host\name\).  For interior directories, at least
with the Samba version that I'm using, os.stat() seems to give correct
results.

I think that this whole issue (that doing a stat on a directory to
find out whether files in it were modified doesn't give usable
results) is widely blown out of proportion.

The only useful bit of info is that mtimes may have an up to 2 second
granularity, and that anything modified within the last 2 seconds
should be considered newer than the cache even if the cache is also
less than 2 seconds old.
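
A minimal sketch of that rule, assuming timestamps in seconds and a
hypothetical helper name cache_is_fresh (nothing like it exists in the
library):

```python
def cache_is_fresh(cache_mtime, source_mtime, now, granularity=2.0):
    """Return True only if the cached data can be trusted.

    A source modified within the last `granularity` seconds may change
    again without its recorded mtime changing, so it always counts as
    newer than the cache.
    """
    if now - source_mtime <= granularity:
        return False
    return cache_mtime >= source_mtime
```

The early return is what makes a very recent cache lose against a very
recent source, even when their timestamps compare as equal.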

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim@interet.com  Thu Dec  2 16:28:50 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Thu, 02 Dec 1999 11:28:50 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9911231549120.10639-100000@nebula.lyra.org> <383BF9AD.E183FB98@interet.com> <199912011855.NAA10662@eric.cnri.reston.va.us>
 <38457B42.85552AC@interet.com> <199912012309.SAA10873@eric.cnri.reston.va.us>
Message-ID: <38469E42.AF0A0D55@interet.com>

Guido van Rossum wrote:

> Let's assume sys.path is [A, B].  Let's assume there's a foo.py and a
> foo.dll.  If foo.py lives in A and foo.dll lives in B, then import foo
> ...

Thank you for the detailed discussion showing that sys.path is
needed so a choice can be made whether to load foo.dll or
foo.py.  As you correctly point out, a separate search path
defeats this behavior.

But I don't think the usefulness of the feature compensates for
its resultant complexity.  Specifically, it will be hard to
create this behavior in archive files.

As I envision archive files (which of course is subject to change)
they contain *.pyc files and not DLL's.  The DLL's must be in a
./DLL directory since the OS can not load them from strings.  So
if every *.pyc is in an archive file, your only choice is whether
to load all DLL's first or last.  That is, archive.pyl is either
before or after ./DLL.

If a package (probably with lots of subdirectories) author depends on
having a search path within a package which discriminates between
pyc and DLL files with equal names, then that search path plus the
existence of the DLL's must be recorded in the archive.

This is much more complicated than just an archive with all *.pyc
files entered in a dotted name space:
  foo
  foo.sub1
  foo.sub2
  foo.sub2.pkx

I would question whether equally named foo.dll and foo.py is worth it.
The alternative (which is IMHO more common) is to code the choice in
Python in the module that cares about it.
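
For illustration, that alternative might look like the sketch below, in
present-day Python.  The extension module name _foo_speedups and the function
compute are hypothetical; the point is only that the module itself decides
which implementation to use.

```python
# foo.py: prefer a compiled twin if it can be loaded, else fall back.
try:
    from _foo_speedups import compute  # hypothetical DLL/.pyd module
    IMPLEMENTATION = "compiled"
except ImportError:
    IMPLEMENTATION = "python"

    def compute(values):
        """Pure-Python fallback: same result, only slower."""
        total = 0
        for value in values:
            total += value
        return total
```

With this layout only one name, foo, ever needs to be found on the search
path, so an archive does not have to record any DLL-versus-pyc ordering.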

> > And what if something
> > doesn't work?  Think of Python being used as a teaching language
> > for the 8th grade.  Think of the 8th grade teacher trying to get
> > all this right.  The only thing that works is simplicity.
> 
> We will provide an installer that Just Works [tm].

OK for this case.  Not enough for Python program distribution. 

JimA


From jim@interet.com  Thu Dec  2 16:30:49 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Thu, 02 Dec 1999 11:30:49 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org> <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us>
 <384694F6.E5D74221@interet.com> <199912021603.LAA14455@eric.cnri.reston.va.us>
Message-ID: <38469EB9.5EDB9617@interet.com>

Guido van Rossum wrote:
> 
> > Perhaps, and if there is it should be prominently documented in the
> > How to Distribute Your App section of the manual.  I
> > am worried about supporting versioning, but I will think about it.
> 
> Join the distutil-SIG, they are discussing just this.

I already belong to the distutil-SIG and have seen no such
discussion.

Jim


From guido@CNRI.Reston.VA.US  Thu Dec  2 17:17:52 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 02 Dec 1999 12:17:52 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Your message of "Thu, 02 Dec 1999 11:30:49 EST."
 <38469EB9.5EDB9617@interet.com>
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org> <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us> <384694F6.E5D74221@interet.com> <199912021603.LAA14455@eric.cnri.reston.va.us>
 <38469EB9.5EDB9617@interet.com>
Message-ID: <199912021717.MAA14682@eric.cnri.reston.va.us>

[Jim]
> > > Perhaps, and if there is it should be prominently documented in the
> > > How to Distribute Your App section of the manual.  I
> > > am worried about supporting versioning, but I will think about it.

[me]
> > Join the distutil-SIG, they are discussing just this.

[Jim again]
> I already belong to the distutil-SIG and have seen no such
> discussion.

Sorry, you're right (except for a brief exchange between you and Paul
Dubois :-).  But I think they should, it falls under their charter.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Thu Dec  2 17:30:02 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 2 Dec 1999 12:30:02 -0500 (EST)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <199912021717.MAA14682@eric.cnri.reston.va.us>
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org>
 <384586B4.48905B32@interet.com>
 <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com>
 <3845928D.C0462322@interet.com>
 <199912012300.SAA10861@eric.cnri.reston.va.us>
 <384694F6.E5D74221@interet.com>
 <199912021603.LAA14455@eric.cnri.reston.va.us>
 <38469EB9.5EDB9617@interet.com>
 <199912021717.MAA14682@eric.cnri.reston.va.us>
Message-ID: <14406.44186.574647.651111@weyr.cnri.reston.va.us>

Guido van Rossum writes:
 > Sorry, you're right (except for a brief exchange between you and Paul
 > Dubois :-).  But I think they should, it falls under their charter.

  This was deliberately postponed until after extension packages are
supported and in place.  I know Greg is interested in application
installation as well as package installation.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From gmcm@hypernet.com  Thu Dec  2 17:53:03 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Thu, 2 Dec 1999 12:53:03 -0500
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912021628.LAA14506@eric.cnri.reston.va.us>
References: Your message of "Fri, 19 Nov 1999 22:43:32 EST."             <1269053086-27079185@hypernet.com>
Message-ID: <1267965342-1446902@hypernet.com>

[Gordon]
> > No success whatsoever in either direction across Samba. In fact
> > the mtime of my Linux home directory as seen from NT is Jan 1,
> > 1980.
[Guido]
> That's only the case for an NT mount point (something of the form
> \\host\name; I notice that os.stat() only believes it exists if
> you append a backslash: \\host\name\).  For interior directories,
> at least with the Samba version that I'm using, os.stat() seems
> to give correct results.

Correct (as I discovered not long after I posted). (I find that 
from NT I have to stat some file _in_ the directory to get an 
updated mtime from the stat _of_ the directory).
 
> I think that this whole issue (that doing a stat on a directory
> to find out whether files in it were modified doesn't give usable
> results) is widely blown out of proportion.

This has come up twice: re caching importers and dircache.py 
(used only by dircmp). We've arrived at the fact that it _can_ 
be made to work on Windows boxes. NFS? Andrew (anyone 
still use that)?

IOW, do we want to trust it? Do we want to document that it 
might not be trustworthy in some situations? Make it
optional-for-wizards? Kill it?
 
IOOW, what's the proper proportion ;-)?

> The only useful bit of info is that mtimes may have an up to 2
> second granularity, and that anything as recent as 2 seconds
> should be considered as newer than the cache even if the cache is
> also less than 2 seconds.

From NT, at least, stat'ing any file in the directory seems to 
remove this 2 second limitation.


- Gordon


From guido@CNRI.Reston.VA.US  Thu Dec  2 20:43:46 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 02 Dec 1999 15:43:46 -0500
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: Your message of "Fri, 19 Nov 1999 05:29:50 PST."
 <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org>
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org>
Message-ID: <199912022043.PAA15108@eric.cnri.reston.va.us>

Here's the promised response to Greg's response to my wishlist.

> On Thu, 18 Nov 1999, Guido van Rossum wrote:
> > Gordon McMillan wrote:
> >...
> > > I think imputil's emulation of the builtin importer is more of a 
> > > demonstration than a serious implementation. As for speed, it 
> > > depends on the test. 
> > 
> > Agreed.  I like some of imputil's features, but I think the API
> > needs to be redesigned.
> 
> In what ways? It sounds like you've applied some thought. Do you have any
> concrete ideas yet, or "just a feeling" :-)  I'm working through some
> changes from JimA right now, and would welcome other suggestions. I think
> there may be some outstanding stuff from MAL, but I'm not sure (Marc?)

I actually think that the way the PVM (Python VM) calls the importer
ought to be changed.  Assigning to __builtin__.__import__ is a crock.
The API for __import__ is a crock.

> >...
> > So here's a challenge: redesign the import API from scratch.
> 
> I would suggest starting with imputil and altering as necessary. I'll use
> that viewpoint below.
> 
> > Let me start with some requirements.
> > 
> > Compatibility issues:
> > ---------------------
> > 
> > - the core API may be incompatible, as long as compatibility layers
> > can be provided in pure Python
> 
> Which APIs are you referring to? The "imp" module? The C functions? The
> __import__ and reload builtins?

> I'm guessing some of imp, the two builtins, and only one or two C
> functions.

All of those.

> > - support for rexec functionality
> 
> No problem. I can think of a number of ways to do this.

Agreed, I think that imputil can do this.

> > - support for freeze functionality
> 
> No problem. A function in "imp" must be exposed to Python to support this
> within the imputil framework.

Agreed.  It currently exports init_frozen() which is about the right
functionality.

> > - load .py/.pyc/.pyo files and shared libraries from files
> 
> No problem. Again, a function is needed for platform-specific loading of
> shared libraries.

Is it useful to expose the platform differences?  The current
imp.load_dynamic() should suffice.

> > - support for packages
> 
> No problem. Demo's in current imputil.
> 
> > - sys.path and sys.modules should still exist; sys.path might
> > have a slightly different meaning
> 
> I would suggest that both retain their *exact* meaning. We introduce
> sys.importers -- a list of importers to check, in sequence. The first
> importer on that list uses sys.path to look for and load modules. The
> second importer loads builtins and frozen code (i.e. modules not on
> sys.path).

This is looking like the redesign I was looking for.  (Note that
imputil's current chaining is not good since it's impossible to remove
or reorder importers, which I think is a required feature; an explicit
list would solve this.)

Actually, the order is the other way around, but by now you should
know that.  It makes sense to have separate ones for builtin and
frozen modules -- these have nothing in common.
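
To make the explicit-list idea concrete, here is a rough sketch in
present-day Python.  DictImporter, its find() method, and import_via_chain()
are hypothetical illustrations, not imputil's API; modules are represented
by plain strings.

```python
class DictImporter:
    """Hypothetical importer that 'finds' modules in an in-memory table."""

    def __init__(self, label, table):
        self.label = label
        self.table = table

    def find(self, name):
        # None means "not mine, ask the next importer".
        return self.table.get(name)


def import_via_chain(name, importers):
    """Ask each importer in list order; the first one that answers wins."""
    for importer in importers:
        found = importer.find(name)
        if found is not None:
            return importer.label, found
    raise ImportError(name)


builtin = DictImporter("builtin", {"sys": "<builtin sys>"})
frozen = DictImporter("frozen", {"__hello__": "<frozen hello>"})
on_path = DictImporter("path", {"foo": "<foo.py>", "sys": "<sys.py>"})

# Builtin and frozen importers come first, the sys.path walker last.
importers = [builtin, frozen, on_path]
```

Because the chain is an ordinary list, removing or reordering importers is
just list manipulation, e.g. importers.remove(frozen) or importers.reverse().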

There's another issue, which isn't directly addressed by imputil,
although with clever use of inheritance it might be doable.  I'd like
more support for this however.  Quite orthogonally to the issue of
having separate importers, I might want to recognize new extensions.
Take the example of the ILU folks.  They want to be able to drop a
file "foo.isl" in any directory on sys.path and have the ILU stubber
automatically run if you try to import foo (the client stubs) or
foo__skel (the server skeleton).

This doesn't fit in the sys.importers strategy, because they want to
be able to drop their .isl files in any directory along sys.path.
(Or, more likely, they want to have control over where in sys.path
the directory/directories with .isl files are placed.)  This requires
an ugly modification to the _fs_import() function.  (Which should have
been a method, by the way, to make overriding it in a subclass of
PathImporter easier!)

I've been thinking here along the lines of a strategy where the
standard importer (the one that walks sys.path) has a set of hooks
that define various things it could look for, e.g. .py files, .pyc
files, .so or .dll files.  This list of hooks could be changed to
support looking for .isl files.
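
A minimal sketch of such a hook list, in present-day Python.  The names
suffix_hooks, find_in_path, load_source and load_isl are hypothetical, and
the "loaders" only return a tag instead of building a real module.

```python
import os
import tempfile


def load_source(path):
    return ("source", path)


def load_isl(path):
    # Stand-in for running the ILU stubber on an .isl file.
    return ("isl-stub", path)


# Default hooks: what the path walker looks for in each directory.
suffix_hooks = [(".py", load_source)]


def find_in_path(name, search_path, hooks):
    """Walk the directories in order and try every registered suffix."""
    for directory in search_path:
        for suffix, loader in hooks:
            candidate = os.path.join(directory, name + suffix)
            if os.path.isfile(candidate):
                return loader(candidate)
    return None


with tempfile.TemporaryDirectory() as directory:
    open(os.path.join(directory, "foo.isl"), "w").close()
    before = find_in_path("foo", [directory], suffix_hooks)
    suffix_hooks.append((".isl", load_isl))  # teach the walker about .isl
    after = find_in_path("foo", [directory], suffix_hooks)
```

Here before is None because no hook knows the .isl suffix, while after is
produced by load_isl once the extra hook has been appended.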

There's an old, subtle issue that could be solved through this as
well: whether or not a .pyc file without a .py file should be accepted
or not.  Long ago (in Python 0.9.8) a .pyc file alone would never be
loaded.  This was changed at the request of a small but vocal minority
of Python developers who wanted to distribute .pyc files without .py
files.  It has occasionally caused frustration because sometimes
developers move .py files around but forget to remove the .pyc files,
and then the .pyc file is silently picked up if it occurs on sys.path
earlier than where the .py was moved to.

Having a set of hooks for various extensions would make it possible to
have a default where lone .pyc files are ignored, but where one can
insert a .pyc importer in the list of hooks that does the right thing
here.  (Of course, it may be possible that this whole feature of lone
.pyc files should be replaced since the same need is easily taken care
of by zip importers.)
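
To make the policy concrete, here is a deliberately simplified sketch
in present-day Python.  The function name choose_file and its
allow_lone_pyc flag are hypothetical, and the sketch ignores the
timestamp check between a .py and its .pyc:

```python
import os

def choose_file(directory, name, allow_lone_pyc=False):
    """Return the file to load for `name` in `directory`, or None.

    By default a .pyc without a matching .py is ignored; passing
    allow_lone_pyc=True restores the permissive behaviour.
    """
    source = os.path.join(directory, name + ".py")
    compiled = os.path.join(directory, name + ".pyc")
    if os.path.isfile(source):
        return source
    if allow_lone_pyc and os.path.isfile(compiled):
        return compiled
    return None
```

A stale .pyc left behind after moving the .py is then skipped unless
the permissive hook has been installed on purpose.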

I also want to support (Jim A notwithstanding :-) a feature whereby
different things besides directories can live on sys.path, as long as
they are strings -- these could be added from the PYTHONPATH env
variable.  Every piece of code that I've ever seen that uses sys.path
doesn't care if a directory named in sys.path doesn't exist -- it may
try to stat various files in it, which also don't exist, and as far as
it is concerned that is just an indication that the requested module
doesn't live there.

Again, we would have to dissect imputil to support various hooks that
deal with different kinds of entities in sys.path.  The default hook
list would consist of a single item that interprets the name as a
directory name; other hooks could support zip files or URLs.  Jack's
"magic cookies" could also be supported nicely through such a
mechanism.
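
A small sketch of such a per-entry hook list, in present-day Python.
The names path_entry_hooks, directory_hook, zip_hook and classify are
hypothetical, and each hook is assumed to return a label for entries it
recognizes and None otherwise:

```python
import os
import zipfile

def directory_hook(entry):
    # Default hook: interpret the entry as a directory name.
    return "directory" if os.path.isdir(entry) else None

def zip_hook(entry):
    # Optional hook: interpret the entry as a zip archive.
    if os.path.isfile(entry) and zipfile.is_zipfile(entry):
        return "zip"
    return None

# The default list holds a single item, as described above.
path_entry_hooks = [directory_hook]

def classify(entry):
    """Return the label from the first hook that accepts the entry."""
    for hook in path_entry_hooks:
        kind = hook(entry)
        if kind is not None:
            return kind
    # Unknown or nonexistent entries are skipped, not treated as errors.
    return None
```

An entry that no hook understands simply means "the module does not
live there", which matches how existing code already treats missing
directories.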

> Users can insert/append new importers or alter sys.path as before.
> 
> sys.modules continues to record name:module mappings.

Yes.

Note that the interpretation of __file__ could be problematic.  To
what value do you set __file__ for a module loaded from a zip archive?

> > - $PYTHONPATH and $PYTHONHOME should still be supported
> 
> No problem.
> 
> > (I wouldn't mind a splitting up of importdl.c into several
> > platform-specific files, one of which is chosen by the configure
> > script; but that's a bit of a separate issue.)
> 
> Easy enough. The standard importer can select the appropriate
> platform-specific module/function to perform the load. i.e. these can move
> to Modules/ and be split into a module-per-platform.

Again: what's the advantage of exposing the platform specificity?

> > New features:
> > -------------
> > 
> > - Integrated support for Greg Ward's distribution utilities (i.e. a
> >   module prepared by the distutil tools should install painlessly)
> 
> I don't know the specific requirements/functionality that would be
> required here (does Greg? :-), but I can't imagine any problem with this.

Probably more support is required from the other end: once it's common
for modules to be imported from zip files, the distutil code needs to
support the creation and installation of such zip files.  Also, there
is a need for the install phase of distutil to communicate the
location of the zip file to the Python installation.

> > - Good support for prospective authors of "all-in-one" packaging tools
> >   like Gordon McMillan's win32 installer or /F's squish.  (But
> >   I *don't* require backwards compatibility for existing tools.)
> 
> Um. *No* problem. :-)

:-)

> > - Standard import from zip or jar files, in two ways:
> > 
> >   (1) an entry on sys.path can be a zip/jar file instead of a directory;
> >       its contents will be searched for modules or packages

Note that this is what I mention above for distutil support.

> While this could easily be done, I might argue against it. Old
> apps/modules that process sys.path might get confused.

Above I argued that this shouldn't be a problem.

> If compatibility is not an issue, then "No problem."
> 
> An alternative would be an Importer instance added to sys.importers that
> is configured for a specific archive (in other words, don't add the zip
> file to sys.path, add ZipImporter(file) to sys.importers).

This would be harder for distutil: where does Python get the initial
list of importers?

> Another alternative is an Importer that looks at a "sys.py_archives" list.
> Or an Importer that has a py_archives instance attribute.

OK, but again distutil needs to be able to add to this list when it
installs a package.  (Note that package deinstallation should also be
supported!)

(Of course I don't require this to affect Python processes that are
already running; but it should be possible to easily change the
default search path for all newly started instances of a given Python
installation.)

> >   (2) a file in a directory that's on sys.path can be a zip/jar file;
> >       its contents will be considered as a package (note that this is
> >       different from (1)!)
> 
> No problem. This will slow things down, as a stat() for *.zip and/or *.jar
> must be done, in addition to *.py, *.pyc, and *.pyo.

Fine, this is where the caching comes in handy.

> >   I don't particularly care about supporting all zip compression
> >   schemes; if Java gets away with only supporting gzip compression
> >   in jar files, so can we.
> 
> I presume we would support whatever zlib gives us, and no more.

That's it. :-)

> > - Easy ways to subclass or augment the import mechanism along
> >   different dimensions.  For example, while none of the following
> >   features should be part of the core implementation, it should be
> >   easy to add any or all:
> > 
> >   - support for a new compression scheme to the zip importer
> 
> Presuming ZipImporter is a class (derived from Importer), then this
> ability is wholly dependent upon the author of ZipImporter providing the
> hook.

Agreed.  But since we're likely going to provide this as a standard
feature, we must ensure that it provides this hook.

> The Importer class is already designed for subclassing (and its interface 
> is very narrow, which means delegation is also *very* easy; see
> imputil.FuncImporter).

But maybe it's *too* narrow; some of the hooks I suggest above seem to
require extra interfaces -- at least in some of the subclasses of the
Importer base class.

Note: I looked at the doc string for get_code() and I don't understand
what the difference is between the modname and fqname arguments.  If I
write "import foo.bar", what are modname and fqname?  Why are both
present?  Also, while you claim that the API is narrow, the multiple
return values (also the different types for the second item) make it
complicated.

> >   - support for a new archive format, e.g. tar
> 
> A cakewalk. Gordon, JimA, and myself each have archive formats. :-)
> 
> >   - a hook to import from URLs or other data sources (e.g. a
> >     "module server" imported in CORBA) (this needn't be supported
> >     through $PYTHONPATH though)
> 
> No problem at all.
> 
> >   - a hook that imports from compressed .py or .pyc/.pyo files
> 
> No problem at all.
> 
> >   - a hook to auto-generate .py files from other filename
> >     extensions (as currently implemented by ILU)
> 
> No problem at all.

See above -- I think this should be more integrated with sys.path than
you are thinking of.  The more I think about it, the more I see that
the problem is that for you, the importer that uses sys.path is a
final subclass of Importer (i.e. it is itself not further subclassed).
Several of the hooks I want seem to require additional hooks in the
PathImporter rather than new importers.

> >   - a cache for file locations in directories/archives, to improve
> >     startup time
> 
> No problem at all.
> 
> >   - a completely different source of imported modules, e.g. for an
> >     embedded system or PalmOS (which has no traditional filesystem)
> 
> No problem at all.
> 
> In each of the above cases, the Importer.get_code() method just needs to
> grab the byte codes from the XYZ data source. That data source can be
> compressed, across a network, on-the-fly generated, or whatever. Each
> importer can certainly create a cache based on its concept of "location".
> In some cases, that would be a mapping from module name to filesystem
> path, or to a URL, or to a compiled-in, frozen module.

See above for sys.path integration remark.

> > - Note that different kinds of hooks should (ideally, and within
> >   reason) properly combine, as follows: if I write a hook to recognize
> >   .spam files and automatically translate them into .py files, and you
> >   write a hook to support a new archive format, then if both hooks are
> >   installed together, it should be possible to find a .spam file in an
> >   archive and do the right thing, without any extra action.  Right?
> 
> Ack. Very, very difficult.

Actually, I take most of this back.  Importers that deal with new
extension types often have to go through a file system to transform
their data to .py files, and this is just too complicated.  However it
would be still nice if there was code sharing between the code that
looks for .py and .pyc files in a zip archive and the code that does
the same in a filesystem.  Hm, maybe even that shouldn't be necessary,
the zip file probably should contain only .pyc files...

(Unrelated remark: I should really try to release the set of modules
we've written here at CNRI to deal with zip files.  Unfortunately zip
files are hairy and so is our code.)

> The imputil scheme combines the concept of locating/loading into one step.
> There is only one "hook" in the imputil system. Its semantic is "map this
> name to a code/module object and return it; if you don't have it, then
> return None."

That's fine.  I actually don't recall where the find-then-load API
came from, I think it may be an artefact of the original
implementation strategy.  It is currently used as follows: we try to
see if there's a .pyc and then we try to see if there's a .py; if both
exist we compare the timestamps etc. to choose which one.  But that's
still a red herring.

> Your compositing example is based on the capabilities of the
> find-then-load paradigm of the existing "ihooks.py". One module finds
> something (foo.spam) and the other module loads it (by generating a .py).

I still don't understand why ihooks.py had to be so complicated.  I
guess I just had much less of an understanding of the issues.  (It was
also partly a compromise with an alternative design by Ken Manheimer,
who basically forced me to support packages, originally through ni.py.)

> All is not lost, however. I can easily envision the get_code() hook as
> allowing any kind of return type. If it isn't a code or module object,
> then another hook is called to transform it.
> [ actually, I'd design it similarly: a *series* of hooks would be called
>   until somebody transforms the foo.spam into a code/module object. ]

OK.  This could be a feature of a subclass of Importer.
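
A sketch of that series-of-hooks idea, in present-day Python.  The
names transform_hooks, spam_to_code and to_code are hypothetical, and
the ("spam", text) tuple is just an invented stand-in for whatever a
get_code() hook might return for a foo.spam file:

```python
import types

def spam_to_code(value):
    # Hypothetical transformer: treats ("spam", text) as Python source.
    if isinstance(value, tuple) and len(value) == 2 and value[0] == "spam":
        return compile(value[1], "<spam>", "exec")
    return None

transform_hooks = [spam_to_code]

def to_code(value):
    """Run the transform hooks until the value becomes a code object."""
    while not isinstance(value, types.CodeType):
        for hook in transform_hooks:
            result = hook(value)
            if result is not None:
                value = result
                break
        else:
            # No hook recognized the value, so give up.
            raise ImportError("no hook could transform %r" % (value,))
    return value
```

Because the loop repeats until a code object appears, one hook may
hand its result to another, which is a limited form of composition.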

> The compositing would be limited only by the (Python-based) Importer
> classes. For example, my ZipImporter might expect to zip up .pyc files
> *only*. Obviously, you would want to alter this to support zipping any
> file, then use the suffix to determine what to do at unzip time.
> 
> > - It should be possible to write hooks in C/C++ as well as Python
> 
> Use FuncImporter to delegate to an extension module.

Maybe not so great, since it sounds like the C code can't benefit from
any of the infrastructure that imputil offers.  I'm not sure about
this one though.

> This is one of the benefits of imputil's single/narrow interface.

Plus its vague specs? :-)

> > - Applications embedding Python may supply their own implementations,
> >   default search path, etc., but don't have to if they want to piggyback
> >   on an existing Python installation (even though the latter is
> >   fraught with risk, it's cheaper and easier to understand).
> 
> An application would have full control over the contents of sys.importers.
> 
> For a restricted execution app, it might install an Importer that loads
> files from *one* directory only which is configured from a specific
> Win32 Registry entry. That importer could also refuse to load shared
> modules. The BuiltinImporter would still be present (although the app
> would certainly omit all but the necessary builtins from the build).
> Frozen modules could be excluded.

Actually there's little reason to exclude frozen modules or any
.py/.pyc modules -- by definition, bytecode can't be dangerous.  It's
the builtins and extensions that need to be censored.

We currently do this by subclassing ihooks, where we mask the test for
builtins with a comparison to a predefined list of names.
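
The masking amounts to something like the following sketch, written in
present-day Python without ihooks.  ALLOWED_BUILTINS and
is_allowed_builtin are hypothetical names, and the list contents are
only an example:

```python
import sys

# Example whitelist; a real restricted environment would choose its own.
ALLOWED_BUILTINS = frozenset(["sys", "time"])

def is_allowed_builtin(name, allowed=ALLOWED_BUILTINS):
    """True only if `name` is a builtin module AND is on the whitelist."""
    return name in sys.builtin_module_names and name in allowed
```

A builtin that is not on the list is reported as "not a builtin", so
the import falls through to the other, harmless module types.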

> > Implementation:
> > ---------------
> > 
> > - There must clearly be some code in C that can import certain
> >   essential modules (to solve the chicken-or-egg problem), but I don't
> >   mind if the majority of the implementation is written in Python.
> >   Using Python makes it easy to subclass.
> 
> I posited once before that the cost of import is mostly I/O rather than
> CPU, so using Python should not be an issue. MAL demonstrated that a good
> design for the Importer classes is also required. Based on this, I'm a
> *strong* advocate of moving as much as possible into Python (to get
> Python's ease-of-coding with little relative cost).

Agreed.  But how do you explain the slowdown (from 9 to 13
seconds, as I recall)?  Are you a lousy coder? :-)

> The (core) C code should be able to search a path for a module and import
> it. It does not require dynamic loading or packages. This will be used to
> import exceptions.py, then imputil.py, then site.py.

It does, however, need to import builtin modules.  imputil currently
imports imp, sys, strop and __builtin__, struct and marshal; note that
struct can easily be a dynamic loadable module, and so could strop in
theory.  (Note that strop will be unnecessary in 1.6 if you use string
methods.)

I don't think that this chicken-or-egg problem is particularly
problematic though.

> The platform-specific module that performs dynamic-loading must be a
> statically linked module (in Modules/ ... it doesn't have to be in the
> Python/ directory).

See earlier comments.

> site.py can complete the bootstrap by setting up sys.importers with the
> appropriate Importer instances (this is where an application can define
> its own policy). sys.path was initially set by the import.c bootstrap code
> (from the compiled-in path and environment variables).

I think that algorithm (currently in getpath.c / getpathp.c) might
also be moved to Python code -- imported frozen.  Sadly, rebuilding
with a new version of a frozen module might be more complicated than
rebuilding with a new version of a C module, but writing and
maintaining this code in Python would be *sooooooo* much easier that I
think it's worth it.

> Note that imputil.py would not install any hooks when it is loaded. That
> is up to site.py. This implies the core C code will import a total of
> three modules using its builtin system. After that, the imputil mechanism
> would be importing everything (site.py would .install() an Importer which
> then takes over the __import__ hook).

(Three not counting the builtin modules.)

> Further note that the "import" Python statement could be simplified to use
> only the hook. However, this would require the core importer to inject
> some module names into the imputil module's namespace (since it couldn't
> use an import statement until a hook was installed). While this
> simplification is "neat", it complicates the run-time system (the import
> statement is broken until a hook is installed).

Same chicken-or-egg.  We can be pragmatic.

For a developer, I'd like a bit of robustness (all this makes it
rather hard to debug a broken imputil, and that's a fair amount of
code!).

> Therefore, the core C code must also support importing builtins. "sys" and
> "imp" are needed by imputil to bootstrap.
> 
> The core importer should not need to deal with dynamic-load modules.

Same question.  Since that all has to be coded in C anyway, why not?

> To support frozen apps, the core importer would need to support loading
> the three modules as frozen modules.

I'd like to see a description of how someone like Jim A would build a
single-file application using the new mechanism.  This could
completely replace freeze.  (Freeze currently requires a C compiler;
that's bad.)

> The builtin/frozen importing would be exposed thru "imp" for use by
> imputil for future imports. imputil would load and use the (builtin)
> platform-specific module to do dynamic-load imports.

Sure.

> > - In order to support importing from zip/jar files using compression,
> >   we'd at least need the zlib extension module and hence libz itself,
> >   which may not be available everywhere.
> 
> Yes. I don't see this as a requirement, though. We wouldn't start to use
> these by default, would we? Or insist on zlib being present? I see this as
> more along the lines of "we have provided a standardized Importer to do
> this, *provided* you have zlib support."

Agreed.  Zlib support is easy to get, but there are probably platforms
where it's not.  (E.g. maybe the Mac?  I suppose that on the Mac,
there would be some importer classes to import from a resource fork.)

> > - I suppose that the bootstrap is solved using a mechanism very
> >   similar to what freeze currently uses (other solutions seem to be
> >   platform dependent).
> 
> The bootstrap that I outlined above could be done in C code. The import
> code would be stripped down dramatically because you'll drop package
> support and dynamic loading.

Not the dynamic loading.  But yes the package support.

> Alternatively, you could probably do the path-scanning in Python and
> freeze that into the interpreter. Personally, I don't like this idea as it
> would not buy you much at all (it would still need to return to C for
> accessing a number of scanning functions and module importing funcs).
> 
> > - I also want to still support importing *everything* from the
> >   filesystem, if only for development.  (It's hard enough to deal with
> >   the fact that exceptions.py is needed during Py_Initialize();
> >   I want to be able to hack on the import code written in Python
> >   without having to rebuild the executable all the time.)
> 
> My outline above does not freeze anything. Everything resides in the
> filesystem. The C code merely needs a path-scanning loop and functions to
> import .py*, builtin, and frozen types of modules.

Good.  Though I think there's also a need for freezing everything.
And when we go the route of the zip archive, the zip archive handling
code needs to be somewhere -- frozen seems to be a reasonable choice.

> If somebody nukes their imputil.py or site.py, then they return to Python
> 1.4 behavior where the core interpreter uses a path for importing (i.e. no
> packages). They lose dynamically-loaded module support.

But if the path guessing is also done by site.py (as I propose) the
path will probably be wrong.  A warning should be printed.

> > Let's first complete the requirements gathering.  Are these
> > requirements reasonable?  Will they make an implementation too
> > complex?  Am I missing anything?
> 
> I'm not a fan of the compositing due to it requiring a change to semantics
> that I believe are very useful and very clean. However, I outlined a
> possible, clean solution to do that (a secondary set of hooks for
> transforming get_code() return values).

As you may see from my responses, I'm a big fan of having several
different sets of hooks.  I do withdraw the composition requirement
though.

> The requirements are otherwise reasonable to me, as I see that they can
> all be readily solved (i.e. they aren't burdensome).
> 
> While this email may be long, I do not believe the resulting system would
> be complex. From the user-visible side of things, nothing would be
> changed. sys.path is still present and operates as before. They *do* have
> new functionality they can grow into, though (sys.importers). The
> underlying C code is simplified, and the platform-specific dynamic-load
> stuff can be distributed to distinct modules, as needed
> (e.g. BeOS/dynloadmodule.c and PC/dynloadmodule.c).
> 
> > Finally, to what extent does this impact the desire for dealing
> > differently with the Python bytecode compiler (e.g. supporting
> > optimizers written in Python)?  And does it affect the desire to
> > implement the read-eval-print loop (the >>> prompt) in Python?
> 
> If the three startup files require byte-compilation, then you could have
> some issues (i.e. the byte-compiler must be present).

Another chicken-or-egg.  No biggie.

> Once you hit site.py, you have a "full" environment and can easily detect
> and import a read-eval-print loop module (i.e. why return to Python? just 
> start things up right there).

You mean "why return to C?"  I agree.  It would be cool if somehow
IDLE and Pythonwin would also be bootstrapped using the same
mechanisms.  (This would also solve the question "which interactive
environment am I using?" that some modules and apps want to see
answered because they need to do things differently when run under
IDLE, for example.)

> site.py can also install new optimizers as desired, a new Python-based
> parser or compiler, or whatever...  If Python is built without a parser or
> compiler (I hope that's an option!), then the three startup modules would
> simply be frozen into the executable.

More power to hooks!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Thu Dec  2 21:22:33 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 2 Dec 1999 16:22:33 -0500 (EST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912022043.PAA15108@eric.cnri.reston.va.us>
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org>
 <199912022043.PAA15108@eric.cnri.reston.va.us>
Message-ID: <14406.58137.359127.921135@weyr.cnri.reston.va.us>

Guido van Rossum writes:
 > variable.  Every piece of code that I've ever seen that uses sys.path
 > doesn't care if a directory named in sys.path doesn't exist -- it may
 > try to stat various files in it, which also don't exist, and as far as

  Not the case -- I know you've looked at some of my code in the KOE
that ensures only real directories are on the path, and each is only
there once (pathhack.py).  Given that sys.path is often too long and
includes duplicate entries in a large system (often one entry with and
one without a trailing / for a given directory), it is useful to be able
to distinguish between things that should be interpretable as paths
and things that aren't.  It should not be hard to declare that
"cookies" or whatever have some special form, like "<cookie>".

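  A sketch of that kind of clean-up, in present-day Python.  This is
not the KOE's pathhack.py; clean_path is a hypothetical name, and the
"<...>" test assumes the special cookie form suggested above:

```python
import os

def clean_path(entries):
    """Keep real directories once each; pass "<cookie>" entries through."""
    seen = set()
    result = []
    for entry in entries:
        if entry.startswith("<") and entry.endswith(">"):
            # Not meant to be interpreted as a path at all.
            result.append(entry)
            continue
        # normpath drops a trailing separator, so "dir" and "dir/" match.
        normalized = os.path.normpath(entry)
        if not os.path.isdir(normalized) or normalized in seen:
            continue
        seen.add(normalized)
        result.append(normalized)
    return result
```

  With a reserved form for cookies, the directory test and the
duplicate test never have to guess what a non-path entry means.
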
 > (Unrelated remark: I should really try to release the set of modules
 > we've written here at CNRI to deal with zip files.  Unfortunately zip
 > files are hairy and so is our code.)

  It doesn't help that that code just plain stinks.  I maintain that
no one here understands the whole of it.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From jcw@equi4.com  Thu Dec  2 21:41:46 1999
From: jcw@equi4.com (Jean-Claude Wippler)
Date: Thu, 02 Dec 1999 22:41:46 +0100
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> <199912022043.PAA15108@eric.cnri.reston.va.us>
Message-ID: <3846E79A.446EAFD5@equi4.com>

Guido van Rossum wrote:

[...]
> Note that the interpretation of __file__ could be problematic.  To
> what value do you set __file__ for a module loaded from a zip archive?

Makefiles use "archive(entry)" (this also supports nesting if needed).
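
A small sketch of how that notation could be produced and taken apart,
in present-day Python.  The function names are hypothetical, and the
parser assumes that the archive name itself contains no "(":

```python
def format_archive_member(archive, entry):
    """Build the Makefile-style 'archive(entry)' string."""
    return "%s(%s)" % (archive, entry)

def split_archive_member(value):
    """Split 'archive(entry)' into (archive, entry), or return None."""
    if not value.endswith(")") or "(" not in value:
        return None
    # Split on the first "(", so a nested entry stays intact.
    archive, entry = value[:-1].split("(", 1)
    return archive, entry
```

Splitting on the first parenthesis keeps a nested value such as
outer.zip(inner.zip(mod.pyc)) intact, so the entry can be split again.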

[...] 
> I'd like to see a description of how someone like Jim A would build a
> single-file application using the new mechanism.  This could
> completely replace freeze.  (Freeze currently requires a C compiler;
> that's bad.)
[...]

This may be off-topic, but has anyone considered what it would take to
load shared libs out of an archive?  One way is to extract on-the-fly to
a temporary area.  A refinement is to leave extracted files there as
cache, and perhaps even to extract to a file with a name derived from
its MD5 digest (this way multiple users and even Python installations
can share the cache).  Would it be useful to define a "standard" area?
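
A sketch of the digest-named cache, in present-day Python.  The name
cached_extract, the cache_dir argument and the ".so" default suffix are
all assumptions; loading the extracted file is left out entirely:

```python
import hashlib
import os

def cached_extract(data, cache_dir, suffix=".so"):
    """Store `data` in cache_dir under an MD5-derived name; reuse if present."""
    digest = hashlib.md5(data).hexdigest()
    target = os.path.join(cache_dir, digest + suffix)
    if not os.path.exists(target):
        # Write to a private name first, then rename into place, so that
        # another process never sees a half-written file.
        temporary = "%s.tmp%d" % (target, os.getpid())
        with open(temporary, "wb") as handle:
            handle.write(data)
        os.replace(temporary, target)
    return target
```

Because the name depends only on the contents, two installations that
ship the same library end up sharing one cached file.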

-- Jean-Claude


From gmcm@hypernet.com  Thu Dec  2 23:15:50 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Thu, 2 Dec 1999 18:15:50 -0500
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912022043.PAA15108@eric.cnri.reston.va.us>
References: Your message of "Fri, 19 Nov 1999 05:29:50 PST."             <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org>
Message-ID: <1267945992-2611810@hypernet.com>

[Guido]
 big snip
> Note that the interpretation of __file__ could be problematic. 
> To what value do you set __file__ for a module loaded from a zip
> archive?

I just left it alone (ie, as it was when I picked up the .pyc). 
Turns out OK, because then when the end user files a bug 
report, the developer can track it down.

> Note: I looked at the doc string for get_code() and I don't
> understand what the difference is between the modname and fqname
> arguments.  If I write "import foo.bar", what are modname and
> fqname?  

As I recall:
 import foo.bar
 -> get_code(None, 'foo', 'foo') # returns foo
 -> get_code(<self>, 'bar', 'foo.bar')

> Why are both present?  

I think so the importer can choose between being tree 
structured or flat.
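
Spelled out as a sketch in present-day Python: get_code_calls is a
hypothetical helper that only lists the argument triples, and the
parent's dotted name stands in for whatever object is really passed
as the first argument:

```python
def get_code_calls(fqname):
    """List the (parent, modname, fqname) triples for a dotted import."""
    calls = []
    parent = None
    parts = fqname.split(".")
    for index, modname in enumerate(parts):
        partial = ".".join(parts[:index + 1])
        calls.append((parent, modname, partial))
        # Stand-in for the parent package object used in the next call.
        parent = partial
    return calls
```

A tree-structured importer can look only at (parent, modname); a flat
one can ignore the parent and key everything on the full dotted name.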

> I'd like to see a description of how someone like Jim A would
> build a single-file application using the new mechanism.  This
> could completely replace freeze.  (Freeze currently requires a C
> compiler; that's bad.)

I have something working for Linux now. I froze exceptions.py. 
I hacked getpath.c so prefix = exec_prefix = executable's 
directory and the starting path is [prefix]. Although I did it 
differently, you could regard imputil.py and archive.py as 
frozen, too. (On Windows it's somewhat different, because the 
result uses the stock python15.dll.) This somewhat 
oversimplifies; and I haven't really thought out all the ways 
people might try to use sym links. I'm inclined to think the 
starting path should contain both the executable's real 
directory and the sym link's directory.
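
A sketch of that last idea, in present-day Python rather than in
getpath.c.  starting_path is a hypothetical name, and putting the real
directory first is an assumption; the order is not settled above:

```python
import os

def starting_path(executable):
    """Return the real directory of the executable, plus the directory
    of the symlink it was started through when that differs."""
    link_dir = os.path.dirname(os.path.abspath(executable))
    real_dir = os.path.dirname(os.path.realpath(executable))
    path = [real_dir]
    if link_dir != real_dir:
        path.append(link_dir)
    return path
```

When the executable is not reached through a symlink, both directories
coincide and the starting path has a single entry.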

> ....  I do withdraw the composition
> requirement though.

Hooray!


- Gordon


From gstein@lyra.org  Fri Dec  3 00:19:14 1999
From: gstein@lyra.org (Greg Stein)
Date: Thu, 2 Dec 1999 16:19:14 -0800 (PST)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <384694D8.DCA3D75E@lemburg.com>
Message-ID: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>

On Thu, 2 Dec 1999, M.-A. Lemburg wrote:
>...
> Still, I would like to rephrase my 0.02EUR which I already
> posted twice... why not start to think about what these
> importers would do first ? If there are only a handful of
> wishes we could just add them to the builtin machinery and
> be done with it...

I'd rather see the builtin machinery move to Python, regardless of what
system is used and/or what features are added.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein@lyra.org  Fri Dec  3 03:19:40 1999
From: gstein@lyra.org (Greg Stein)
Date: Thu, 2 Dec 1999 19:19:40 -0800 (PST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912022043.PAA15108@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912021654100.18529-100000@nebula.lyra.org>

On Thu, 2 Dec 1999, Guido van Rossum wrote:
>...
> Sometime, Greg Stein wrote:
>...
> > On Thu, 18 Nov 1999, Guido van Rossum wrote:
>...
> > > Agreed.  I like some of imputil's features, but I think the API
> > > needs to be redesigned.
> > 
> > In what ways? It sounds like you've applied some thought. Do you have any
> > concrete ideas yet, or "just a feeling" :-)  I'm working through some
> > changes from JimA right now, and would welcome other suggestions. I think
> > there may be some outstanding stuff from MAL, but I'm not sure (Marc?)
> 
> I actually think that the way the PVM (Python VM) calls the importer
> ought to be changed.  Assigning to __builtin__.__import__ is a crock.
> The API for __import__ is a crock.

Something like sys.set_import_hook() ?

The other alternative that I see would be to have the C code scan
sys.importers, assuming each are callable objects, and call them with the
appropriate params (e.g. module name). Of course, to move this scanning
into Python would require something like sys.set_import_hook() unless
Python looks for a hard-coded module and entrypoint.

>...
> > Which APIs are you referring to? The "imp" module? The C functions? The
> > __import__ and reload builtins?
> 
> > I'm guessing some of imp, the two builtins, and only one or two C
> > functions.
> 
> All of those.

We can provide Python code to provide compatibility for "imp" and the two
hooks. Nothing we can do to the C code, though. I'm not sure what the
import API looks like from C, and whether they could all stay. A brief
glance looks like most could stay.
[ removing any would change Python's API version, which might be "okay" ]

>...
> > > - load .py/.pyc/.pyo files and shared libraries from files
> > 
> > No problem. Again, a function is needed for platform-specific loading of
> > shared libraries.
> 
> Is it useful to expose the platform differences?  The current
> imp.load_dynamic() should suffice.

This comes up several times throughout this message, and in some off-list
mail Guido and I have exchanged. Namely, "should dynamic loading be part
of the core, or performed via a module?"

I would rather see it become a module, rather than inside the core
(despite the fact that the module would have to be compiled into the
interpreter). I believe this provides more flexibility for people looking
to replace/augment/update/fix dynamic loading on various architectures.
Rather than changing the core, a person can just drop in another module.
The isolation between the core and modules is nicer, aesthetically, to me.

The modules would also be exposing Just Another Importer Function, rather
than a specialized API in the builtin imp module. Also note that it is
easier to keep a module *out* of a Python-based application, than it is to
yank functions out of the core of Python. Frozen apps, embedded apps, etc
could easily leave out dynamic loading.

Are there strict advantages? Not any that I can think of right now (beyond
a bit of ease-of-use mentioned above). It just feels better to me.

>...
> > > - sys.path and sys.modules should still exist; sys.path might
> > > have a slightly different meaning
> > 
> > I would suggest that both retain their *exact* meaning. We introduce
> > sys.importers -- a list of importers to check, in sequence. The first
> > importer on that list uses sys.path to look for and load modules. The
> > second importer loads builtins and frozen code (i.e. modules not on
> > sys.path).
> 
> This is looking like the redesign I was looking for.  (Note that
> imputil's current chaining is not good since it's impossible to remove
> or reorder importers, which I think is a required feature; an explicit
> list would solve this.)

The chaining is an aspect of the current, singular import hook that Python
uses. In the past, I've suggested the installation of a "manager" that
maintains a list. sys.importers is similar in practice.

Note that this Manager would be present with the sys.set_import_hook()
scheme, while the Manager is implied if the core scans sys.importers.

> Actually, the order is the other way around, but by now you should
> know that.  It makes sense to have separate ones for builtin and
> frozen modules -- these have nothing in common.

Yes, JimA pointed this out. The latest imputil has corrected this.

I combined the builtin and frozen Importers because they were just so
similar. I didn't want to iterate over two Importers when a single one
sufficed quite well.

*shrug* Could go either way, really.

> There's another issue, which isn't directly addressed by imputil,
> although with clever use of inheritance it might be doable.  I'd like
> more support for this however.  Quite orthogonally to the issue of
> having separate importers, I might want to recognize new extensions.

Correct: while imputil doesn't address this, the standard/default Importer
classes *definitely* can.

>...
> the directory/directories with .isl files are placed.)  This requires
> an ugly modification to the _fs_import() function.  (Which should have
> been a method, by the way, to make overriding it in a subclass of
> PathImporter easier!)

I yanked that code out of the DirectoryImporter so that the PathImporter
could use it. I could see a reorg that creates a FileSystemImporter that
defines the method, and the other two just subclass from that.

> I've been thinking here along the lines of a strategy where the
> standard importer (the one that walks sys.path) has a set of hooks
> that define various things it could look for, e.g. .py files, .pyc
> files, .so or .dll files.  This list of hooks could be changed to
> support looking for .isl files.

Agreed. It should be easy to have a mapping of extension to handler.

One issue: should there be an ordering to the extensions? Exercise for the
reader to alter the data structures...
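
One possible answer to the ordering question, as a sketch only: keep a list
of (suffix, handler) pairs instead of a plain mapping, so that the search
order is explicit. The handler functions and find_in_directory() below are
hypothetical names, not part of imputil:

```python
import os

def load_source(path):
    """Hypothetical handler for .py files (just reports what it would do)."""
    return ("source", path)

def load_bytecode(path):
    """Hypothetical handler for .pyc files."""
    return ("bytecode", path)

# An ordered list rather than a dict: earlier suffixes are tried first.
SUFFIX_HANDLERS = [
    (".py", load_source),
    (".pyc", load_bytecode),
]

def find_in_directory(directory, modname, handlers=SUFFIX_HANDLERS,
                      exists=os.path.exists):
    """Return the result of the first handler whose file exists, else None."""
    for suffix, handler in handlers:
        path = os.path.join(directory, modname + suffix)
        if exists(path):
            return handler(path)
    return None
```

Supporting a new extension such as .isl would then be a matter of inserting
another pair at the desired position in the list.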

> There's an old, subtle issue that could be solved through this as
> well: whether or not a .pyc file without a .py file should be accepted
> or not.  Long ago (in Python 0.9.8) a .pyc file alone would never be
> loaded.  This was changed at the request of a small but vocal minority
> of Python developers who wanted to distribute .pyc files without .py
> files.  It has occasionally caused frustration because sometimes
> developers move .py files around but forget to remove the .pyc files,
> and then the .pyc file is silently picked up if it occurs on sys.path
> earlier than where the .py was moved to.

I think, "too bad for them."  :-)

Having just a .pyc is a very nice feature. But how can you tell whether it
was meant to be a plain .pyc or a mis-ordered one? To truly resolve that,
you would need to scan the whole path, looking for a .py. However, maybe
somebody put the .pyc there on purpose, to override the .py!

--- begin slightly-off-topic ---

Here is a neat little Bash script that allows you to use a .pyc as a CGI
(to avoid parse overhead). Normally, you can't just drop a .pyc into the
cgi-bin directory because the OS doesn't know how to execute it. Not a
problem, I say... just append your .pyc to the following Bash script and
execute! :-)

#!/bin/bash
exec - 3< $0 ; exec python -c 'import os,marshal ; f = os.fdopen(3, "rb")
; f.readline() ; f.readline() ; f.seek(8, 1) ; _c = marshal.load(f) ; del
os, marshal, f ; exec _c' $@

(the script should be two lines; and no... you can't use readlines(2))

The above script will preserve stdin, stdout, and stderr. If the caller
also uses 3< ... well, that got overridden :-)

The script doesn't work on Windows for two reasons, though: 1) Bash, 2)
the "rb" mode followed by readline()

Detailed info at the bottom of http://www.lyra.org/greg/python/
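
The core of the trick (skip the .pyc header, unmarshal the code object,
exec it) can be sketched in Python alone. This assumes a current CPython,
where the header is 16 bytes (3.7 and later) rather than the 8 bytes that
the seek(8, 1) above skips; run_pyc() and PYC_HEADER_SIZE are illustrative
names:

```python
import marshal

# Assumption: CPython 3.7+ writes a 16-byte header before the marshalled
# code object. In the Python 1.5 era it was 8 bytes (magic + mtime).
PYC_HEADER_SIZE = 16

def run_pyc(path, header_size=PYC_HEADER_SIZE):
    """Execute the code object stored in a .pyc file; return its namespace."""
    with open(path, "rb") as f:
        f.seek(header_size)      # skip the header, like f.seek(8, 1) above
        code = marshal.load(f)   # the rest of the file is one code object
    namespace = {"__name__": "__main__"}
    exec(code, namespace)
    return namespace
```

The marshal format is specific to the interpreter version, so the .pyc has
to be produced by the same Python that runs it.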

--- end of off-topic ---

> Having a set of hooks for various extensions would make it possible to
> have a default where lone .pyc files are ignored, but where one can
> insert a .pyc importer in the list of hooks that does the right thing
> here.  (Of course, it may be possible that this whole feature of lone
> .pyc files should be replaced since the same need is easily taken care
> of by zip importers.)

Maybe. I'd still like to see plain .pyc files, but I know I can work
around any change you might make here :-)

(i.e. whatever you'd like to do... go for it)

> I also want to support (Jim A notwithstanding :-) a feature whereby
> different things besides directories can live on sys.path, as long as
> they are strings -- these could be added from the PYTHONPATH env
> variable.  Every piece of code that I've ever seen that uses sys.path
> doesn't care if a directory named in sys.path doesn't exist -- it may
> try to stat various files in it, which also don't exist, and as far as
> it is concerned that is just an indication that the requested module
> doesn't live there.

I'm not in favor of this, but it is more-than-doable. Again: your
discretion...

> Again, we would have to dissect imputil to support various hooks that
> deal with different kind of entities in sys.path.  The default hook
> list would consist of a single item that interprets the name as a
> directory name; other hooks could support zip files or URLs.  Jack's
> "magic cookies" could also be supported nicely through such a
> mechanism.

Specifically, the PathImporter would get "dissected" :-). No problem.

> > Users can insert/append new importers or alter sys.path as before.
> > 
> > sys.modules continues to record name:module mappings.
> 
> Yes.
> 
> Note that the interpretation of __file__ could be problematic.  To
> what value do you set __file__ for a module loaded from a zip archive?

You don't (certainly not in a way that is nice/compatible for modules that
refer to it). This is why I don't like __file__ and __path__. They just
don't make sense in archives or frozen code. Python code that relies on
them will create problems when that code is placed into different
packaging mechanisms.

>...
> > > (I wouldn't mind a splitting up of importdl.c into several
> > > platform-specific files, one of which is chosen by the configure
> > > script; but that's a bit of a separate issue.)
> > 
> > Easy enough. The standard importer can select the appropriate
> > platform-specific module/function to perform the load. i.e. these can move
> > to Modules/ and be split into a module-per-platform.
> 
> Again: what's the advantage of exposing the platform specificity?

See above.

>...
> Probably more support is required from the other end: once it's common
> for modules to be imported from zip files, the distutil code needs to
> support the creation and installation of such zip files.  Also, there
> is a need for the install phase of distutil to communicate the
> location of the zip file to the Python installation.

I'm quite confident that something can be designed that would satisfy the
needs here. Something akin to .pth files that a zip importer could read.

>...
> > > - Standard import from zip or jar files, in two ways:
> > > 
> > >   (1) an entry on sys.path can be a zip/jar file instead of a directory;
> > >       its contents will be searched for modules or packages
> 
> Note that this is what I mention above for distutil support.
> 
> > While this could easily be done, I might argue against it. Old
> > apps/modules that process sys.path might get confused.
> 
> Above I argued that this shouldn't be a problem.

For most code, no, but as Fred mentioned (and I surmise), there are things
out there assuming that sys.path contains strings which specify
directories.

Sure, we can do this (your discretion), but my feeling is to avoid it.

> > If compatibility is not an issue, then "No problem."
> > 
> > An alternative would be an Importer instance added to sys.importers that
> > is configured for a specific archive (in other words, don't add the zip
> > file to sys.path, add ZipImporter(file) to sys.importers).
> 
> This would be harder for distutil: where does Python get the initial
> list of importers?

Default is just the two: BuiltinImporter and PathImporter. Adding
ZipImporters (or anything else) at startup is TBD, but shouldn't pose a
problem.
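
A rough sketch of an importer configured for one specific archive, assuming
the standard zipfile module and an archive that holds .py source (the
archives discussed in this thread store .pyc instead). ZipSourceImporter
and its label argument are hypothetical:

```python
import zipfile

class ZipSourceImporter:
    """Hypothetical importer bound to a single zip archive of .py files."""

    def __init__(self, archive, label="archive"):
        # archive may be a filename or a file-like object
        self.zf = zipfile.ZipFile(archive)
        self.label = label

    def get_code(self, fqname):
        entry = fqname.replace(".", "/") + ".py"
        try:
            source = self.zf.read(entry)
        except KeyError:
            return None  # not in this archive: let another importer try
        # "archive(entry)" is only an illustrative filename for tracebacks
        return compile(source, "%s(%s)" % (self.label, entry), "exec")
```

An instance would be appended to the list of importers; sys.path itself
stays untouched.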

>...
> > >   (2) a file in a directory that's on sys.path can be a zip/jar file;
> > >       its contents will be considered as a package (note that this is
> > >       different from (1)!)
> > 
> > No problem. This will slow things down, as a stat() for *.zip and/or *.jar
> > must be done, in addition to *.py, *.pyc, and *.pyo.
> 
> Fine, this is where the caching comes in handy.

IFF caching is enabled for the particular platform and installation.

>...
> > The Importer class is already designed for subclassing (and its interface 
> > is very narrow, which means delegation is also *very* easy; see
> > imputil.FuncImporter).
> 
> But maybe it's *too* narrow; some of the hooks I suggest above seem to
> require extra interfaces -- at least in some of the subclasses of the
> Importer base class.

Correct -- the *subclasses*. I still maintain the imputil design of a
single hook (get_code) is Right.

I'll make a swipe at PathImporter in the next few weeks to add the
capability for new extensions.

> Note: I looked at the doc string for get_code() and I don't understand
> what the difference is between the modname and fqname arguments.  If I
> write "import foo.bar", what are modname and fqname?  Why are both
> present?  Also, while you claim that the API is narrow, the multiple
> return values (also the different types for the second item) make it
> complicated.

Gordon detailed this in another note...

Yes, the multiple return values make it a bit more complicated, but I
can't think of any reasonable alternatives.

A bit more doc should do the trick, I'd guess.

>...
> > >   - a hook to auto-generate .py files from other filename
> > >     extensions (as currently implemented by ILU)
> > 
> > No problem at all.
> 
> See above -- I think this should be more integrated with sys.path than
> you are thinking of.  The more I think about it, the more I see that
> the problem is that for you, the importer that uses sys.path is a
> final subclass of Importer (i.e. it is itself not further subclassed).
> Several of the hooks I want seem to require additional hooks in the
> PathImporter rather than new importers.

Correct -- I've currently designed/implemented PathImporter as "final".

I don't foresee a problem turning it into something that can be hooked at
run-time, or subclassed at code-time. A detailing of the features needed 
would be handy:

* allow alternative file suffixes, with functions or subclasses to map the
  file into a code/module object.

>...
> > > - Note that different kinds of hooks should (ideally, and within
> > >   reason) properly combine, as follows: if I write a hook to recognize
> > >   .spam files and automatically translate them into .py files, and you
> > >   write a hook to support a new archive format, then if both hooks are
> > >   installed together, it should be possible to find a .spam file in an
> > >   archive and do the right thing, without any extra action.  Right?
> > 
> > Ack. Very, very difficult.
> 
> Actually, I take most of this back.  Importers that deal with new
> extension types often have to go through a file system to transform
> their data to .py files, and this is just too complicated.  However it
> would be still nice if there was code sharing between the code that
> looks for .py and .pyc files in a zip archive and the code that does
> the same in a filesystem.  Hm, maybe even that shouldn't be necessary,
> the zip file probably should contain only .pyc files...

Gordon replies to this... All of the archives that Gordon, JimA, and I
have been using only store .pyc files. I don't see much code sharing
between the filesystem and archive import code.

>...
> > All is not lost, however. I can easily envision the get_code() hook as
> > allowing any kind of return type. If it isn't a code or module object,
> > then another hook is called to transform it.
> > [ actually, I'd design it similarly: a *series* of hooks would be called
> >   until somebody transforms the foo.spam into a code/module object. ]
> 
> OK.  This could be a feature of a subclass of Importer.

That would be my preference, rather than loading more into the Importer
base class itself.
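
As an illustration of the "series of hooks" idea, here is a small sketch in
which get_code() may return anything and a secondary list of hooks is tried
until one of them yields a code or module object. The names transform() and
source_to_code() are made up for the example:

```python
import types

FINAL_TYPES = (types.CodeType, types.ModuleType)

def source_to_code(value):
    """Hypothetical hook: compile a source string into a code object."""
    if isinstance(value, str):
        return compile(value, "<transformed>", "exec")
    return None  # this hook does not understand the value

def transform(value, hooks):
    """Run hooks in order until value becomes a code or module object."""
    for hook in hooks:
        if isinstance(value, FINAL_TYPES):
            break
        result = hook(value)
        if result is not None:
            value = result
    if not isinstance(value, FINAL_TYPES):
        raise ImportError("no hook could transform %r" % type(value).__name__)
    return value
```

This keeps the Importer base class unchanged; only a subclass that wants the
extra step needs to call transform() on what its get_code() produced.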

>...
> > > - It should be possible to write hooks in C/C++ as well as Python
> > 
> > Use FuncImporter to delegate to an extension module.
> 
> Maybe not so great, since it sounds like the C code can't benefit from
> any of the infrastructure that imputil offers.  I'm not sure about
> this one though.

There isn't any infrastructure that needs to be accessed. get_code() is
the call-point, and there is no mechanism provided to the callee to call
back into the imputil system.

> > This is one of the benefits of imputil's single/narrow interface.
> 
> Plus its vague specs? :-)

Ouch. I thought I was actually doing quite a bit better than normal with
that long doc-string on get_code :-(

>...
> > For a restricted execution app, it might install an Importer that loads
> > files from *one* directory only which is configured from a specific
> > Win32 Registry entry. That importer could also refuse to load shared
> > modules. The BuiltinImporter would still be present (although the app
> > would certainly omit all but the necessary builtins from the build).
> > Frozen modules could be excluded.
> 
> Actually there's little reason to exclude frozen modules or any
> .py/.pyc modules -- by definition, bytecode can't be dangerous.  It's
> the builtins and extensions that need to be censored.
> 
> We currently do this by subclassing ihooks, where we mask the test for
> builtins with a comparison to a predefined list of names.

True. My concern is an invader misusing one "type" of module for another.
For example, let's say you've provided a selection of modules each
exporting function FOO, and the user can configure which module to use.
Can they do damage if some unrelated, frozen module also exports FOO?

Minor issue, anyhow. All the functionality is there.

>...
> > I posited once before that the cost of import is mostly I/O rather than
> > CPU, so using Python should not be an issue. MAL demonstrated that a good
> > design for the Importer classes is also required. Based on this, I'm a
> > *strong* advocate of moving as much as possible into Python (to get
> > Python's ease-of-coding with little relative cost).
> 
> Agreed.  However, how do you explain the slowdown (from 9 to 13
> seconds I recall) though?  Are you a lousy coder? :-)

Heh :-)

I have not spent *any* time working on optimization. Currently, each
Importer in the chain redoes some work of the prior Importer. A bit of
restructuring would split the common work out to a Manager, which then
calls a method in the Importer (and passes all the computed work). Of
course, a bit of profiling wouldn't hurt either. Some of the "imp"
interfaces could possibly be refined to better support the BuiltinImporter
or the dynamic load features.

The question is still valid, though -- at the moment, I can't explain it
because I haven't looked into it.

> > The (core) C code should be able to search a path for a module and import
> > it. It does not require dynamic loading or packages. This will be used to
> > import exceptions.py, then imputil.py, then site.py.

Note: after writing this, I realized there is really no need for the core
to do the imputil import. site.py can easily do that.

> It does, however, need to import builtin modules.  imputil currently

Correct.

> imports imp, sys, strop and __builtin__, struct and marshal; note that
> struct can easily be a dynamic loadable module, and so could strop in
> theory.  (Note that strop will be unnecessary in 1.6 if you use string
> methods.)

I knew about strop, but imputil would be harder to use today if it relied
on the string methods. So... I've delayed that change.

The struct module is used in a couple teeny cases, dealing with
constructing a network-order, 4-byte, binary integer value. It would be
easy enough to just do that with a bit of Python code instead.
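
For instance, a network-order (big-endian), 4-byte unsigned value can be
built with shifts and masks alone. This is a sketch in modern Python 3;
the function names are illustrative and not taken from imputil:

```python
def pack_uint32_be(value):
    """Encode an integer as 4 bytes, most significant byte first."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value does not fit in 32 unsigned bits")
    return bytes([
        (value >> 24) & 0xFF,
        (value >> 16) & 0xFF,
        (value >> 8) & 0xFF,
        value & 0xFF,
    ])

def unpack_uint32_be(data):
    """Decode 4 big-endian bytes back into an integer."""
    if len(data) != 4:
        raise ValueError("expected exactly 4 bytes")
    b0, b1, b2, b3 = data
    return (b0 << 24) | (b1 << 16) | (b2 << 8) | b3
```

The result matches struct.pack(">I", value), so the struct import could be
dropped for these cases.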

> I don't think that this chicken-or-egg problem is particularly
> problematic though.

Right.

In my ideal world, the core couldn't do a dynamic load, so that would need
to be considered within the bootstrap process.

>...
> > site.py can complete the bootstrap by setting up sys.importers with the
> > appropriate Importer instances (this is where an application can define
> > its own policy). sys.path was initially set by the import.c bootstrap code
> > (from the compiled-in path and environment variables).
> 
> I think that algorithm (currently in getpath.c / getpathp.c) might
> also be moved to Python code -- imported frozen.  Sadly, rebuilding
> with a new version of a frozen module might be more complicated than
> rebuilding with a new version of a C module, but writing and
> maintaining this code in Python would be *sooooooo* much easier that I
> think it's worth it.

I think we can find a better way to freeze modules and to use them.
Especially for the cases where we have specific "core" functions
implemented in Python. (e.g. freezing parsers, compilers, and/or the
read-eval loop)

I don't foresee an issue that the build process becomes more complicated.
If we nuke "makesetup" in favor of a Python script, then we could create a
stub Python executable which runs the build script which writes the Setup
file and the getpath*.c file(s).

> > Note that imputil.py would not install any hooks when it is loaded. That
> > is up to site.py. This implies the core C code will import a total of
> > three modules using its builtin system. After that, the imputil mechanism
> > would be importing everything (site.py would .install() an Importer which
> > then takes over the __import__ hook).
> 
> (Three not counting the builtin modules.)

Correct, although I'll modify my statement to "two plus the builtins".

> > Further note that the "import" Python statement could be simplified to use
> > only the hook. However, this would require the core importer to inject
> > some module names into the imputil module's namespace (since it couldn't
> > use an import statement until a hook was installed). While this
> > simplification is "neat", it complicates the run-time system (the import
> > statement is broken until a hook is installed).
> 
> Same chicken-or-egg.  We can be pragmatic.
> 
> For a developer, I'd like a bit of robustness (all this makes it
> rather hard to debug a broken imputil, and that's a fair amount of
> code!).

True. I threw that out as an alternative, and then presented the counter
argument :-)

>...
> > Therefore, the core C code must also support importing builtins. "sys" and
> > "imp" are needed by imputil to bootstrap.
> > 
> > The core importer should not need to deal with dynamic-load modules.
> 
> Same question.  Since that all has to be coded in C anyway, why not?

It simplifies the core's import code to not deal with that stuff at all.

> > To support frozen apps, the core importer would need to support loading
> > the three modules as frozen modules.
> 
> I'd like to see a description of how someone like Jim A would build a
> single-file application using the new mechanism.  This could
> completely replace freeze.  (Freeze currently requires a C compiler;
> that's bad.)

The portable mechanism for freezing will always need a compiler. Platform
specific mechanisms (e.g. append to the .EXE, or use the linker to create
a new ELF section) can optimize the freeze process in different ways.

I don't have a design in my head for the freeze issues -- I've been
considering that the mechanism would remain about the same. However, I can
easily see that different platforms may want to use different freeze
processes... hmm...

>...
> > Yes. I don't see this as a requirement, though. We wouldn't start to use
> > these by default, would we? Or insist on zlib being present? I see this as
> > more along the lines of "we have provided a standardized Importer to do
> > this, *provided* you have zlib support."
> 
> Agreed.  Zlib support is easy to get, but there are probably platforms
> where it's not.  (E.g. maybe the Mac?  I suppose that on the Mac,
> there would be some importer classes to import from a resource fork.)

Exactly. And importer classes to load from Win32 resources (modifying a
.EXE's resources post-link is cleaner than the append solution)

>...
> > My outline above does not freeze anything. Everything resides in the
> > filesystem. The C code merely needs a path-scanning loop and functions to
> > import .py*, builtin, and frozen types of modules.
> 
> Good.  Though I think there's also a need for freezing everything.
> And when we go the route of the zip archive, the zip archive handling
> code needs to be somewhere -- frozen seems to be a reasonable choice.

Sure.

> > If somebody nukes their imputil.py or site.py, then they return to Python
> > 1.4 behavior where the core interpreter uses a path for importing (i.e. no
> > packages). They lose dynamically-loaded module support.
> 
> But if the path guessing is also done by site.py (as I propose) the
> path will probably be wrong.  A warning should be printed.

All right. Doesn't Python already print a warning if it can't find
site.py?

> > > Let's first complete the requirements gathering.  Are these
> > > requirements reasonable?  Will they make an implementation too
> > > complex?  Am I missing anything?
> > 
> > I'm not a fan of the compositing due to it requiring a change to semantics
> > that I believe are very useful and very clean. However, I outlined a
> > possible, clean solution to do that (a secondary set of hooks for
> > transforming get_code() return values).
> 
> As you may see from my responses, I'm a big fan of having several
> different sets of hooks.

Yes. However, I've only recognized one so far. Propose more... I'm
confident we can update the PathImporter design to accommodate (and retain
the underlying imputil paradigm).

> I do withdraw the composition requirement
> though.

:-)

>...
> > Once you hit site.py, you have a "full" environment and can easily detect
> > and import a read-eval-print loop module (i.e. why return to Python? just 
> > start things up right there).
> 
> You mean "why return to C?"  I agree.  It would be cool if somehow

Heh. Yah, that's what I meant :-)

> IDLE and Pythonwin would also be bootstrapped using the same
> mechanisms.  (This would also solve the question "which interactive
> environment am I using?" that some modules and apps want to see
> answered because they need to do things differently when run under
> IDLE, for example.)

Haven't thought on this. Should be doable, I'd think.

> > site.py can also install new optimizers as desired, a new Python-based
> > parser or compiler, or whatever...  If Python is built without a parser or
> > compiler (I hope that's an option!), then the three startup modules would
> > simply be frozen into the executable.
> 
> More power to hooks!

:-) You betcha!

I believe my next order of business:

* update PathImporter with the file-extension hook
* dynload C code reorg, per the other email
* create new-model site.py and trash import.c
* review freeze mechanisms and process
* design mechanism for frozen core functionality (e.g. getpath*.c)
  (coding and building design)
* shift core functions to Python, using above design

I'll just plow ahead, but also recognize that any/all may change. i.e. I'll
build examples/finals/prototypes and Guido can pick/choose/reimplement/etc
as needed. I'm out next week, but should start on the above items by the
end of the month (will probably do another mod_dav release in there
somewhere).

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From fredrik@pythonware.com  Fri Dec  3 10:10:10 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 3 Dec 1999 11:10:10 +0100
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com>
Message-ID: <023601bf3d78$0ec3dc30$f29b12c2@secret.pythonware.com>

Jean-Claude Wippler <jcw@equi4.com> wrote:
> This may be off-topic, but has anyone considered what it would take to
> load shared libs out of an archive?

well, we do that in a number of applications.

(lazy installers are really cool... if you've installed works,
you've seen some weird stuff -- for example, when the
application starts the first time, it's loading everything
from inside the installer.  the rest of the installation is
done from within the application itself, using archives
in the installation executable)

I think things like this are better left for the application
designers, though...

</F>


From mal@lemburg.com  Fri Dec  3 10:03:31 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 03 Dec 1999 11:03:31 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>
Message-ID: <38479573.B2CFDD2B@lemburg.com>

Greg Stein wrote:
> 
> On Thu, 2 Dec 1999, M.-A. Lemburg wrote:
> >...
> > Still, I would like to rephrase my 0.02EUR which I already
> > posted twice... why not start to think about what these
> > importers would do first ? If there are only a handful of
> > wishes we could just add them to the builtin machinery and
> > be done with it...
> 
> I'd rather see the builtin machinery move to Python, regardless of what
> system is used and/or what features are added.

In the long run that's probably the right direction, but right now
we are only talking a very small set of additional features,
which can easily be added to the existing code without too much
fuss.

Plus it won't slow things down, which is important since
Python startup time is already an issue all by itself. The
imputil.py approach of doing (a whole bunch of) recursive Python
function calls to all kinds of importers will not speed this up,
I'm afraid. An on-disk lookup table would speed this up, but
it would also break the current logic in imputil.py, which
puts importer independence above all.

--

IMHO, we should retreat to a more centralized interface,
one which more resembles a manager rather than the agent
interface implemented in imputil.py. Add-ons can then
register themselves to say "hey, I can handle pyz-archives"
or "I know how to import .so modules" or "I provide a
search function which you can call to have me scan
my module container (directory, web-site, archive)".

The manager would take care of what to call and in which
order, plus delegate requests to add-ons which implement
the needed logic, e.g. add-ons for signature checking, unzipping
archives, file system lookup tables, etc.

It could also trace its actions and then keep an on-disk
knowledge base for what it did in the past to find certain
modules under certain conditions.
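
A minimal sketch of such a manager, under the assumption that add-ons
register a plain search function and that lower priority numbers are tried
first. ImportManager, register() and find() are illustrative names only;
the in-memory trace stands in for the on-disk knowledge base:

```python
class ImportManager:
    """Hypothetical central manager that add-ons register themselves with."""

    def __init__(self):
        self._handlers = []  # (priority, name, finder), kept sorted
        self.trace = []      # record of (modname, handler name, found?)

    def register(self, name, finder, priority=0):
        """Add a finder; finder(modname) returns a location or None."""
        self._handlers.append((priority, name, finder))
        self._handlers.sort(key=lambda entry: entry[0])

    def find(self, modname):
        """Ask each registered finder in priority order; first hit wins."""
        for _priority, name, finder in self._handlers:
            location = finder(modname)
            self.trace.append((modname, name, location is not None))
            if location is not None:
                return name, location
        return None

manager = ImportManager()
manager.register("filesystem", {"os": "/lib/os.py"}.get, priority=10)
manager.register("pyz-archive", {"spam": "app.pyz(spam)"}.get, priority=0)
```

The manager, not the add-ons, decides the calling order, which is the main
difference from the agent-style interface in imputil.py.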

Anyway, all this is extra magic for some future version of
Python.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    28 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido@CNRI.Reston.VA.US  Fri Dec  3 13:45:07 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 03 Dec 1999 08:45:07 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Your message of "Fri, 03 Dec 1999 11:03:31 +0100."
 <38479573.B2CFDD2B@lemburg.com>
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>
 <38479573.B2CFDD2B@lemburg.com>
Message-ID: <199912031345.IAA16376@eric.cnri.reston.va.us>

[Greg]
> > I'd rather see the builtin machinery move to Python, regardless of what
> > system is used and/or what features are added.

[Marc]
> In the long run that's probably the right direction, but right now
> we are only talking a very small set of additional features,
> which can easily be added to the existing code without too much
> fuss.

I disagree.  We should do the redesign right rather than tweaking the
existing code.

> Plus it won't slow things down, which is important since
> Python startup time is already an issue all by itself. The
> imputil.py approach of doing (a whole bunch of) recursive Python
> function calls to all kinds of importers will not speed this up,
> I'm afraid. An on-disk lookup table would speed this up, but
> it would also break the current logic in imputil.py, which
> puts importer independence above all.

I don't care about the current logic in imputil.  It's only a prototype!

> IMHO, we should retreat to a more centralized interface,
> one which more resembles a manager rather than the agent
> interface implemented in imputil.py. Add-ons can then
> register themselves to say "hey, I can handle pyz-archives"
> or "I know how to import .so modules" or "I provide a
> search function which you can call to have me scan
> my module container (directory, web-site, archive)".

This makes sense.

> The manager would take care of what to call and in which
> order, plus delegate requests to add-ons which implement
> the needed logic, e.g. add-ons for signature checking, unzipping
> archives, file system lookup tables, etc.
> 
> It could also trace its actions and then keep an on-disk
> knowledge base for what it did in the past to find certain
> modules under certain conditions.
> 
> Anyway, all this is extra magic for some future version of
> Python.

I would say the manager API design and a basic set of specific
handlers should go into 1.6.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@pythonware.com  Fri Dec  3 14:14:00 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 3 Dec 1999 15:14:00 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>             <38479573.B2CFDD2B@lemburg.com>  <199912031345.IAA16376@eric.cnri.reston.va.us>
Message-ID: <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com>

MAL wrote:
> > IMHO, we should retreat to a more centralized interface,
> > one which more resembles a manager rather than the agent
> > interface implemented in imputil.py. Add-ons can then
> > register themselves to say "hey, I can handle pyz-archives"
> > or "I know how to import .so modules" or "I provide a
> > search function which you can call to have me scan
> > my module container (directory, web-site, archive)".

but why?  in my small-minded view of how python
works, an importer carries out a very simple task:

    given a name, check if you have a
    module with that name, and install
    it.  if you cannot, fail (in which case
    python asks the next importer along
    the path).

why do you have to complicate things beyond that?
why not just let Python provide a few base classes
and mixins for people who want to create custom
importers, and be done with it?

rationale, please.

</F>


From jim@interet.com  Fri Dec  3 14:34:40 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Fri, 03 Dec 1999 09:34:40 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org> <38479573.B2CFDD2B@lemburg.com>
Message-ID: <3847D500.53833D06@interet.com>

"M.-A. Lemburg" wrote:
> 
> Greg Stein wrote:

> > I'd rather see the builtin machinery move to Python, regardless of what
> > system is used and/or what features are added.
> 
> In the long run that's probably the right direction, but right now
> we are only talking a very small set of additional features,
> which can easily be added to the existing code without too much
> fuss.

I volunteer to write a Python archive in either Python or C.  In
fact I currently have prototypes for both.  But I have to agree
with Greg here.  I think a Python importer is the way to go.  The
C code is 300 lines mostly in import.c and parallel to existing code.
The Python archive is about 100 lines and is prettier, easy to read,
alter and re-use (obviously).

> Plus it won't slow things down, which is important since
> Python startup time is already an issue all by itself. The

I think archive files should be able to be fast, and should
help, not hurt, startup time.  Provided that the use of sys.path
is curtailed, os.readdir() is not needed, and the
specifications are not complicated.

Although archive files are my special concern, I realize that
imputil is not just about archives.

JimA


From guido@CNRI.Reston.VA.US  Fri Dec  3 14:39:25 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 03 Dec 1999 09:39:25 -0500
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: Your message of "Thu, 02 Dec 1999 19:19:40 PST."
 <Pine.LNX.4.10.9912021654100.18529-100000@nebula.lyra.org>
References: <Pine.LNX.4.10.9912021654100.18529-100000@nebula.lyra.org>
Message-ID: <199912031439.JAA16524@eric.cnri.reston.va.us>

Greg,

Great response.  I think we know where we each stand.  Please go ahead
with a new design.  (That's trust, not carte blanche.)

Just one thought: the more I think about it, the less I like
sys.importers: functionality which is implemented through
sys.importers must necessarily be placed either in front of all of
sys.path or after it.  While this is helpful for "canned" apps that
want *everything* to be imported from a fixed archive, I think that
for regular Python installations sys.path should remain the point of
attack.  In particular, installing a new package (e.g. PIL) should
affect sys.path, regardless of the way of delivery of the modules
(shared libs, .py files, .pyc files, or a zip archive).

I'm not too worried about code that inspects sys.path and expects
certain invariants; that code is most likely interfering with the
import mechanism so should be revisited anyway.

On the lone .pyc issue: I'd like to see this disappear when using the
filesystem; I see no use for it there if we support .pyc files in zip
archives.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim@interet.com  Fri Dec  3 14:44:54 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Fri, 03 Dec 1999 09:44:54 -0500
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com>
Message-ID: <3847D766.1E5FFAF3@interet.com>

Jean-Claude Wippler wrote:
> 
> Guido van Rossum wrote:
> 
> [...]
> > Note that the interpretation of __file__ could be problematic.  To
> > what value do you set __file__ for a module loaded from a zip archive?
> 
> Makefiles use "archive(entry)" (this also supports nesting if needed).

I discovered the hard way this entry is not optional.  I just
used the archive file name for __file__.

> This may be off-topic, but has anyone considered what it would take to
> load shared libs out of an archive?  One way is to extract on-the-fly to
> a temporary area.  A refinement is to leave extracted files there as
> cache, and perhaps even to extract to a file with a name derived from
> its MD5 digest (this way multiple users and even Python installations
> can share the cache).  Would it be useful to define a "standard" area?

IMHO putting shared libs in an archive is a bad idea because the OS
can not use them there.  They must be extracted as you say.  But then
storage is wasted by using space in the archive and the external file.
Deleting them after use wastes time.  Better to leave them out of the
archive and provide for them in the installer.  IMHO the
archive is a basic simple feature, and people make installers on top
of that.  Archives shouldn't try to do it all.
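
For reference, the extract-to-cache idea quoted above could be sketched
roughly as follows.  This is present-day Python, not anything that exists
in 1.5.2, and the function name extract_cached and its parameters are
hypothetical.  It assumes a zip archive and uses the MD5 digest of the
entry's bytes as the cache file name, as the quoted text suggests.

```python
import hashlib
import os
import tempfile
import zipfile


def extract_cached(archive_path, entry, cache_dir):
    """Extract `entry` into cache_dir under a digest-derived name.

    Hypothetical helper: the name and signature are illustrative only.
    """
    with zipfile.ZipFile(archive_path) as archive:
        data = archive.read(entry)
    # Identical bytes always map to the same file name, so several
    # users or installations could share one cache directory.
    digest = hashlib.md5(data).hexdigest()
    suffix = os.path.splitext(entry)[1]
    target = os.path.join(cache_dir, digest + suffix)
    if not os.path.exists(target):
        # Write under a temporary name first so that a concurrent
        # reader never sees a half-written file.
        fd, tmp = tempfile.mkstemp(dir=cache_dir)
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp, target)
    return target
```

The second call for the same entry finds the file already in place, so
the extraction cost is paid only once.  The doubled storage that Jim
objects to is still there.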

JimA


From mal@lemburg.com  Fri Dec  3 14:14:09 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 03 Dec 1999 15:14:09 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>
 <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us>
Message-ID: <3847D030.2C936E24@lemburg.com>

Guido van Rossum wrote:
> 
> [Greg]
> > > I'd rather see the builtin machinery move to Python, regardless of what
> > > system is used and/or what features are added.
> 
> [Marc]
> > In the long run that's probably the right direction, but right now
> > we are only talking a very small set of additional features,
> > which can easily be added to the existing code without too much
> > fuss.
> 
> I disagree.  We should do the redesign right rather than tweaking the
> existing code.

Ok, then...
 
> > IMHO, we should retreat to a more centralized interface,
> > one which more resembles a manager rather than the agent
> > interface implemented in imputil.py. Add-ons can then
> > register themselves to say "hey, I can handle pyz-archives"
> > or "I know how to import .so modules" or "I provide a
> > search function which you can call to have me scan
> > my module container (directory, web-site, archive)".
> 
> This makes sense.
> 
> > The manager would take care of what to call and in which
> > order, plus delegate requests to add-ons which implement
> > the needed logic, e.g. add-ons for signature checking, unzipping
> > archives, file system lookup tables, etc.
> >
> > It could also trace its actions and then keep an on-disk
> > knowledge base for what it did in the past to find certain
> > modules under certain conditions.
> >
> > Anyway, all this is extra magic for some future version of
> > Python.
> 
> I would say the manager API design and a basic set of specific
> handlers should go into 1.6.

BTW, is there a timeline for the 1.6 release ? I mean which
things will have to be in 1.6 ?

Some recent topics as hints:

1. Unicode
2. Import Manager API + default handlers
3. Python style coercion at C type level
4. Rich comparisons
5. __doc__ string extraction tool

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    28 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Fri Dec  3 14:24:04 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 03 Dec 1999 15:24:04 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>             <38479573.B2CFDD2B@lemburg.com>  <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com>
Message-ID: <3847D284.8CBF2A9C@lemburg.com>

Fredrik Lundh wrote:
> 
> MAL wrote:
> > > IMHO, we should retreat to a more centralized interface,
> > > one which more resembles a manager rather than the agent
> > > interface implemented in imputil.py. Add-ons can then
> > > register themselves to say "hey, I can handle pyz-archives"
> > > or "I know how to import .so modules" or "I provide a
> > > search function which you can call to have me scan
> > > my module container (directory, web-site, archive)".
> 
> but why?  in my small-minded view of how python
> works, an importer carries out a very simple task:
> 
>     given a name, check if you have a
>     module with that name, and install
>     it.  if you cannot, fail (in which case
>     python asks the next importer along
>     the path).
> 
> why do you have to complicate things beyond that?
> why not just let Python provide a few base classes
> and mixins for people who want to create custom
> importers, and be done with it?

Because importing in Python has become *much* more
complicated over time. There are requests for new
features which touch subjects such as storage mechanisms,
lookups, signatures (for trusted code), lazy imports, etc.

A chain of simple minded importers won't work together
too well, will duplicate work, and will degrade performance
considerably due to the many recursive function calls.
Also, centralized caching strategies are hard to implement
across import handlers.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    28 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From jeremy@cnri.reston.va.us  Fri Dec  3 16:47:54 1999
From: jeremy@cnri.reston.va.us (Jeremy Hylton)
Date: Fri, 3 Dec 1999 11:47:54 -0500 (EST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <14406.58137.359127.921135@weyr.cnri.reston.va.us>
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org>
 <199912022043.PAA15108@eric.cnri.reston.va.us>
 <14406.58137.359127.921135@weyr.cnri.reston.va.us>
Message-ID: <14407.62522.360386.757519@goon.cnri.reston.va.us>

>>>>> "FLD" == Fred L Drake, <fdrake@acm.org> writes:

  >> (Unrelated remark: I should really try to release the set of
  >> modules we've written here at CNRI to deal with zip files.
  >> Unfortunately zip files are hairy and so is our code.)

  FLD>   It doesn't help that that code just plain stinks.  I maintain
  FLD> that no one here understands the whole of it.

I'm all for improving the code and getting it out.  The real problem
is that interfaces have been glommed on for every new use of a Zip
file.  (You want to read one off a socket and extract files before
you've got the whole thing?  No problem! Add a new class.)  We need to
figure out the common patterns for using the archives and write a new
set of interfaces to support that.

Jeremy


From guido@CNRI.Reston.VA.US  Fri Dec  3 17:12:07 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 03 Dec 1999 12:12:07 -0500
Subject: [Python-Dev] What to do with our Zip code?
In-Reply-To: Your message of "Fri, 03 Dec 1999 11:47:54 EST."
 <14407.62522.360386.757519@goon.cnri.reston.va.us>
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> <199912022043.PAA15108@eric.cnri.reston.va.us> <14406.58137.359127.921135@weyr.cnri.reston.va.us>
 <14407.62522.360386.757519@goon.cnri.reston.va.us>
Message-ID: <199912031712.MAA17061@eric.cnri.reston.va.us>

[Jeremy, on our Zip code]
> I'm all for improving the code and getting it out.  The real problem
> is that interfaces have been glommed on for every new use of a Zip
> file.  (You want to read one off a socket and extract files before
> you've got the whole thing?  No problem! Add a new class.)  We need to
> figure out the common patterns for using the archives and write a new
> set of interfaces to support that.

If we gave you the code we currently have, would someone else in this
forum be willing to redesign it?  Eventually it would become part of
the Python distribution.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@pythonware.com  Sat Dec  4 09:54:30 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sat, 4 Dec 1999 10:54:30 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>             <38479573.B2CFDD2B@lemburg.com>  <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com>
Message-ID: <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com>

M.-A. Lemburg <mal@lemburg.com> wrote:
> >     given a name, check if you have a
> >     module with that name, and install
> >     it.  if you cannot, fail (in which case
> >     python asks the next importer along
> >     the path).
> > 
> > why do you have to complicate things beyond that?
> > why not just let Python provide a few base classes
> > and mixins for people who want to create custom
> > importers, and be done with it?
> 
> Because importing in Python has become *much* more
> complicated over time. There are requests for new
> features which touch subjects such as storage mechanisms,
> lookups, signatures (for trusted code), lazy imports, etc.

sorry, I still don't understand it.  our applications already
use different storage mechanisms, databases, signatures,
lazy importing, version handling, etc, etc.  now, if *we*
have managed to build all that on top of an old version
of imputil.py, how come it's not sufficient for the rest
of you?

> A chain of simple minded importers won't work together
> too well

why?  it sure works for us...

> duplicate work

avoiding duplicate work is what object oriented design
is all about.  and last time I checked, Python had excellent
support for that.

> and downgrade performance considerably due to the
> many recursive function calls

now that's what I call premature optimization.  and this
scares the hell out of me: if the rest of the python-dev
crowd don't seriously believe that Python is (or can be
made) fast enough to implement things like this, why
the heck are you using Python at all?  am I the only
one here who doesn't believe in ousterhout's talk about
"the great system vs. scripting language divide"?

</F>


From fredrik@pythonware.com  Sat Dec  4 09:54:42 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sat, 4 Dec 1999 10:54:42 +0100
Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT]
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com> <3847D766.1E5FFAF3@interet.com>
Message-ID: <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com>

James C. Ahlstrom <jim@interet.com> wrote:
> IMHO putting shared libs in an archive is a bad idea because the OS
> can not use them there.  They must be extracted as you say.  But then
> storage is wasted by using space in the archive and the external file.
> Deleting them after use wastes time.  Better to leave them out of the
> archive and provide for them in the installer.  IMHO the
> archive is a basic simple feature, and people make installers on top
> of that.  Archives shouldn't try to do it all.

have you tried it?  if not, why do you think you should
be allowed to forbid others from doing it?

in "the inmates are running the asylum", alan cooper
points out that the *major* reason people all over the
world love web applications is that there are no
bloody installers.  and here you are advocating that
we all should be forced to use installers, when python
makes it trivial to write self-installing apps. double-argh!

(on the other hand, why do I complain? all pythonworks
customers are going to be able to do all this anyway...).

<rant size="major">

frankly, this "design by committee" (or is it "design by
people who've never even been close to implementing
something because they thought it was too hard, and
thus think they're qualified to argue against those of
us who didn't even realize that it was a hard problem"?)
trend I've been seeing in all kinds of python forums
makes me sooooo sad.  the more of this I see (dist-
utils-sig, doc-sig, here, c.l.python), the sadder I get,
and the more I sympathise with John Skaller who's
defining his own python-like universe...

if someone needs me, I'll be down in the pub having
a beer with the mad scientist, the shiny eff-bot, and
mr. nitpicker.  if we're not there, you'll find us in the
lab, working on new string matching facilities for 1.6,
SOAP [1], tkinter replacements for the masses, and
whatever else we can come up with...  see you!

</rant>

1) http://www.newsalert.com/bin/story?StoryId=Coenz0bWbu0znmdKXqq


From gstein@lyra.org  Sat Dec  4 10:42:27 1999
From: gstein@lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 02:42:27 -0800 (PST)
Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order)
In-Reply-To: <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com>
Message-ID: <Pine.LNX.4.10.9912040232240.18529-100000@nebula.lyra.org>

On Sat, 4 Dec 1999, Fredrik Lundh wrote:
> M.-A. Lemburg <mal@lemburg.com> wrote:
>...
> > Because importing in Python has become *much* more
> > complicated over time. There are requests for new
> > features which touch subjects such as storage mechanisms,
> > lookups, signatures (for trusted code), lazy imports, etc.
> 
> sorry, I still don't understand it.  our applications already
> use different storage mechanisms, databases, signatures,
> lazy importing, version handling, etc, etc.  now, if *we*
> have managed to build all that on top of an old version
> of imputil.py, how come it's not sufficient for the rest
> of you?

I agree. The imputil mechanism has been proven in combat to work for many
scenarios. I have not (yet) heard of a case where the model has proven
insufficient.

> > A chain of simple minded importers won't work together
> > too well
> 
> why?  it sure works for us...

Exactly. "Why?" Please provide an example.

>...
> > and downgrade performance considerably due to the
> > many recursive function calls
> 
> now that's what I call premature optimization.  and this
> scares the hell out of me: if the rest of the python-dev
> crowd don't seriously believe that Python is (or can be
> made) fast enough to implement things like this, why
> the heck are you using Python at all?  am I the only
> one here who doesn't believe in ousterhout's talk about
> "the great system vs. scripting language divide"?

Don't worry Fredrik... I'm with you on this one. I do not believe there is
a problem with the speed. Nobody has yet profiled imputil to find out
where/how the time is being spent. Nobody has tried to speed it up.
Therefore, any claims about its performance are simply FUD.

I claim that its interface is correct, and you (Fredrik) stated it well:
"given a name, please give me a module if you can (otherwise None)."

Underneath that semantic, there are a lot of things that can be done to
alter the performance and organization. Claims about speed are entirely
premature.
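
A minimal sketch of that contract, written in present-day Python rather
than against imputil's actual API.  The names DictImporter, get_module
and import_from_chain are hypothetical and exist only to illustrate the
"give me a module or None" semantic.

```python
import types


class DictImporter:
    """Hypothetical importer: maps module names to source strings."""

    def __init__(self, sources):
        self.sources = sources

    def get_module(self, name):
        # The whole contract: return a module for `name`, or None.
        source = self.sources.get(name)
        if source is None:
            return None
        module = types.ModuleType(name)
        exec(source, module.__dict__)
        return module


def import_from_chain(importers, name):
    """Ask each importer in turn; the first non-None answer wins."""
    for importer in importers:
        module = importer.get_module(name)
        if module is not None:
            return module
    raise ImportError("no importer could supply %r" % name)


chain = [
    DictImporter({"spam": "value = 1"}),
    DictImporter({"spam": "value = 2", "eggs": "value = 3"}),
]
```

Here import_from_chain(chain, "spam") is answered by the first importer
and "eggs" falls through to the second.  How each importer finds its
module stays hidden behind the one method.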

Yes, I'm biased. But, in truth, I haven't seen a better mechanism yet.
I've tossed out a few ideas on how imputil could be improved (which are
solely based on guess, rather than empirical evidence of profiling
output). When those changes are completed and there is still an issue,
then I'll admit defeat and wait for somebody else to provide a new design.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From Vladimir.Marangozov@inrialpes.fr  Sat Dec  4 11:15:53 1999
From: Vladimir.Marangozov@inrialpes.fr (Vladimir Marangozov)
Date: Sat, 4 Dec 1999 12:15:53 +0100 (CET)
Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT]
In-Reply-To: <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com> from "Fredrik Lundh" at Dec 04, 1999 10:54:42 AM
Message-ID: <199912041115.MAA00539@python.inrialpes.fr>

Fredrik Lundh wrote:
> 
[snip]
> 
> <rant size="major">
> 
> frankly, this "design by committee"...
[snip]
> ...  see you!
> 
> </rant>
> 

C'mon /F, it's a battle of ideas and that's the way it works before
filtering the good ones from the bad ones, then focusing on the
appropriate implementation.

I'm in sync with the discussion, although I haven't posted my partial
notes on it due to lack of time. But let me say that overall, this
discussion is a good thing and the more opinions we get, the better.

BTW, you just _can't_ leave like this and start playing solitaire at
the bar, first, because we need beer too and it's unlikely that you'll
find a bar we don't know already, and second, because it was you who
revived this discussion with 1 word, repeated 3 times:

> Subject: Re: [Python-Dev] Python 1.6 status
> Date: Wed, 17 Nov 1999 12:46:01 +0100
> 
> Guido van Rossum <guido@CNRI.Reston.VA.US> wrote:
> > - suggestions for new issues that maybe ought to be settled in 1.6
> 
> three things: imputil, imputil, imputil
> 
> </F>

Thus, with no visible argumentation (so don't shoot at others when they
argue instead of you), and with this one word, you pushed Guido to the
extreme of suggesting a complete redesign of the import machinery from
scratch, based on a "Grand Architecture" :-). Right? -- Right!

This is a fact and a fair amount of the credit goes entirely to you!

Since then, however, I haven't really seen your arguments, and I believe
that nobody here got exactly your point. I, for one, may well argue
against imputil as being just another brick on top of the grand mess.
But because I haven't made the time to write up my notes properly, I don't
dare to express a partial opinion, nor blame those who argue good or
bad in the meantime, when I'm silent.

So, why are you showing us your back when you have clearly something
to say, but like me, you haven't made the time to say it?  Please don't
waste my time with emotional rants ;-). Everybody here tries to contribute
according to their knowledge, experience and availability.

Later,
-- 
       Vladimir MARANGOZOV          | Vladimir.Marangozov@inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252


From mal@lemburg.com  Sat Dec  4 10:45:52 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 04 Dec 1999 11:45:52 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>             <38479573.B2CFDD2B@lemburg.com>  <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com>
Message-ID: <3848F0E0.B8132AD2@lemburg.com>

Fredrik Lundh wrote:
> 
> M.-A. Lemburg <mal@lemburg.com> wrote:
> > >     given a name, check if you have a
> > >     module with that name, and install
> > >     it.  if you cannot, fail (in which case
> > >     python asks the next importer along
> > >     the path).
> > >
> > > why do you have to complicate things beyond that?
> > > why not just let Python provide a few base classes
> > > and mixins for people who want to create custom
> > > importers, and be done with it?
> >
> > Because importing in Python has become *much* more
> > complicated over time. There are requests for new
> > features which touch subjects such as storage mechanisms,
> > lookups, signatures (for trusted code), lazy imports, etc.
> 
> sorry, I still don't understand it.  our applications already
> use different storage mechanisms, databases, signatures,
> lazy importing, version handling, etc, etc.  now, if *we*
> have managed to build all that on top of an old version
> of imputil.py, how come it's not sufficient for the rest
> of you?

I've tried to get (an older) imputil.py version up and running
too. It did work, but only after some considerable tweaking
and even with integrated cache mechanisms did not reach
the performance of the builtin importer (which doesn't
use the kinds of caching strategies I had built into
imputil.py). Getting the whole setup to work wasn't easy
at all, because of the way imputil importers delegate work
and things get even more confusing when it starts to "take
over" certain parts of packages by installing themselves
as importers for a particular package.
 
> > A chain of simple minded importers won't work together
> > too well
> 
> why?  it sure works for us...

An example: 

A path importer knows how to scan directories and how to use
a path to tell the correct order. It can maybe also import
.py/.pyc/.pyo files. Now what happens if it finds a shared
lib as module... the usual imputil way would be to delegate
the request to some other importer which can handle shared
libs... but wait: how does the shared lib importer know
where to look ? It will have to rescan the directories,
etc...
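
The duplicated scan can be made visible with a small sketch.  This is
present-day Python, not imputil code: SourceImporter, SharedLibImporter
and find_in_chain are hypothetical names, and each importer is assumed
to know only its own file suffix and nothing about what the previous
importer already saw.

```python
import os
import tempfile

scan_count = 0


def scan(directory):
    """List a directory, counting how often we hit the filesystem."""
    global scan_count
    scan_count += 1
    return set(os.listdir(directory))


class SourceImporter:
    """Hypothetical importer that only understands .py files."""

    def find(self, path, name):
        for directory in path:
            if name + ".py" in scan(directory):
                return os.path.join(directory, name + ".py")
        return None


class SharedLibImporter:
    """Hypothetical importer that only understands .so files."""

    def find(self, path, name):
        # No knowledge of the earlier scan, so walk the path again.
        for directory in path:
            if name + ".so" in scan(directory):
                return os.path.join(directory, name + ".so")
        return None


def find_in_chain(importers, path, name):
    for importer in importers:
        location = importer.find(path, name)
        if location is not None:
            return location
    return None


with tempfile.TemporaryDirectory() as d1, tempfile.TemporaryDirectory() as d2:
    open(os.path.join(d2, "fast.so"), "w").close()
    found = find_in_chain([SourceImporter(), SharedLibImporter()],
                          [d1, d2], "fast")
    found_name = os.path.basename(found)
```

With two directories on the path and the module sitting in the second
one as a shared lib, the chain lists directories four times where a
single pass over the path would have needed two.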
 
> > duplicate work
> 
> avoiding duplicate work is what object oriented design
> is all about.  and last time I checked, Python had excellent
> support for that.

See my example above.

The agent approach used by imputil does not support
OO design too well: even though you can avoid duplicate
programming work on the importers by using a few
base classes which implement dir scans, shared lib
imports, etc. the imputil design does not provide
means to avoid duplicate actions taken by the importers.

> > and downgrade performance considerably due to the
> > many recursive function calls
> 
> now that's what I call premature optimization.  and this
> scares the hell out of me: if the rest of the python-dev
> crowd don't seriously believe that Python is (or can be
> made) fast enough to implement things like this, why
> the heck are you using Python at all?  am I the only
> one here who doesn't believe in ousterhout's talk about
> "the great system vs. scripting language divide"?

Looks like you are in ranting mode here ;-) Seriously,
I've checked my imputil.py version (with caches enabled)
against the builtin importer and noticed a performance
slowdown by a factor of more than 2. This was enough to convince me
to look for other techniques to handle the problems
I had at the time... you know, relative imports and things.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    27 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Sat Dec  4 11:04:15 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 04 Dec 1999 12:04:15 +0100
Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT]
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com> <3847D766.1E5FFAF3@interet.com> <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com>
Message-ID: <3848F52F.5F5B748F@lemburg.com>

Fredrik Lundh wrote:
> 
> <rant size="major">
> 
> frankly, this "design by committee" (or is it "design by
> people who've never even been close to implementing
> something because they thought it was too hard, and
> thus think they're qualified to argue against those of
> us who didn't even realize that it was a hard problem"?)

Huh ? Two points:

1. How can you be sure that people haven't tried
   implementing their ideas and for various reasons
   have come to some conclusion about those ideas ?

2. Would you seriously disqualify people from joining a
   discussion by the simple argument that they
   have not implemented anything yet ?

Just take the Unicode discussion as example: it was
very lively and resulted in a decent proposal which
is now subject to further investigation by the
implementors ;-) Many people have joined in even though
they did not and/or will not implement anything. Still,
their arguments were very useful to show up weaknesses
in the proposal.

Now, let's rather have a beer in the pub around the corner
than go on ranting about :-).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    27 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Sat Dec  4 11:53:33 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 04 Dec 1999 12:53:33 +0100
Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order)
References: <Pine.LNX.4.10.9912040232240.18529-100000@nebula.lyra.org>
Message-ID: <384900BD.D16E72BC@lemburg.com>

Greg Stein wrote:
> > > [me:]
> > > A chain of simple minded importers won't work together
> > > too well
> >
> > why?  it sure works for us...
> 
> Exactly. "Why?" Please provide an example.

See my reply to Fredrik.
 
> >...
> > > and downgrade performance considerably due to the
> > > many recursive function calls
> >
> > now that's what I call premature optimization.  and this
> > scares the hell out of me: if the rest of the python-dev
> > crowd don't seriously believe that Python is (or can be
> > made) fast enough to implement things like this, why
> > the heck are you using Python at all?  am I the only
> > one here who doesn't believe in ousterhout's talk about
> > "the great system vs. scripting language divide"?
> 
> Don't worry Fredrik... I'm with you on this one. I do not believe there is
> a problem with the speed. Nobody has yet profiled imputil to find out
> where/how the time is being spent. Nobody has tried to speed it up.

Sorry, Greg, but that is simply not true. I've spent a few
days on trying to get more performance out of it and have
succeeded, but in the end it wasn't enough to convince me
of the approach.

> Therefore, any claims about its performance are simply FUD.

BTW, did anybody mention that an import manager  wouldn't
be able to provide an API which is useable for imputil
style importers ? I'm not arguing against the possibility
to use imputil style importers, just against making it the
sole method of adding wisdom to Python imports.

The imputil importers could well benefit from a manager
providing logic to do basic things like importing
shared libs, checking signatures, downloading modules
from the web, etc.
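
A rough sketch of what such a manager might look like, in present-day
Python.  The ImportManager class, its register() method and the
suffix-keyed handler table are all hypothetical; they are one possible
reading of "add-ons register themselves", not a proposed API.

```python
import os
import tempfile


class ImportManager:
    """Hypothetical manager: scans once, then delegates by suffix."""

    def __init__(self, path):
        self.path = path
        self.handlers = {}       # suffix -> callable(location)
        self.listing_cache = {}  # directory -> set of file names

    def register(self, suffix, handler):
        """An add-on announces which kind of file it can handle."""
        self.handlers[suffix] = handler

    def _listing(self, directory):
        # Central cache: every handler shares the same directory scan.
        if directory not in self.listing_cache:
            self.listing_cache[directory] = set(os.listdir(directory))
        return self.listing_cache[directory]

    def find(self, name):
        for directory in self.path:
            names = self._listing(directory)
            for suffix, handler in self.handlers.items():
                if name + suffix in names:
                    # The handler is told where the file is, so it
                    # never has to search for it again.
                    return handler(os.path.join(directory, name + suffix))
        return None


with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "fast.so"), "w").close()
    manager = ImportManager([d])
    manager.register(".py", lambda location: ("source", location))
    manager.register(".so", lambda location: ("shared lib", location))
    kind, location = manager.find("fast")
    location_name = os.path.basename(location)
    cached_dirs = len(manager.listing_cache)
```

The handlers here only report what they were given.  Real ones would
load the module, check a signature, and so on.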

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    27 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From gstein@lyra.org  Sat Dec  4 12:15:13 1999
From: gstein@lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 04:15:13 -0800 (PST)
Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order)
In-Reply-To: <384900BD.D16E72BC@lemburg.com>
Message-ID: <Pine.LNX.4.10.9912040402120.18529-100000@nebula.lyra.org>

On Sat, 4 Dec 1999, M.-A. Lemburg wrote:
>...
> > Don't worry Fredrik... I'm with you on this one. I do not believe there is
> > a problem with the speed. Nobody has yet profiled imputil to find out
> > where/how the time is being spent. Nobody has tried to speed it up.
> 
> Sorry, Greg, but that is simply not true. I've spent a few
> days on trying to get more performance out of it and have
> succeeded, but in the end it wasn't enough to convince me
> of the approach.

You sent me your changes... I don't believe that you were aggressive
enough. As I've mentioned before, I think it is quite possible to retain
the general Importer style and get_code() interface, but to shift some
functionality out (to be computed once) to a higher-level mechanism. The
patches that you sent me did not do this, so I'm not surprised that you
hit a wall.

Ack. See? Now I'm getting into discussions about performance and
implementation without truly knowing where the timing is spent. Eyeballing
it, I have an idea, but it would be best to see a profile output. My
mantra is always "90% of the time you're wrong about where 90% of the time
is being spent."
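
Getting that profile output is not much work.  A minimal sketch, using
the cProfile and pstats modules of present-day Python (they are an
assumption here, not what was available in 1.5.2); the helper name
profile_import is hypothetical, and "json" is just an arbitrary module
to import.

```python
import cProfile
import io
import pstats


def profile_import(module_name):
    """Profile a single import and return the report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    __import__(module_name)
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Cumulative time shows which call chains dominate the import.
    stats.sort_stats("cumulative").print_stats(10)
    return out.getvalue()


report = profile_import("json")
```

If the module is already in sys.modules the report will be nearly
empty, so a meaningful measurement needs a fresh interpreter or a
module that has not been imported yet.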

I am unconcerned about performance, but will work on it so that I don't
need to continue this conversation. That burden is on me.

> > Therefore, any claims about its performance are simply FUD.
> 
> BTW, did anybody mention that an import manager  wouldn't
> be able to provide an API which is useable for imputil
> style importers ? I'm not arguing against the possibility
> to use imputil style importers, just against making it the
> sole method of adding wisdom to Python imports.

Since the core will delegate out to Python (note: current working theory),
then it certainly is not the "sole method" (since you can just replace the
Python code). But there must be a default mechanism.

The ihooks stuff was too complicated. imputil seems to be much easier. I'd
love to see a third mechanism.... so I can steal ideas :-)

> The imputil importers could well benefit from a manager
> providing logic to do basic things like importing
> shared libs, checking signatures, downloading modules
> from the web, etc.

For shared libs, yes. For the others: geez... I don't want to see that in
the core infrastructure. Shift that out to specialized Importers. The
infrastructure ought to be teeny and agnostic about how to map a module
name to a module.


Side note to python-dev people: I apologize... I realize that I'm
beginning to get a bit defensive here. I'm going to be at XML '99 until
Friday, so that should give me a breather. When I get back, I'll skip the
talk and do some code.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein@lyra.org  Sat Dec  4 12:32:04 1999
From: gstein@lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 04:32:04 -0800 (PST)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <3848F0E0.B8132AD2@lemburg.com>
Message-ID: <Pine.LNX.4.10.9912040416220.18529-100000@nebula.lyra.org>

On Sat, 4 Dec 1999, M.-A. Lemburg wrote:
> Fredrik Lundh wrote:
>...
> > sorry, I still don't understand it.  our applications already
> > use different storage mechanisms, databases, signatures,
> > lazy importing, version handling, etc, etc.  now, if *we*
> > have managed to build all that on top of an old version
> > of imputil.py, how come it's not sufficient for the rest
> > of you?
> 
> I've tried to get (an older) imputil.py version up and running
> too. It did work, but only after some considerable tweaking
> and even with integrated cache mechanisms did not reach
> the performance of the builtin importer (which doesn't
> use the kinds of caching strategies I had built into
> imputil.py).

1) yes, it was an older version and did not have the PathImporter class.
   As a by-product, the DirectoryImporters that it *did* have were much
   slower. It still did not support builtins, frozen modules, or dynamic
   loads. All of that is present now, so it works "out of the box" much
   better.

2) Performance: as I wrote in the other email, I don't believe that is an
   argument against the design. The imputil approach *will* be slower than
   the current Python mechanism, but there is some more coding to do to
   truly see how much. The side benefits (e.g. ZipImporter and caching)
   may outweigh the result. Time will tell.

> Getting the whole setup to work wasn't easy
> at all, because of the way imputil importers delegate work
> and things get even more confusing when it starts to "take
> over" certain parts of packages by installing themselves
> as importers for a particular package.

I don't understand this. If it is relevant, then please expand. Thx.

> > > A chain of simple minded importers won't work together
> > > too well
> > 
> > why?  it sure works for us...
> 
> An example: 
> 
> A path importer knows how to scan directories and how to use
> a path to tell the correct order. It can maybe also import
> .py/.pyc/.pyo files. Now what happens if it finds a shared
> lib as module... the usual imputil way would be to delegate
> the request to some other importer which can handle shared
> libs... but wait: how does the shared lib importer know
> where to look ? It will have to rescan the directories,
> etc...

No, the "usual imputil way" is that the PathImporter understands searching
a path and loading stuff from that path. An Importer is a combination of
locating and loading (since they are, typically, tightly bound). The next
rev will allow user-plugging of support for new file types.

> > > duplicate work
> > 
> > avoiding duplicate work is what object oriented design
> > is all about.  and last time I checked, Python had excellent
> > support for that.
> 
> See my example above.
> 
> The agent approach used by imputil does not support
> OO design too well: even though you can avoid duplicate
> programming work on the importers by using a few
> base classes which implement dir scans, shared lib
> imports, etc. the imputil design does not provide
> means to avoid duplicate actions taken by the importers.

There is always a balance to be struck between independence and coupling.
I chose to reduce coupling and increase independence. If you shift a bunch
of stuff out of the Importers, then you will increase the coupling between
the imputil framework and the Importers. That coupling will then close off
future possibilities.

Within the framework itself (e.g. between _import_hook and get_code),
there is a lot of opportunity for change. Since that is behind the covers,
it is no big deal to shift functionality around. I plan to do so.

>...
> Looks like you are in ranting mode here ;-) Seriously,
> I've checked my imputil.py version (with caches enabled)
> against the builtin importer and noticed a performance
> downgrade by factor >2. This was enough to convince me
> of looking for other techniques to handle the problems
> I had at the time... you know, relative imports and things.

I have run a long series of tests. Without doing any performance work on
imputil, the ratio is 9 to 13. The 13 may have bumped up to about 15 or 16
when I added some dynamic loading code (I forget). Regardless, it is
definitely less than a 2X increase. And that is with zero optimization.

*shrug*

I'm done. I'll do some code in a couple weeks.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein@lyra.org  Sat Dec  4 13:12:32 1999
From: gstein@lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 05:12:32 -0800 (PST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912031439.JAA16524@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912040456180.18529-100000@nebula.lyra.org>

On Fri, 3 Dec 1999, Guido van Rossum wrote:
>...
> Great response.  I think we know where we each stand.  Please go ahead
> with a new design.  (That's trust, not carte blanche.)

Accepted gratefully. Thx.

> Just one thought: the more I think about it, the less I like
> sys.importers: functionality which is implemented through
> sys.importers must necessarily be placed either in front of all of
> sys.path or after it.  While this is helpful for "canned" apps that
> want *everything* to be imported from a fixed archive, I think that
> for regular Python installations sys.path should remain the point of
> attack.  In particular, installing a new package (e.g. PIL) should
> affect sys.path, regardless of the way of delivery of the modules
> (shared libs, .py files, .pyc files, or a zip archive).

Okay. I'll design with respect to this model.

To be explicit/clear and to be sure I'm hearing you right: sys.path may
contain Importer instances. Given the name FOO, the system will step
through sys.path looking for the first occurrence of FOO (looking in a
directory or delegating). FOO may be found with any number of
(configurable) file extensions, which are ordered (e.g. ".so" before
".py" before ".isl").

> I'm not too worried about code that inspects sys.path and expects
> certain invariants; that code is most likely interfering with the
> import mechanism so should be revisited anyway.

The Benevolent Dictator has spoken. So be it.

:-)

> On the lone .pyc issue: I'd like to see this disappear when using the
> filesystem, I see no use for it there if we support .pyc files in zip
> archives.

No problem. This actually creates a simplification in the system, as I'm
seeing it now. I'm also seeing opportunities for a code reorg which may
work towards MAL's issues with performance.

I hope to have something in two or three weeks. I also hope people can be
patient :-), but I certainly wouldn't mind seeing some alternative code!

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gmcm@hypernet.com  Sat Dec  4 14:59:44 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Sat, 4 Dec 1999 09:59:44 -0500
Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order)
In-Reply-To: <384900BD.D16E72BC@lemburg.com>
Message-ID: <1267803104-11215142@hypernet.com>

M.-A. Lemburg wrote:
> Greg Stein wrote:

> > Don't worry Fredrik... I'm with you on this one. I do not
> > believe there is a problem with the speed. Nobody has yet
> > profiled imputil to find out where/how the time is being spent.
> > Nobody has tried to speed it up.
> 
> Sorry, Greg, but that is simply not true. I've spent a few
> days on trying to get more performance out of it and have
> succeeded, but in the end it wasn't enough to convince me
> of the approach.
 
Remember those comparisons of Perl and Python, to which 
you added cgipython? I've added to the list a version that uses 
an old version of imputil (probably the one you optimized) and 
a compressed std lib. Note that my Linux python (1.5.2) is 
built in the RedHat style - even struct and strop are .so's; so 
that accounts for the majority of the open calls. This is a full 
Python (runs code.py if you don't pass it a script name). For 
lack of a better name, I've called it "pykit".

 First, the size of log files (in lines), i.e. number of system 
calls:
 
                Solaris     Linux    IRIX[1]
   Perl              88        85      70
   Python           425       316     257
   cgipython                  182 
   pykit                      136

 Next, the number of "open" calls:

                Solaris     Linux    IRIX
   Perl             16         10       9
   Python          107         71      48
   cgipython                   33 
   pykit                        9

 And the number of unsuccessful "open" calls:
 
                Solaris     Linux    IRIX
   Perl              6          1       3
   Python           77         49      32
   cgipython                   28
   pykit                        2
 
 Number of "mmap" calls:
 
                Solaris     Linux    IRIX
   Perl              25        25       1
   Python            36        24       1
   cgipython                   13
   pykit                       21

This test would show off more if it went beyond startup. An 
import of a standard lib module in my stock Python involves 2 
failed stats and 6 failed opens, then 2 successful opens and 2 
fstats before the module is loaded. None of these occur in 
pykit.

The downside (asking my Importer for a .so or a module not in 
the importer) takes no system calls, and involves a dozen or 
so lines of Python and a check of a dictionary.


- Gordon


From tismer@appliedbiometrics.com  Sat Dec  4 15:29:03 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Sat, 04 Dec 1999 16:29:03 +0100
Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order)
References: <Pine.LNX.4.10.9912040402120.18529-100000@nebula.lyra.org>
Message-ID: <3849333F.1DF2A201@appliedbiometrics.com>


Greg Stein wrote:
...

> My mantra is always "90% of the time you're wrong about where 90% 
> of the time is being spent."

What a great sentence! We all know it, but many of us
(especially me) forget about it during 90% of our coding time.
Much better to spend this on design (as you did).

thanks - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From jim@interet.com  Sat Dec  4 17:27:44 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Sat, 04 Dec 1999 12:27:44 -0500
Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT]
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com> <3847D766.1E5FFAF3@interet.com> <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com>
Message-ID: <38494F10.C644BA7@interet.com>

Fredrik Lundh wrote:
> 
> James C. Ahlstrom <jim@interet.com> wrote:
> > IMHO putting shared libs in an archive is a bad idea because the OS

Dear Fredrik,

I thought the point of Python-Dev was to propose designs and get
feedback, right?  Well, I got feedback :-).

OK, I agree to alter my archive format so it provides the
ability to store shared libs and not just *.pyd.  I will
add the string length and if needed a flag indicating the
name is a shared lib.

Now the details:

> have you tried it?  if not, why do you think you should
> be allowed to forbid others from doing it?

Yes I have tried it, and I am currently on my fourth version
of an archive format which is based on formats by Greg Stein
and Gordon McMillan.  I hope it meets with the favor of the
Grand Inquisition, and becomes the standard format.  But
maybe it won't.  Oh well.

> bloody installers.  and here you are advocating that
> we all should be forced to use installers, when python
> makes it trivial to write self-installing apps. double-argh!

I am not forcing anyone to do anything, only proposing that
shared libs are best handled directly by imputil and not
the class within imputil which handles archive files.  It
is just a geeky design issue, nothing more.

JimA


From jim@interet.com  Sat Dec  4 18:31:48 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Sat, 04 Dec 1999 13:31:48 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>             <38479573.B2CFDD2B@lemburg.com>  <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> <3848F0E0.B8132AD2@lemburg.com>
Message-ID: <38495E14.9C2FB107@interet.com>

"M.-A. Lemburg" wrote:

> An example:
> 
> A path importer knows how to scan directories and how to use
> a path to tell the correct order. It can maybe also import
> .py/.pyc/.pyo files. Now what happens if it finds a shared
> lib as module... the usual imputil way would be to delegate
> the request to some other importer which can handle shared
> libs... but wait: how does the shared lib importer know
> where to look ? It will have to rescan the directories,
> etc...

The above refers to an earlier but still very recent version
of imputil.  On that basis it is perfectly accurate.  Here is
another example from my own experience almost identical to
the above:

One possible archive file format holds its list of archived
*.pyc file names as keys in a dictionary.  This is simple and
efficient, but fails to correctly address the problem of shared
libs (aka DLL's in Windows) with names identical to names of
*.pyc files in the archive.  For example, suppose foo.pyc is in the
archive, and foo.dll is in a directory.  Suppose sys.path is to be
used to decide whether to load foo.pyc or foo.dll.  Then an
"archive importer" will fail to do this.  Specifically you can't
see if foo.pyc is in the archive and then check sys.path, nor can
you do the reverse.  You must call the "archive importer" repeatedly
for each element of sys.path and search the directory at the same time.

JimA


From jim@interet.com  Sat Dec  4 19:51:47 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Sat, 04 Dec 1999 14:51:47 -0500
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9912040456180.18529-100000@nebula.lyra.org>
Message-ID: <384970D3.26A9ECDB@interet.com>

Greg Stein wrote:
> 
> On Fri, 3 Dec 1999, Guido van Rossum wrote:

> > attack.  In particular, installing a new package (e.g. PIL) should
> > affect sys.path, regardless of the way of delivery of the modules
> > (shared libs, .py files, .pyc files, or a zip archive).

> To be explicit/clear and to be sure I'm hearing you right: sys.path may
> contain Importer instances. Given the name FOO, the system will step
> through sys.path looking for the first occurrence of FOO (looking in a
> directory or delegating). FOO may be found with any number of
> (configurable) file extensions, which are ordered (e.g. ".so" before
> ".py" before ".isl").

This is basically a gripe about this design spec.  So if the answer
turns out to be "we need this functionality so shut up" then just
say that and don't flame me.

This spec is painful.  Suppose sys.path has 10 elements, and there
are six file extensions.  Then the simple algorithm is slow:
  for path in sys.path:		# Yikes, may not be a string!
    for ext in file_extensions:
      name = "%s.%s" % (module_name, ext)
      full_path = os.path.join(path, name)
      if os.path.isfile(full_path):
        # Process file here

And sys.path can contain class instances
which only makes things slower.  You could do a readdir() and cache
the results, but maybe that would be slower.  A better
algorithm might be faster, but a lot more complicated.
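
A minimal sketch of the readdir()-and-cache idea mentioned above, offered
only as an illustration. With 10 path entries and six extensions, the simple
loop can make up to 60 os.path.isfile() checks for a module that is not
found; the sketch instead lists each directory once and tests candidate
names against the cached listing. The names DirCache and find_module_file
are hypothetical, non-string path entries are simply skipped, and nothing
here is taken from imputil or from the builtin importer. Whether this is
actually faster is exactly the open question raised in the thread.

```python
import os


class DirCache:
    """Caches one directory listing per path entry (hypothetical helper)."""

    def __init__(self):
        self._listings = {}

    def listing(self, directory):
        # One os.listdir() per directory replaces one check per candidate name.
        if directory not in self._listings:
            try:
                self._listings[directory] = frozenset(os.listdir(directory))
            except OSError:
                # Missing or unreadable directories behave as empty ones.
                self._listings[directory] = frozenset()
        return self._listings[directory]


def find_module_file(module_name, path, extensions, cache):
    """Return the first match, honouring path order, then extension order."""
    for entry in path:
        if not isinstance(entry, str):
            continue  # importer objects on the path are out of scope here
        names = cache.listing(entry)
        for ext in extensions:  # each ext includes its leading dot
            candidate = module_name + ext
            if candidate in names:
                return os.path.join(entry, candidate)
    return None
```

The cache is never invalidated in this sketch, so a file created after the
first lookup in its directory would not be seen; a real implementation would
need some policy for that.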

In the context of archive files, it is also painful.  It prevents
you from saving a single dictionary of module names.  Instead you
must have len(sys.path) dictionaries.  You could try to
save in the archive information about whether (say) a foo.dll was
present in the file system, but the list of extensions is extensible.

The above problem only exists to support equally-named modules; that
is, to support a run-time choice of whether to load foo.pyc, foo.dll,
foo.isl, etc.  I claim (without having written it) that the fastest
algorithm to solve the unique-name case is much faster than the fastest
algorithm to solve the choose-among-equal-names case.

Do we really need to support the equal-name case [Jim runs for
cover...]?  If so, how about inventing a new way to support it?  Maybe
if equal names exist, these must be pre-loaded from a known location?

JimA


From gstein@lyra.org  Sat Dec  4 21:59:00 1999
From: gstein@lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 13:59:00 -0800 (PST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <384970D3.26A9ECDB@interet.com>
Message-ID: <Pine.LNX.4.10.9912041350200.18529-100000@nebula.lyra.org>

On Sat, 4 Dec 1999, James C. Ahlstrom wrote:
> Greg Stein wrote:
>...
> > To be explicit/clear and to be sure I'm hearing you right: sys.path may
> > contain Importer instances. Given the name FOO, the system will step
> > through sys.path looking for the first occurrence of FOO (looking in a
> > directory or delegating). FOO may be found with any number of
> > (configurable) file extensions, which are ordered (e.g. ".so" before
> > ".py" before ".isl").
> 
> This is basically a gripe about this design spec.  So if the answer
> turns out to be "we need this functionality so shut up" then just
> say that and don't flame me.
> 
> This spec is painful.  Suppose sys.path has 10 elements, and there
> are six file extensions.  Then the simple algorithm is slow:
>   for path in sys.path:		# Yikes, may not be a string!
>     for ext in file_extensions:
>       name = "%s.%s" % (module_name, ext)
>       full_path = os.path.join(path, name)
>       if os.path.isfile(full_path):
>         # Process file here

This is the algorithm that Python uses today, and my standard Importers
follow.

> And sys.path can contain class instances
> which only makes things slower.

IMO, we don't know this, or whether it is significant.

> You could do a readdir() and cache
> the results, but maybe that would be slower.  A better
> algorithm might be faster, but a lot more complicated.

Who knows. BUT: the import process is now in Python -- it makes it *much*
easier to run these experiments. We could not really do this when the
import process is "hard-coded" in C code.

> In the context of archive files, it is also painful.  It prevents
> you from saving a single dictionary of module names.  Instead you
> must have len(sys.path) dictionaries.  You could try to
> save in the archive information about whether (say) a foo.dll was
> present in the file system, but the list of extensions is extensible.

I am not following this. What/where is the "single dictionary of module
names" ? Are you referring to a cache? Or is this about building an
archive?

An archive would look just like we have now: map a name to a module. It
would not need multiple dictionaries.

> The above problem only exists to support equally-named modules; that
> is, to support a run-time choice of whether to load foo.pyc, foo.dll,
> foo.isl, etc.  I claim (without having written it) that the fastest
> algorithm to solve the unique-name case is much faster than the fastest
> algorithm to solve the choose-among-equal-names case.
> 
> Do we really need to support the equal-name case [Jim runs for
> cover...]?  If so, how about inventing a new way to support it?  Maybe
> if equal names exist, these must be pre-loaded from a known location?

I don't understand what the problem is. I don't see one. We are still
mapping a name to a module. sys.path defines a precedence.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein@lyra.org  Sun Dec  5 01:17:57 1999
From: gstein@lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 17:17:57 -0800 (PST)
Subject: [Python-Dev] pyc archives (was: .DLL vs .PYD search order)
In-Reply-To: <38495E14.9C2FB107@interet.com>
Message-ID: <Pine.LNX.4.10.9912041713580.18529-100000@nebula.lyra.org>

On Sat, 4 Dec 1999, James C. Ahlstrom wrote:
>...
> One possible archive file format holds its list of archived
> *.pyc file names as keys in a dictionary.  This is simple and
> efficient, but fails to correctly address the problem of shared
> libs (aka DLL's in Windows) with names identical to names of
> *.pyc files in the archive.  For example, suppose foo.pyc is in the
> archive, and foo.dll is in a directory.  Suppose sys.path is to be
> used to decide whether to load foo.pyc or foo.dll.  Then an
> "archive importer" will fail to do this.  Specifically you can't
> see if foo.pyc is in the archive and then check sys.path, nor can
> you do the reverse.  You must call the "archive importer" repeatedly
> for each element of sys.path and search the directory at the same time.

What? The archive is independent of each .pyc's original position in
sys.path. There is no reason/need to carry that information into an
archive.

If the archive contains "foo", then you're done. If it doesn't, then move
on to the next element of sys.path (directory or Importer instance) and
look there.

Basically: if you deploy an archive, then all of its files will take
precedence over any file found later on sys.path. This is exactly what
sys.path is about: establishing precedence.
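
The precedence rule described above can be sketched roughly as follows. This
is an illustration under stated assumptions, not imputil's actual API: the
class ArchiveImporter, its find() method, and the function locate() are
hypothetical names, and the extension list is just the example ordering
mentioned earlier in the thread.

```python
import os


class ArchiveImporter:
    """Hypothetical archive-backed importer: one dict mapping name -> payload."""

    def __init__(self, modules):
        self._modules = dict(modules)

    def find(self, name):
        # A pure dictionary lookup: no file system access at all.
        return self._modules.get(name)


def locate(name, search_path, extensions=(".so", ".py")):
    """Return (entry, result) from the first path entry that supplies `name`."""
    for entry in search_path:
        if isinstance(entry, str):
            # Plain directory entry: try each extension in order.
            for ext in extensions:
                candidate = os.path.join(entry, name + ext)
                if os.path.isfile(candidate):
                    return entry, candidate
        else:
            # Importer instance: delegate, and move on if it has nothing.
            found = entry.find(name)
            if found is not None:
                return entry, found
    return None
```

Placing the importer object earlier or later in the search path is the only
thing that decides whether the archive or a directory wins for a given name.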

If I understand you correctly, then you're trying to say there is some
sort of interleaving that must occur. If so, then I don't understand why.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From fredrik@pythonware.com  Mon Dec  6 12:20:34 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 6 Dec 1999 13:20:34 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>             <38479573.B2CFDD2B@lemburg.com>  <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> <3848F0E0.B8132AD2@lemburg.com> <02eb01bf3f4d$c200d790$f29b12c2@secret.pythonware.com> <384B7E32.F7B81D82@lemburg.com>
Message-ID: <004401bf3fe4$4cab6ea0$f29b12c2@secret.pythonware.com>

> > you obviously attempted to use imputil to implement
> > non-standard import behaviour on top of the standard
> > storage system -- while we've used it to implement
> > standard import behaviour on top of non-standard
> > storage systems.
> 
> No, I tried to make the imputil approach work as replacement
> for the standard builtin importer.

I'm confused.  earlier, you said (or rather, I think you
said) that you looked at imputil to see if it could "handle
the problems you had at the time"...  and now you say
that you tried to use it as a drop-in replacement for the
"standard path importer".  I must be missing something
here...

> After I got that to work, I added some caching
> to avoid duplicated stats. The resulting importer was
> around twice as slow as the builtin one for the following
> imports:
> 
> # the default one Python does at startup, plus:
> from mx import HTMLTools,DateTime,ODBC
> 
> This is a pretty common setup for my scripts, so its
> performance is relevant to me.

did you try stuffing all your PYC's into an archive file,
and running them from there?

</F>


From fredrik@pythonware.com  Sun Dec  5 18:22:57 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sun, 5 Dec 1999 19:22:57 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>             <38479573.B2CFDD2B@lemburg.com>  <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> <3848F0E0.B8132AD2@lemburg.com>
Message-ID: <02eb01bf3f4d$c200d790$f29b12c2@secret.pythonware.com>

> I've checked my imputil.py version (with caches enabled)
> against the builtin importer and noticed a performance
> downgrade by factor >2. This was enough to convince me
> of looking for other techniques to handle the problems
> I had at the time... you know, relative imports and things.

hmm.  I think I see the problem here...

you obviously attempted to use imputil to implement
non-standard import behaviour on top of the standard
storage system -- while we've used it to implement
standard import behaviour on top of non-standard
storage systems.

I don't know if imputil is good enough for the former,
and I don't think I care...  I've spent too many nights
debugging code that relied on clever, non-standard
hacks.

</F>

PS. on the performance side of things, did you know
that 're' can be up to ten times slower than 'regex'?
but people don't complain -- probably because it
allows them to do things they couldn't do before...


From jim@interet.com  Mon Dec  6 19:40:01 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Mon, 06 Dec 1999 14:40:01 -0500
Subject: [Python-Dev] Re: pyc archives (was: .DLL vs .PYD search order)
References: <Pine.LNX.4.10.9912041713580.18529-100000@nebula.lyra.org>
Message-ID: <384C1111.92984B5A@interet.com>

Greg Stein wrote:
> 
> On Sat, 4 Dec 1999, James C. Ahlstrom wrote:
> >...
> > One possible archive file format holds its list of archived
> > *.pyc file names as keys in a dictionary.  This is simple and
> > efficient, but fails to correctly address the problem of shared

> What? The archive is independent of each .pyc's original position in
> sys.path. There is no reason/need to carry that information into an
> archive.
> 
> If the archive contains "foo", then you're done. If it doesn't, then move
> on to the next element of sys.path (directory or Importer instance) and
> look there.
> 
> Basically: if you deploy an archive, then all of its files will take
> precedence over any file found later on sys.path. This is exactly what
> sys.path is about: establishing precedence.

Sorry, I am a little slow today.  My daughter got me up at 6 am to
work on her computer video editor.  No disk space, fragmentation,
2 gig limit on AVI files, ........

Are you saying this?  If foo is imported, the archive importer is
consulted first to see if it can provide foo.  If not, sys.path is
searched for foo.pyc, foo.pyl, etc., and if foo.pyl is found, then
its contents are added to the single archive importer dictionary.
The order of addition to the archive dictionary is determined by
sys.path, and duplicate names are not entered because they lie later
on sys.path.  But once a file is recognized as in an archive, it
effectively precedes all of sys.path.

Or this?  If foo is imported, sys.path is searched for
foo.pyc, foo.pyl, etc., and also all archive files found
at each element of sys.path are searched for foo.  If "bar"
is imported, it may be found in foo.pyl.  That is,
there is an instance of an archive importer for each element
of sys.path.

What if the user names an archive file not on sys.path?  What
order does it have?

JimA


From jim@interet.com  Mon Dec  6 18:34:41 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Mon, 06 Dec 1999 13:34:41 -0500
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9912041350200.18529-100000@nebula.lyra.org>
Message-ID: <384C01C1.8D1AFFFF@interet.com>

Greg Stein wrote:
> 
> On Sat, 4 Dec 1999, James C. Ahlstrom wrote:
> >         # Process file here
> 
> This is the algorithm that Python uses today, and my standard Importers
> follow.

Agreed.
 
> > And sys.path can contain class instances
> > which only makes things slower.
> 
> IMO, we don't know this, or whether it is significant.

Agreed.
 
> > You could do a readdir() and cache
> > the results, but maybe that would be slower.  A better
> > algorithm might be faster, but a lot more complicated.
> 
> Who knows. BUT: the import process is now in Python -- it makes it *much*
> easier to run these experiments. We could not really do this when the
> import process is "hard-coded" in C code.

Agreed.
 
> > In the context of archive files, it is also painful.  It prevents
> > you from saving a single dictionary of module names.  Instead you
> > must have len(sys.path) dictionaries.  You could try to
> > save in the archive information about whether (say) a foo.dll was
> > present in the file system, but the list of extensions is extensible.
> 
> I am not following this. What/where is the "single dictionary of module
> names" ? Are you referring to a cache? Or is this about building an
> archive?
> 
> An archive would look just like we have now: map a name to a module. It
> would not need multiple dictionaries.

The "single dictionary of names" is in the single archive importer
instance and has nothing to do with creating the archive.  It
is currently programmed this way.

Suppose the user specifies by name 12 archive files to be searched.
That is, the user hacks site.py to add archive names to the importer.
The "single dictionary" means that the archive importer takes the 12
dictionaries in the 12 files and merges them together into one dictionary
in order to speed up the search for a name.  The good news is you can
always just call the archive importer to get a module.  The bad news is
you can't do that for each entry on sys.path because there is no
necessary identity between archive files and sys.path.  The user
specified the archive files by name, and they may or may not be on
sys.path, and the user may or may not have specified them in the
same order as sys.path even if they are.

Suppose archive files must lie on sys.path and are processed in order.
Then to find them you must know their name.  But IMHO you want to
avoid doing a readdir() on each element of sys.path and looking for
files *.pyl.

Suppose archive file names in general are the known name "lib.pyl"
for the Python library, plus the names "package.pyl" where "package"
can be the name of a Python package as a single archive file.  Then
if the user tries to import foo, imputil will search along sys.path
looking for foo.pyc, foo.pyl, etc.  If it finds foo.pyl, the archive
importer will add it to its list of known archive files.  But it must
not add it to its single dictionary, because that would destroy the
information about its position along sys.path.  Instead, it must keep
a separate dictionary for each element of sys.path and search the
separate dictionaries under control of imputil.  That is, get_code()
needs a new argument for the element of sys.path being searched.
Alternatively, you could create a new importer instance for each
archive file found, but then you still have multiple dictionaries.
They are in the multiple instances.

All this is needed only to support import of identically named
modules.  If there are none, there is no problem because sys.path
is being used only to find modules, not to disambiguate them.

See also my separate reply to your other post which discusses
this same issue.

JimA


From gstein@lyra.org  Tue Dec  7 00:43:21 1999
From: gstein@lyra.org (Greg Stein)
Date: Mon, 6 Dec 1999 16:43:21 -0800 (PST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <384C01C1.8D1AFFFF@interet.com>
Message-ID: <Pine.LNX.4.10.9912061630260.18926-100000@nebula.lyra.org>

On Mon, 6 Dec 1999, James C. Ahlstrom wrote:
> Greg Stein wrote:
>...
> > I am not following this. What/where is the "single dictionary of module
> > names" ? Are you referring to a cache? Or is this about building an
> > archive?
> > 
> > An archive would look just like we have now: map a name to a module. It
> > would not need multiple dictionaries.
> 
> The "single dictionary of names" is in the single archive importer
> instance and has nothing to do with creating the archive.  It
> is currently programmed this way.

Ah. There is the problem. In Guido's suggestion for the "next path of
inquiry" :-), there is no "single dictionary of names". Instead, you have
Importer instances as items in sys.path. Each instance maintains its
dictionary, and they are not (necessarily) combined.

If we were to combine them, then we would need to maintain the ordering
requirements implied by sys.path. However, this would be problematic if
sys.path changed -- we would have to detect the situation and rebuild a
merged dict.
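
A minimal sketch of that arrangement, for illustration only. The class
name DictImporter and the one-argument get_code() used here are
assumptions, not imputil's actual interface. Each importer keeps its own
name-to-module dictionary, and lookup walks the path in order, so
precedence falls out of list position and nothing has to be rebuilt when
the list changes:

```python
# Hypothetical sketch: one importer object per path entry, each with
# its own private dictionary.  This is not imputil's real interface.

class DictImporter:
    def __init__(self, label, modules):
        self.label = label
        self.modules = dict(modules)   # name -> code; never merged

    def get_code(self, name):
        # A miss costs one failed dictionary lookup.
        return self.modules.get(name)


def find_module(path, name):
    """Return (label, code) from the first importer on `path` that knows `name`."""
    for entry in path:
        if isinstance(entry, str):
            continue                   # plain directories are skipped in this sketch
        code = entry.get_code(name)
        if code is not None:
            return entry.label, code
    return None


path = [
    DictImporter("mine.pyl", {"httplib": "my httplib"}),
    "/usr/lib/python",
    DictImporter("lib.pyl", {"httplib": "std httplib", "string": "std string"}),
]
```

Reordering `path` changes which "httplib" wins without touching either
dictionary.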

> Suppose the user specifies by name 12 archive files to be searched.
> That is, the user hacks site.py to add archive names to the importer.
> The "single dictionary" means that the archive importer takes the 12
> dictionaries in the 12 files and merges them together into one
> dictionary in order to speed up the search for a name.  The good news
> is you can always just call the archive importer to get a module.  The
> bad news is you can't do that for each entry on sys.path because there
> is no necessary identity between archive files and sys.path.  The user
> specified the archive files by name, and they may or may not be on
> sys.path, and the user may or may not have specified them in the
> same order as sys.path even if they are.

The importer must be inserted into sys.path to establish a precedence. If
the user wants to add 12 libraries... fine. But *all* of those modules
will fall under a precedence defined by the Importer's position on
sys.path.

> Suppose archive files must lie on sys.path and are processed in order.
> Then to find them you must know their name.  But IMHO you want to
> avoid doing a readdir() on each element of sys.path and looking for
> files *.pyl.

I do not believe that we will arbitrarily locate and open library files.
They must be specified explicitly.

> Suppose archive file names in general are the known name "lib.pyl"
> for the Python library, plus the names "package.pyl" where "package"
> can be the name of a Python package as a single archive file.  Then
> if the user tries to import foo, imputil will search along sys.path
> looking for foo.pyc, foo.pyl, etc.  If it finds foo.pyl, the archive
> importer will add it to its list of known archive files.  But it must
> not add it to its single dictionary, because that would destroy the
> information about its position along sys.path.  Instead, it must keep
> a separate dictionary for each element of sys.path and search the
> separate dictionaries under control of imputil.  That is, get_code()
> needs a new argument for the element of sys.path being searched.
> Alternatively, you could create a new importer instance for each
> archive file found, but then you still have multiple dictionaries.
> They are in the multiple instances.

If the user installs ".pyl" as a recognized extension (i.e. installs into
the PathImporter), then the above scenario is possible. In my
in-head-design, I had not imagined any state being retained for
extension-recognizer hooks. Of course, state can be retained simply by
using a bound-method for the hook function.

get_code() would not need to change. The foo.pyl would be consulted at the
appropriate time based on where it is found in sys.path. Note that file-
extension hooks would definitely have a complete path to the target file.
Those are not Importers, however (although they will closely follow the
get_code() hook since the extension is called from get_code).

From a pure theoretical standpoint, you can also see that get_code()
should not have a pathname passed -- that would introduce filesystem
semantics into what is otherwise an independent semantic (map name to
module).

More detail: the extension recognizer could certainly retain a cache for
each of the archives that are located. However, the recognizer would be
consulted (by the PathImporter) once for each archive found, in an
ordering defined by sys.path.
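
A rough sketch of the bound-method idea, under stated assumptions: the
hook table, the hook signature and the scan() helper below are all made
up for illustration and are not part of imputil. The point is only that
a bound method carries its instance along, so the recognizer can
remember every archive it was shown, in the order the path search found
them:

```python
# Hypothetical sketch: the hook table and hook signature are assumptions.

class ArchiveRecognizer:
    def __init__(self):
        self.seen = []                 # archives, in the order they were found

    def recognize(self, pathname):
        # Called once per archive found; the state lives on the instance.
        if pathname not in self.seen:
            self.seen.append(pathname)
        return pathname


recognizer = ArchiveRecognizer()
suffix_hooks = {".pyl": recognizer.recognize}   # bound method carries the state


def scan(path_entries, name, exists):
    """Consult the hook for each <entry>/<name><suffix> that exists, in path order."""
    found = []
    for entry in path_entries:
        for suffix, hook in suffix_hooks.items():
            candidate = entry + "/" + name + suffix
            if exists(candidate):
                found.append(hook(candidate))
    return found


on_disk = {"/b/foo.pyl", "/a/foo.pyl"}
hits = scan(["/a", "/b", "/c"], "foo", on_disk.__contains__)
```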

> All this is needed only to support import of identically named
> modules.  If there are none, there is no problem because sys.path
> is being used only to find modules, not to disambiguate them.

But the current (and future) semantics of Python state that you may have
identically named modules, and that sys.path *does* disambiguate them.

In fact, I use this feature all the time -- I use my new httplib.py rather
than the standard library version. I do this by placing the specific
directory "first" in my sys.path.
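
A small demonstration of that precedence rule, written against a modern
Python (importlib and tempfile), so it is a sketch rather than what
anyone ran in 1999. The module name shadow_demo_mod is invented for the
example:

```python
import importlib
import os
import shutil
import sys
import tempfile

NAME = "shadow_demo_mod"   # made-up module name for the demonstration


def write_module(directory, origin):
    with open(os.path.join(directory, NAME + ".py"), "w") as f:
        f.write("ORIGIN = %r\n" % origin)


standard_dir = tempfile.mkdtemp()
private_dir = tempfile.mkdtemp()
write_module(standard_dir, "standard")
write_module(private_dir, "private")

saved_path = list(sys.path)
try:
    sys.path.insert(0, standard_dir)
    sys.path.insert(0, private_dir)     # the earlier entry takes precedence
    importlib.invalidate_caches()
    module = importlib.import_module(NAME)
    origin = module.ORIGIN
finally:
    # Restore the interpreter state and remove the scratch directories.
    sys.path[:] = saved_path
    sys.modules.pop(NAME, None)
    shutil.rmtree(standard_dir)
    shutil.rmtree(private_dir)
```

Both directories hold a module of the same name; the copy in the
directory listed first is the one that gets imported.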

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From tim_one@email.msn.com  Tue Dec  7 05:11:25 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 7 Dec 1999 00:11:25 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <02eb01bf3f4d$c200d790$f29b12c2@secret.pythonware.com>
Message-ID: <001601bf4071$8278cc20$88a0143f@tim>

[/F]
> PS. on the performance side of things, did you know
> that 're' can be up to ten times slower than 'regex'?
> but people don't complain -- probably because it
> allows them to do things they couldn't do before...

Bad example:  people do complain about this.  Those who care a lot continue
to use regex, temporarily pacified by the promise that re.py will get
recoded in C and thus regain a good chunk of regex's speed.  Those who care
a whale of a lot continue to use Perl <0.9 wink>.


From guido@CNRI.Reston.VA.US  Tue Dec  7 12:45:25 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 07 Dec 1999 07:45:25 -0500
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: Your message of "Mon, 06 Dec 1999 16:43:21 PST."
 <Pine.LNX.4.10.9912061630260.18926-100000@nebula.lyra.org>
References: <Pine.LNX.4.10.9912061630260.18926-100000@nebula.lyra.org>
Message-ID: <199912071245.HAA21596@eric.cnri.reston.va.us>

> If we were to combine them, then we would need to maintain the ordering
> requirements implied by sys.path. However, this would be problematic if
> sys.path changed -- we would have to detect the situation and rebuild a
> merged dict.

No need to worry about this: just don't merge the caches.  Compared to
the hundreds of failed open() calls that are done now, it's no big
deal to do 12 failed Python dictionary lookups instead of one.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@pythonware.com  Tue Dec  7 13:25:54 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 7 Dec 1999 14:25:54 +0100
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9912061630260.18926-100000@nebula.lyra.org>
Message-ID: <020a01bf40b6$97dafd00$f29b12c2@secret.pythonware.com>

Greg Stein <gstein@lyra.org> wrote:
> > The "single dictionary of names" is in the single archive importer
> > instance and has nothing to do with creating the archive.  It
> > is currently programmed this way.
> 
> Ah. There is the problem. In Guido's suggestion for the "next path of
> inquiry" :-), there is no "single dictionary of names". Instead, you have
> Importer instances as items in sys.path. Each instance maintains its
> dictionary, and they are not (necessarily) combined.

so the "sys.path contains importers (or strings)" strategy
is now officially sanctioned?  cool!!!

(a quick look in our code base says that this will cause
some trouble, unless os.path.isdir() is modified to reject
non-strings...  after all, if it's not a string, it cannot be
a valid directory path, so this does make some sense ;-)

another aside: can we have a standard mechanism for
listing the contents of a given archive, please?  we have
a lot of "path scanning" stuff (PIL and PST, among others),
and it would be great if things didn't break down if you
stuff it all in an archive.

something like:

    for path in sys.path:
        if os.path.isdir(path):
            files = os.listdir(path)
        else:
            try:
                files = path.listdir()
            except AttributeError:
                files = None
        if files is None:
            # no idea what's in here
            pass
        else:
            # path provides (at least) these modules
            pass

would be really useful.

and yes, it shouldn't have to be mentioned, since squeeze
has done it since early 1997, but archive importers should
provide a standard way to include non-module resources in
the archive, and a standard way to access such resources
as ordinary python streams.

e.g:

    file = path.open(name, "rb")

or something...
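
a rough sketch of what such a path item might look like, assuming a
zip-format archive. the ArchiveEntry class is hypothetical; only the
zipfile calls are real, and the thread had not settled on an archive
format at this point:

```python
import io
import zipfile


class ArchiveEntry:
    """Hypothetical sys.path item backed by a zip archive."""

    def __init__(self, fileobj):
        self._zip = zipfile.ZipFile(fileobj)

    def listdir(self):
        # Names of everything stored in the archive, modules or not.
        return self._zip.namelist()

    def open(self, name, mode="rb"):
        if mode != "rb":
            raise ValueError("this sketch supports only binary reads")
        return self._zip.open(name)     # an ordinary file-like object


# Build a small in-memory archive so the example is self-contained.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as z:
    z.writestr("spam.py", "x = 1\n")
    z.writestr("logo.gif", b"GIF89a")
buffer.seek(0)

entry = ArchiveEntry(buffer)
names = entry.listdir()
with entry.open("logo.gif", "rb") as stream:
    data = stream.read()
```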

</F>


From jim@interet.com  Tue Dec  7 15:20:15 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Tue, 07 Dec 1999 10:20:15 -0500
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9912061630260.18926-100000@nebula.lyra.org> <199912071245.HAA21596@eric.cnri.reston.va.us>
Message-ID: <384D25AF.4C4F5107@interet.com>

Guido van Rossum wrote:

> No need to worry about this: just don't merge the caches.  Compared to
> the hundreds of failed open() calls that are done now, it's no big
> deal to do 12 failed Python dictionary lookups instead of one.

Agreed.

JimA


From jim@interet.com  Tue Dec  7 15:31:30 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Tue, 07 Dec 1999 10:31:30 -0500
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9912061630260.18926-100000@nebula.lyra.org>
Message-ID: <384D2852.3C36C216@interet.com>

Greg Stein wrote:

> Ah. There is the problem. In Guido's suggestion for the "next path of
> inquiry" :-), there is no "single dictionary of names". Instead, you have
> Importer instances as items in sys.path. Each instance maintains its
> dictionary, and they are not (necessarily) combined.

> [A large number of other design issues]

OK, all design issues agreed.  I will make needed changes.

JimA


From jim@interet.com  Tue Dec  7 15:37:36 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Tue, 07 Dec 1999 10:37:36 -0500
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9912061630260.18926-100000@nebula.lyra.org> <020a01bf40b6$97dafd00$f29b12c2@secret.pythonware.com>
Message-ID: <384D29C0.3D3A2194@interet.com>

Fredrik Lundh wrote:

> another aside: can we have a standard mechanism for
> listing the contents of a given archive, please?

I will add this.

> and yes, it shouldn't have to be mentioned, since squeeze
> has done it since early 1997, but archive importers should
> provide a standard way to include non-module resources in
> the archive, and a standard way to access such resources
> as ordinary python streams.

I will add this.

JimA


From gstein@lyra.org  Tue Dec  7 16:53:49 1999
From: gstein@lyra.org (Greg Stein)
Date: Tue, 7 Dec 1999 08:53:49 -0800 (PST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912071245.HAA21596@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912070853230.21367-100000@nebula.lyra.org>

On Tue, 7 Dec 1999, Guido van Rossum wrote:
> > If we were to combine them, then we would need to maintain the ordering
> > requirements implied by sys.path. However, this would be problematic if
> > sys.path changed -- we would have to detect the situation and rebuild a
> > merged dict.
> 
> No need to worry about this: just don't merge the caches.  Compared to
> the hundreds of failed open() calls that are done now, it's no big
> deal to do 12 failed Python dictionary lookups instead of one.

Have no fear... I wasn't planning on this... complicates too much stuff
for too little gain.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From guido@CNRI.Reston.VA.US  Wed Dec  8 12:07:31 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 08 Dec 1999 07:07:31 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Wed, 08 Dec 1999 02:46:02 EST."
 <000201bf4150$46749da0$5aa2143f@tim>
References: <000201bf4150$46749da0$5aa2143f@tim>
Message-ID: <199912081207.HAA00040@eric.cnri.reston.va.us>

[Great analysis, Tim!]

> 4) The audience is Python end-users "in general", and the product is pure
> Python.  I think this is the most important one for Distutils to address,
> and compilation isn't a part of it.  So far, though, what Gordon is doing
> seems more appropriate than what Distutils has been up to.  I hope his work
> gets folded into this.

I'm not sure what stuff by which Gordon you're referring to.  I am
only familiar with his installer, which I thought is win32 only (but
I may be mistaken) and is an installer for a whole application, not
just a bunch of modules.  Please correct me if I'm wrong.

But this reminds me of a different issue, which Jim Ahlstrom has been
hammering about before: there's a completely separate set of cases
where what you are distributing is a stand-alone application, and the
target consists of end users who are entirely uninterested in whether
it's written in Python, C or Elvish.  (And then there's still the
distinction between Win32, Unix or both.)  The current distutils tools
don't deal with this at all.  I think it should though, and I think
its framework is powerful enough to be able to add this, e.g. as a new
"appdist" command.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm@hypernet.com  Wed Dec  8 14:16:07 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Wed, 8 Dec 1999 09:16:07 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <199912081207.HAA00040@eric.cnri.reston.va.us>
References: Your message of "Wed, 08 Dec 1999 02:46:02 EST."             <000201bf4150$46749da0$5aa2143f@tim>
Message-ID: <1267460464-31845181@hypernet.com>

Guido wrote:

> [Great analysis, Tim!]
> 
> > 4) The audience is Python end-users "in general", and the
> > product is pure Python.  I think this is the most important one
> > for Distutils to address, and compilation isn't a part of it. 
> > So far, though, what Gordon is doing seems more appropriate
> > than what Distutils has been up to.  I hope his work gets
> > folded into this.
> 
> I'm not sure what stuff by which Gordon you're referring to.  I
> am only familiar with his installer, which I thought is win32
> only (but I may be mistaken) and is an installer for a whole
> application, not just a bunch of modules.  Please correct me if
> I'm wrong.

It needed a name. I hate the word "Installer", but it expresses 
in one word the most common use of my stuff.

I'll be releasing a beta for Linux real soon. Only some of the 
tricks are Windows only (such as self-extracting executables, 
which is only culturally appropriate on Windows, anyway).

But more importantly it's not just for installing. The Python I 
use (interactively) on my wife's machine is 1 directory with 
about 6 files in it. On my Linux box I've been using the std lib 
in a .pyz for about a month now. Someone distributing a pure 
Python package could instead ship 3 files (imputil.py, 
archive.py and <package>.pyz) with the "install" consisting of 
adding one line to site.py in the user's perfectly normal Python 
installation.

And yeah, I solved the "manifest" problem, too. Mine predates 
Distutils, so don't accuse me of duplicate effort, (I pointed 
them to it a couple times). It uses ConfigParser and a config 
file, so it allows finer control.

While .pyz's are completely cross-platform, I have yet to work 
out endianness issues in the other archive I use (which should 
probably be zip format - it can hold anything). And at the 
"Installer" end, I have yet to work out how things should work 
on non-ELF/COFF platforms (where I can't append the archive 
to the executable). But there aren't any technical issues 
involved; just lack of time.

So no, it's not just for Windows; and no, it's not just for 
creating standalones (though that's what almost everyone 
uses it for).

- Gordon


From guido@CNRI.Reston.VA.US  Wed Dec  8 14:56:42 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 08 Dec 1999 09:56:42 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Wed, 08 Dec 1999 09:16:07 EST."
 <1267460464-31845181@hypernet.com>
References: Your message of "Wed, 08 Dec 1999 02:46:02 EST." <000201bf4150$46749da0$5aa2143f@tim>
 <1267460464-31845181@hypernet.com>
Message-ID: <199912081456.JAA00200@eric.cnri.reston.va.us>

> It needed a name. I hate the word "Installer", but it expresses 
> in one word the most common use of my stuff.
> 
> I'll be releasing a beta for Linux real soon. Only some of the 
> tricks are Windows only (such as self-extracting executables, 
> which is only culturally appropriate on Windows, anyway).
> 
> But more importantly it's not just for installing. The Python I 
> use (interactively) on my wife's machine is 1 directory with 
> about 6 files in it. On my Linux box I've been using the std lib 
> in a .pyz for about a month now. Someone distributing a pure 
> Python package could instead ship 3 files (imputil.py, 
> archive.py and <package>.pyz) with the "install" consisting of 
> adding one line to site.py in the user's perfectly normal Python 
> installation.
> 
> And yeah, I solved the "manifest" problem, too. Mine predates 
> Distutils, so don't accuse me of duplicate effort, (I pointed 
> them to it a couple times). It uses ConfigParser and a config 
> file, so it allows finer control.
> 
> While .pyz's are completely cross-platform, I have yet to work 
> out endianness issues in the other archive I use (which should 
> probably be zip format - it can hold anything). And at the 
> "Installer" end, I have yet to work out how things should work 
> on non-ELF/COFF platforms (where I can't append the archive 
> to the executable). But there aren't any technical issues 
> involved; just lack of time.
> 
> So no, it's not just for Windows; and no, it's not just for 
> creating standalones (though that's what almost everyone 
> uses it for).

Gordon, I'm sorry, but from this description I still have no idea what
your stuff is (and I forgot the URL so I can't look it up).  For
example, if it's not (just) for installing, what *is* it for?

What is the ``"manifest" problem'' and how did you solve it?

Also, note that editing site.py is a no-no!  You can create/edit
sitecustomize.py, but you should leave site.py alone!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm@hypernet.com  Wed Dec  8 16:17:03 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Wed, 8 Dec 1999 11:17:03 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <199912081456.JAA00200@eric.cnri.reston.va.us>
References: Your message of "Wed, 08 Dec 1999 09:16:07 EST."             <1267460464-31845181@hypernet.com>
Message-ID: <1267453215-32281635@hypernet.com>

Guido,
 
> Gordon, I'm sorry, but from this description I still have no idea
> what your stuff is (and I forgot the URL so I can't look it up). 

http://starship.python.org/crew/gmcm/installer.html

The Linux stuff has a couple alpha testers and will probably 
get announced in a week or two.

> For example, if it's not (just) for installing, what *is* it for?
 
At the bottom level, it's a bunch of tools using freeze's 
modulefinder, imputil.py and 2 kinds of archives. There's at 
least 2 layers above that, with "Installer" being the top.  
There's a clean separation between the layers, so you can 
break in wherever you like.

> What is the ``"manifest" problem'' and how did you solve it?

The problem is specifying a set of resources, hopefully without 
having to list them explicitly. I solve this with a config file that 
lets you specify packages, directories, directory trees.. with 
filters that can work from paths, names, extensions, regular 
expressions...
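
To give the flavor, here is a minimal sketch of a config-driven
manifest. The section and option names (include, exclude) are invented
for this example and are not the Installer's actual config format; it
uses today's configparser and fnmatch modules and filters on names only:

```python
import configparser
import fnmatch

# Made-up manifest format: the section and option names are assumptions.
MANIFEST = """
[mypackage]
include = *.py, *.gif
exclude = test_*, *.bak
"""


def split_patterns(text):
    return [p.strip() for p in text.split(",") if p.strip()]


def select(names, manifest_text, section):
    """Return the names matching an include pattern and no exclude pattern."""
    parser = configparser.ConfigParser()
    parser.read_string(manifest_text)
    include = split_patterns(parser.get(section, "include"))
    exclude = split_patterns(parser.get(section, "exclude", fallback=""))
    chosen = []
    for name in names:
        wanted = any(fnmatch.fnmatchcase(name, p) for p in include)
        dropped = any(fnmatch.fnmatchcase(name, p) for p in exclude)
        if wanted and not dropped:
            chosen.append(name)
    return chosen


files = ["core.py", "test_core.py", "logo.gif", "notes.txt", "core.py.bak"]
selected = select(files, MANIFEST, "mypackage")
```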
 
> Also, note that editing site.py is a no-no!  You can create/edit
> sitecustomize.py, but you should leave site.py alone!

That would work fine. One of the standalone configurations will 
write a site.py, but that's for a completely self-contained 
installation (ie, one which will have no conflicts with another 
Python installation). 

I'd also note that, for Windows at least, the path-expanding 
mechanism created by site.py has not caught on. I've got lots 
installed, and no site-python, site-packages or sitecustomize.


- Gordon


From guido@CNRI.Reston.VA.US  Wed Dec  8 16:23:34 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 08 Dec 1999 11:23:34 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Wed, 08 Dec 1999 11:17:03 EST."
 <1267453215-32281635@hypernet.com>
References: Your message of "Wed, 08 Dec 1999 09:16:07 EST." <1267460464-31845181@hypernet.com>
 <1267453215-32281635@hypernet.com>
Message-ID: <199912081623.LAA04119@eric.cnri.reston.va.us>

[me]
> > Also, note that editing site.py is a no-no!  You can create/edit
> > sitecustomize.py, but you should leave site.py alone!

[Gordon]
> That would work fine. One of the standalone configurations will 
> write a site.py, but that's for a completely self-contained 
> installation (ie, one which will have no conflicts with another 
> Python installation). 
> 
> I'd also note that, for Windows at least, the path-expanding 
> mechanism created by site.py has not caught on. I've got lots 
> installed, and no site-python, site-packages or sitecustomize.

You shouldn't see site-python or site-packages, they only exist on
Unix.  On Windows, everything is installed in the top Python
directory.  However you should see .pth files there, which is what
site.py looks for.  I believe NumPy and PIL use those.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm@hypernet.com  Wed Dec  8 16:55:51 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Wed, 8 Dec 1999 11:55:51 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <199912081623.LAA04119@eric.cnri.reston.va.us>
References: Your message of "Wed, 08 Dec 1999 11:17:03 EST."             <1267453215-32281635@hypernet.com>
Message-ID: <1267450887-32421651@hypernet.com>

> [Gordon]
> > That would work fine. One of the standalone configurations will
> > write a site.py, but that's for a completely self-contained
> > installation (ie, one which will have no conflicts with another
> > Python installation). 
> > 
> > I'd also note that, for Windows at least, the path-expanding
> > mechanism created by site.py has not caught on. I've got lots
> > installed, and no site-python, site-packages or sitecustomize.
[Guido] 
> You shouldn't see site-python or site-packages, they only exist
> on Unix.  

You mean "they only exist _for_ Unix", (site.py looks for them 
on Windows). I don't like that. For one thing, modulo a few 
platform differences, the same mechanism should work for 
multi-user Unix and Windows LAN installations. And single-
user Windows (I know, redundant, even on NT) should be a 
degenerate case of the above.

> On Windows, everything is installed in the top Python
> directory.  However you should see .pth files there, which is
> what site.py looks for.  I believe NumPy and PIL use those.

No NumPy, no PIL, no .pth files. 99% of everything out there 
just says "unzip this somewhere on your Python path".

In this case, Jim Ahlstrom may be right - there are too many 
options, or at least an insufficiently emphasized "proper" 
method. Until I worked out my own way of installing stuff, I 
used to lose a large number of packages whenever I upgraded 
my Windows Python.

Much as I love Mark's stuff (and hesitate to criticize crazy 
Aussies), I wish there weren't so much special casing here for 
Windows.

And no, I don't have any solutions to this, I'm just griping...

- Gordon


From guido@CNRI.Reston.VA.US  Wed Dec  8 17:07:30 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 08 Dec 1999 12:07:30 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Wed, 08 Dec 1999 11:55:51 EST."
 <1267450887-32421651@hypernet.com>
References: Your message of "Wed, 08 Dec 1999 11:17:03 EST." <1267453215-32281635@hypernet.com>
 <1267450887-32421651@hypernet.com>
Message-ID: <199912081707.MAA04242@eric.cnri.reston.va.us>

> [Guido] 
> > You shouldn't see site-python or site-packages, they only exist
> > on Unix.  

[Gordon]
> You mean "they only exist _for_ Unix", (site.py looks for them 
> on Windows).

No it doesn't.  The code in site.py only adds site-packages and
site-python when os.sep is '/'.  RTSL.
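
In outline, the logic amounts to the following. This is a simplified
paraphrase for illustration, not the text of site.py; the function name
site_dirs and its parameters are made up here:

```python
import posixpath


def site_dirs(prefix, sep, version="1.5"):
    """Return the directories considered for `prefix`, given the path separator."""
    if sep == "/":
        # Unix-style layout: two extra directories under the prefix.
        return [
            posixpath.join(prefix, "lib", "python" + version, "site-packages"),
            posixpath.join(prefix, "lib", "site-python"),
        ]
    # Elsewhere only the prefix itself is considered.
    return [prefix]
```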

> I don't like that. For one thing, modulo a few 
> platform differences, the same mechanism should work for 
> multi-user Unix and Windows LAN installations. And single-
> user Windows (I know, redundant, even on NT) should be a 
> degenerate case of the above.

What do you mean by "the same mechanism should work"?  The same
mechanism for what?  Are you talking about sharing the installed
files somehow?

> > On Windows, everything is installed in the top Python
> > directory.  However you should see .pth files there, which is
> > what site.py looks for.  I believe NumPy and PIL use those.
> 
> No NumPy, no PIL, no .pth files. 99% of everything out there 
> just says "unzip this somewhere on your Python path".

Fair enough.  Of course I know about .pth files so I unzipped them
elsewhere and added a .pth file pointing there...

> In this case, Jim Ahlstrom may be right - there are too many 
> options, or at least an insufficiently emphasized "proper" 
> method. Until I worked out my own way of installing stuff, I 
> used to lose a large number of packages whenever I upgraded 
> my Windows Python.

The .pth files are designed for this.  Maybe they haven't been
explained as well as they should.
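
A minimal sketch of the mechanism, using site.addsitedir() from a
modern Python and scratch directories in place of a real installation.
The file name mystuff.pth is arbitrary:

```python
import os
import shutil
import site
import sys
import tempfile

top = tempfile.mkdtemp()                 # stands in for the top Python directory
elsewhere = tempfile.mkdtemp()           # where the package was unzipped

# A .pth file is just a list of directories, one per line.
with open(os.path.join(top, "mystuff.pth"), "w") as f:
    f.write(elsewhere + "\n")

saved_path = list(sys.path)
try:
    site.addsitedir(top)                 # processes every *.pth file in `top`
    added = [p for p in sys.path if p not in saved_path]
finally:
    # Restore sys.path and remove the scratch directories.
    sys.path[:] = saved_path
    shutil.rmtree(top)
    shutil.rmtree(elsewhere)
```

Because the package lives outside the Python directory, replacing the
Python installation only requires putting the one-line .pth file back.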

> Much as I love Mark's stuff (and hesitate to criticize crazy 
> Aussies), I wish there weren't so much special casing here for 
> Windows.

It's not Mark's fault, it's Microsoft's fault.  If you don't do things
the way MS wants you to, experienced Windows users will gripe,
misunderstand what you do, etc.

> And no, I don't have any solutions to this, I'm just griping...

Ditto.  Understanding the problems is half of the solution though.
The problems seem pretty complex!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm@hypernet.com  Wed Dec  8 18:25:50 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Wed, 8 Dec 1999 13:25:50 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <199912081707.MAA04242@eric.cnri.reston.va.us>
References: Your message of "Wed, 08 Dec 1999 11:55:51 EST."             <1267450887-32421651@hypernet.com>
Message-ID: <1267445488-32746429@hypernet.com>

[Guido] 
> No it doesn't.  The code in site.py only adds site-packages and
> site-python when os.sep is '/'.  RTSL.

Oops. Missed that.

> > I don't like that. For one thing, modulo a few 
> > platform differences, the same mechanism should work for 
> > multi-user Unix and Windows LAN installations. And single-user
> > Windows (I know, redundant, even on NT) should be a degenerate
> > case of the above.
> 
> What do you mean by "the same mechanism should work"?  The same
> mechanism for what?  Are you talking about sharing the installed
> files somehow?

In the above, "mechanism" basically meant that which creates 
sys.path. 

Basically, this came up for me because in standalone 
configurations (my Installer again), I have to take complete 
control of sys.path. After doing so differently on Windows and 
Linux, I finally realized that I can do it the same way on both.
 
Which makes me question why they are so different.

> The .pth files are designed for this.  Maybe they haven't been
> explained as well as they should.

I'd say "badgered" or "browbeaten" instead of "explained" ;-).
 
> > Much as I love Mark's stuff (and hesitate to criticize crazy
> > Aussies), I wish there weren't so much special casing here for
> > Windows.
> 
> It's not Mark's fault, it's Microsoft's fault.  If you don't do
> things the way MS wants you to, experienced Windows users will
> gripe, misunderstand what you do, etc.

Even MS doesn't do things the way MS says they want you to.

I find MS users equally divided between those who scream 
bloody murder if you touch the registry, and those who 
scream if you don't.

It's not like *nixen suffer from an excessive degree of 
conformity in preferred installation procedures, but somehow 
Python survives there...

> > And no, I don't have any solutions to this, I'm just griping...
> 
> Ditto.  Understanding the problems is half of the solution
> though. The problems seem pretty complex!

Grumpily agreed ;-).


- Gordon


From jim@interet.com  Wed Dec  8 18:33:51 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Wed, 08 Dec 1999 13:33:51 -0500
Subject: [Python-Dev] Linux Journal confirms evil rumor
References: Your message of "Wed, 08 Dec 1999 11:17:03 EST." <1267453215-32281635@hypernet.com>
 <1267450887-32421651@hypernet.com> <199912081707.MAA04242@eric.cnri.reston.va.us>
Message-ID: <384EA48F.F5190180@interet.com>

I finally got around to reading the current Linux
Journal (which just keeps getting better and better)
and lo! there was a picture of a familiar face I just
couldn't quite....

Oh no!  Could it be true?  I heard rumors but I refused to
believe them until now.  The glasses are gone!  Guido now
looks like an investment banker!  The sky is falling!

Next will probably be a Python 1.6 as a 27 Meg DLL, and
a Python IPO.  Well, maybe not.  Now that I look more
closely, he is wearing a black and white and mustard
(??MUSTARD) T-shirt which says "You Need Python".

At least we ought to make him wear a name tag at IPC8.

JimA


From fdrake@acm.org  Wed Dec  8 18:37:44 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 8 Dec 1999 13:37:44 -0500 (EST)
Subject: [Python-Dev] Linux Journal confirms evil rumor
In-Reply-To: <384EA48F.F5190180@interet.com>
References: <1267453215-32281635@hypernet.com>
 <1267450887-32421651@hypernet.com>
 <199912081707.MAA04242@eric.cnri.reston.va.us>
 <384EA48F.F5190180@interet.com>
Message-ID: <14414.42360.309237.967766@weyr.cnri.reston.va.us>

James C. Ahlstrom writes:
 > Oh no!  Could it be true?  I heard rumors but I refused to
 > believe them until now.  The glasses are gone!  Guido now
 > looks like an investment banker!  The sky is falling!

  I'm afraid this non-distinctive look was introduced at IPC7... it's
too bad we can't tell people Python was invented by the guy with the
glasses anymore.

 > Next will probably be a Python 1.6 as a 27 Meg DLL, and
 > a Python IPO.  Well, maybe not.  Now that I look more
 > closely, he is wearing a black and white and mustard
 > (??MUSTARD) T-shirt which says "You Need Python".

  It's really the blue & white & orange IPC7 shirt.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From bwarsaw@cnri.reston.va.us  Wed Dec  8 18:41:51 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw)
Date: Wed, 8 Dec 1999 13:41:51 -0500 (EST)
Subject: [Python-Dev] Linux Journal confirms evil rumor
References: <1267453215-32281635@hypernet.com>
 <1267450887-32421651@hypernet.com>
 <199912081707.MAA04242@eric.cnri.reston.va.us>
 <384EA48F.F5190180@interet.com>
Message-ID: <14414.42607.701538.783684@anthem.cnri.reston.va.us>

>>>>> "JCA" == James C Ahlstrom <jim@interet.com> writes:

    JCA> Oh no!  Could it be true?  I heard rumors but I refused to
    JCA> believe them until now.  The glasses are gone!  Guido now
    JCA> looks like an investment banker!  The sky is falling!

He's not the only one who's, like, "gone corporate", but I won't
mention any names, so as to protect the guilty.


From jim@digicool.com  Wed Dec  8 19:03:42 1999
From: jim@digicool.com (Jim Fulton)
Date: Wed, 08 Dec 1999 14:03:42 -0500
Subject: [Python-Dev] Linux Journal confirms evil rumor
References: <1267453215-32281635@hypernet.com>
 <1267450887-32421651@hypernet.com>
 <199912081707.MAA04242@eric.cnri.reston.va.us>
 <384EA48F.F5190180@interet.com> <14414.42607.701538.783684@anthem.cnri.reston.va.us>
Message-ID: <384EAB8E.EBA595B5@digicool.com>

"Barry A. Warsaw" wrote:
> 
> He's not the only one who's, like, "gone corporate", but I won't
> mention any names, so as to protect the guilty.

OK, Buzz.

Jim

--
Jim Fulton           mailto:jim@digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission.  Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.


From tim_one@email.msn.com  Thu Dec  9 05:31:52 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 9 Dec 1999 00:31:52 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <199912081207.HAA00040@eric.cnri.reston.va.us>
Message-ID: <000301bf4206$b39e5b80$36a2143f@tim>

[Guido]
> [Great analysis, Tim!]

I beg to differ:  it's internally inconsistent and should have identified at
least 3 axes and hence at least 8 cases.  Still, you got more than you paid
for <wink>.

>> 4) The audience is Python end-users "in general", and the
>> product is pure Python.  I think this is the most important one
>> for Distutils to address, and compilation isn't a part of it.
>> So far, though, what Gordon is doing seems more appropriate
>> than what Distutils has been up to.  I hope his work gets folded
>> into this.

> I'm not sure what stuff by which Gordon you're referring to.

You guessed right!

> I am only familiar with his installer, which I thought is win32
> only (but I may be mistaken) and is an installer for a whole
> application, not just a bunch of modules.  Please correct me if
> I'm wrong.

If it can install a whole app, what makes you suspect it couldn't install
just a bunch of modules <0.5 wink>?

It started life as Windows-only, and I believe it's been virtually ignored
by non-Windows folk because of that.  Bad blind spot.  It supplies
already-working approaches to many of the issues that are still being
*talked* about on Distutils (at least archive formats, code to manipulate
same, manifest files (how do you tell the tool which files to package?), and
transparently bundling a Python interpreter when needed).

> But this reminds me of a different issue, which Jim Ahlstrom has
> been hammering about before: there's a completely separate set of
> cases where what you are distributing is a stand-alone application,
> and the target consists of end users who are entirely uninterested
> in whether it's written in Python, C or Elvish.

I include part of that in my case #4 above, where the app happens to be
written in Pure Python -- but the user doesn't have to know that.  Gordon is
addressing at least that part of it.  AFAIK he can't deal with transparently
compiling C or exorcising Elvish on the target platform, but if you're just
distributing the binaries I expect his work is directly usable already.

> (And then there's still the distinction between Win32, Unix or
> both.)

I vote "both".  The world really doesn't need another Win32-only (or
Unix-only) installer, archive format, compression format, or distribution
model.

Jim seems mostly interested in Win32-only to me, and his concerns haven't
been about the mechanics of distribution but about how-- regardless of
tool --to create a bulletproof Python installation by hook or by crook.
Last time we went thru this, it was concluded that one couldn't without
patching the Python Windows binary with a resource editor (to point to its
own infernal <0.5 wink> registry entries).

Distutils hasn't talked about that at all (that I've seen, anyway); if there
were a less radical approach to that, I suspect Jim would be delighted to
use one of the commercial Win32 installation pkgs (and if that's what his
customers expect, delighted or not that's what he'll do).

> The current distutils tools don't deal with this at all.

That's why I said I thought what Gordon is doing seems more appropriate to
case #4 than what Distutils has been doing.

> I think it should though,

Ditto.

> and I think its framework is powerful enough to be able to
> add this, e.g. as a new "appdist" command.

I cordially invite (since Gordon will uncordially browbeat <wink>) people to
look seriously at what he's done.  Best I can tell, for apps that don't need
compilation "on the other end", it's mostly "there" already!

give-the-man-a-hand-ly y'rs  - tim


From tim_one@email.msn.com  Thu Dec  9 05:52:23 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 9 Dec 1999 00:52:23 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <1267453215-32281635@hypernet.com>
Message-ID: <000601bf4209$90a90c80$36a2143f@tim>

> http://starship.python.org/crew/gmcm/installer.html

Eh?  Doesn't work for me.  This does:

    http://starship.python.net/crew/gmcm/distribute.html


From tim_one@email.msn.com  Thu Dec  9 06:38:54 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 9 Dec 1999 01:38:54 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <199912081707.MAA04242@eric.cnri.reston.va.us>
Message-ID: <000701bf4210$10925a40$36a2143f@tim>

[Gordon]
>> Much as I love Mark's stuff (and hesitate to criticize crazy
>> Aussies), I wish there weren't so much special casing here for
>> Windows.

[Guido]
> It's not Mark's fault, it's Microsoft's fault.  If you don't do
> things the way MS wants you to, experienced Windows users will
> gripe, misunderstand what you do, etc.

Something just occurred to me:  MS's guidelines aren't arbitrary, they
actually have very good reasons.  In the case of putting all an app's
crucial info in the Registry, it's the only way to allow a site
administrator to set policy and site options remotely (an admin can fiddle
other machines' registries remotely).  This works very well indeed when
there's only "one copy" of an app on a machine (or at most one copy "per
user").

What just occurred to me is that JimA is concerned with *not* letting any
info from a previously-installed Python affect the app he's installing.
Similarly, Gordon's Win32 "standalone installer" modifies python.exe and
pythonw.exe to use a PYTHONPATH he forces, leaving the registry out of it.
Similarly, the woes I've had in trying to sell Python as a general Win32
scripting tool at work mostly boil down to that there's no effortless way to
do it that doesn't risk picking up info from-- or forcing info
onto --pre-existing or future distinct Python installations (in contrast,
Perl "just works" in this respect).

IOW, the three of us find getting path info out of the registry intolerable
because we are in fact trying to do the opposite of what the registry
mechanism was *designed* for:  we want perfect isolation, not perfect
sharing.

This has come up on Python-Help a few times too, in the guise of someone
installing a product that in turn installs an older version of Python, which
in turn confuses another product that relies on features in a newer version
of Python.

So while the traditional Windows .ini file (like Unix this-or-that.rc file)
model was replaced by the registry for excellent reasons, those reasons
don't apply to the way we're using Python!  The .ini file model was exactly
right for what most of us seem to want to do, and the registry model is
exactly wrong.
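
To make the .ini model concrete, here is a minimal sketch.  It assumes a
hypothetical per-app file sitting in the app's own directory; the section
name, the option name and the /opt/myapp location are invented for
illustration and are not an existing Python convention.

```python
import configparser
import os

# Hypothetical contents of an app.ini that ships inside the app directory.
SAMPLE_INI = """\
[python]
path = lib;app;extensions
"""

def path_from_ini(ini_text, app_dir):
    """Build a module search path from the app's own .ini text only."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    entries = parser.get("python", "path").split(";")
    # Each entry is resolved relative to the app directory, so nothing is
    # picked up from a registry or from another Python installation.
    return [os.path.join(app_dir, e.strip()) for e in entries if e.strip()]

search_path = path_from_ini(SAMPLE_INI, "/opt/myapp")
```

The point of the sketch is the isolation: two apps with two different
.ini files cannot see each other's settings.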

just-thought-i'd-cheer-you-up<wink>-ly y'rs  - tim


From skip@mojam.com (Skip Montanaro)  Thu Dec  9 07:38:36 1999
From: skip@mojam.com (Skip Montanaro)
Date: Thu, 9 Dec 1999 01:38:36 -0600 (CST)
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <000701bf4210$10925a40$36a2143f@tim>
References: <199912081707.MAA04242@eric.cnri.reston.va.us>
 <000701bf4210$10925a40$36a2143f@tim>
Message-ID: <14415.23676.775163.786028@dolphin.mojam.com>

    Tim> So while the traditional Windows .ini file (like Unix
    Tim> this-or-that.rc file) model was replaced by the registry for
    Tim> excellent reasons, those reasons don't apply to the way we're using
    Tim> Python!  The .ini file model was exactly right for what most of us
    Tim> seem to want to do, and the registry model is exactly wrong.

Alright!  Now I understand what all the hubbub is about!  My eyes have
mostly been glazing over trying to follow all this Windows registry/path/ini
stuff.  MS believes that Python is the application.  Those of us writing
Python programs view those programs as the applications, not the Python
interpreter per se.  Is there some way that people writing applications in
Python can set up registry entries that are specific to their application
(e.g. tabnanny.py) instead of only specific to the Python interpreter?

Skip Montanaro | http://www.mojam.com/
skip@mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...


From gmcm@hypernet.com  Thu Dec  9 14:17:27 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Thu, 9 Dec 1999 09:17:27 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <000701bf4210$10925a40$36a2143f@tim>
References: <199912081707.MAA04242@eric.cnri.reston.va.us>
Message-ID: <1267374045-37047016@hypernet.com>

[Guido]
> > It's not Mark's fault, it's Microsoft's fault.  If you don't do
> > things the way MS wants you to, experienced Windows users will
> > gripe, misunderstand what you do, etc.
[Tim] 
> Something just occurred to me:  MS's guidelines aren't arbitrary,
> they actually have very good reasons.  In the case of putting all
> an app's crucial info in the Registry, it's the only way to allow
> a site administrator to set policy and site options remotely (an
> admin can fiddle other machines' registries remotely).  This
> works very well indeed when there's only "one copy" of an app on
> a machine (or at most one copy "per user").

And actually, the business about separate subtrees for the 
machine's configuration and the user's configuration is pretty 
clever. MS doesn't explain it well, and it gets misused, but 
when done right, it's a lot simpler than the maze of .xxxrc files 
you sometimes find in other OSes.
 
> What just occurred to me is that JimA is concerned with *not*
> letting any info from a previously-installed Python affect the
> app he's installing. Similarly, Gordon's Win32 "standalone
> installer" modifies python.exe and pythonw.exe to use a
> PYTHONPATH he forces, leaving the registry out of it. Similarly,
> the woes I've had in trying to sell Python as a general Win32
> scripting tool at work mostly boil down to that there's no
> effortless way to do it that doesn't risk picking up info from--
> or forcing info onto --pre-existing or future distinct Python
> installations (in contrast, Perl "just works" in this respect).

In my Linux version, I went to the heart of the matter - 
getpath.c. It occurs to me that getpath.c might do better to 
follow a normal bootstrap process - ie,  create the absolute 
minimal sys.path required to go to the next step. Then the 
rest of what goes on in getpath.c could be written in Python. 
Maybe that Python code needs to get frozen in (to prevent 
bozos from destroying an installation by stepping on 
getpath.py), but it would make it a lot easier to create 
independent installations, and also reduce the variations 
between platforms at the C level. (Then again, I've never heard 
of anyone stepping on exceptions.py.)
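
A rough sketch of what the Python half of such a bootstrap could look
like, assuming the C side hands over nothing but the executable's
location.  The function name and the "lib" landmark are hypothetical;
this is not what getpath.c actually does.

```python
import os
import tempfile

def compute_sys_path(executable, landmark="lib"):
    """Walk up from the executable's directory until a landmark dir is found."""
    here = os.path.dirname(os.path.abspath(executable))
    while True:
        candidate = os.path.join(here, landmark)
        if os.path.isdir(candidate):
            return [candidate, os.path.join(candidate, "site-packages")]
        parent = os.path.dirname(here)
        if parent == here:      # reached the filesystem root, give up
            return []
        here = parent

# Demo on a throwaway tree: <root>/bin/python and <root>/lib
root = os.path.abspath(tempfile.mkdtemp())
os.makedirs(os.path.join(root, "bin"))
os.makedirs(os.path.join(root, "lib"))
found = compute_sys_path(os.path.join(root, "bin", "python"))
```

Because the search starts at the executable, two installations in
different directories each find their own library first.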

If some registry manipulation primitives were exposed (say, 
through ntpath) that would mean that Windows developers 
could (if they wanted) play by the MS rules with at least the 
option of not stepping on each other.
 

- Gordon


From jim@interet.com  Thu Dec  9 15:02:18 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Thu, 09 Dec 1999 10:02:18 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim>
Message-ID: <384FC47A.BB4DA517@interet.com>

Tim Peters wrote:

> Jim seems mostly interested in Win32-only to me, and his concerns haven't
> been about the mechanics of distribution but about how-- regardless of
> tool --to create a bulletproof Python installation by hook or by crook.

Not exactly.  I am interested in how to create a bullet-proof
installation.  But I am equally interested in Unix (especially Linux)
and dislike the current dichotomy in the code base.

Lately I have been more active in distribution via archive files.
Part of the solution is an archive file format which is identical on
Unix and Windows, and which can hold the Python library and packages
as single files.  For my own efforts on this see:

    ftp://ftp.interet.com/pub/pylib.html

This is an archive file format similar to Gordon's format, although
Gordon's work goes well beyond just file formats.  I currently have
fifth generation code for this format, and am adding features as
suggested by Fredrik Lundh.  I hope it gets considered as a candidate
for a Python standard format.

> Distutils hasn't talked about that at all (that I've seen, anyway);

Gordon, Greg Stein and I have discussed file formats before.  I think
it was on distutils.  Anyway that was months ago.

JimA


From guido@CNRI.Reston.VA.US  Thu Dec  9 16:17:18 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 11:17:18 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Thu, 09 Dec 1999 09:17:27 EST."
 <1267374045-37047016@hypernet.com>
References: <199912081707.MAA04242@eric.cnri.reston.va.us>
 <1267374045-37047016@hypernet.com>
Message-ID: <199912091617.LAA05742@eric.cnri.reston.va.us>

> [Guido]
> > > It's not Mark's fault, it's Microsoft's fault.  If you don't do
> > > things the way MS wants you to, experienced Windows users will
> > > gripe, misunderstand what you do, etc.
> [Tim] 
> > Something just occurred to me:  MS's guidelines aren't arbitrary,
> > they actually have very good reasons.  In the case of putting all
> > an app's crucial info in the Registry, it's the only way to allow
> > a site administrator to set policy and site options remotely (an
> > admin can fiddle other machines' registries remotely).  This
> > works very well indeed when there's only "one copy" of an app on
> > a machine (or at most one copy "per user").
[Gordon]
> And actually, the business about separate subtrees for the 
> machine's configuration and the user's configuration is pretty 
> clever. MS doesn't explain it well, and it gets misused, but 
> when done right, it's a lot simpler than the maze of .xxxrc files 
> you sometimes find in other OSes.

I agree.  And I am guilty of not even trying to find MS' explanation -- I
just looked in the registry at what other apps did and tried to mimic
that (plus what Mark had already done), without really knowing what I
was doing.  I now know a little better -- see the end of this message.

> In my Linux version, I went to the heart of the matter - 
> getpath.c. It occurs to me that getpath.c might do better to 
> follow a normal bootstrap process - ie,  create the absolute 
> minimal sys.path required to go to the next step. Then the 
> rest of what goes on in getpath.c could be written in Python. 
> Maybe that Python code needs to get frozen in (to prevent 
> bozos from destroying an installation by stepping on 
> getpath.py), but it would make it a lot easier to create 
> independent installations, and also reduce the variations 
> between platforms at the C level. (Then again, I've never heard 
> of anyone stepping on exceptions.py.)

Yes, this is exactly what was proposed in the thread on the Big Import
Rewrite.

> If some registry manipulation primitives were exposed (say, 
> through ntpath) that would mean that Windows developers 
> could (if they wanted) play by the MS rules with at least the 
> option of not stepping on each other.

That's a good idea.  These functions are already available through
Mark's win32api extension -- much of which will eventually (I hope
before 1.6 is out!) become part of the core distribution.

In the mean time, I've been thinking a bit more about how Python
should be using the Windows registry.  (It's clear to me that Python
should use the registry -- those who disagree can go build their own
Python distribution.)

The basic ideas of Python's current registry usage are sound: there's
a resource built into the DLL which is part of the key into the
registry used for all information.

The problem lies in which key is used.  All versions of Python 1.5.x
(1.5, 1.5.1, 1.5.2) use the same key!  This is a main cause of
trouble, because it means that different versions cannot peacefully
live together even if the user installs them into different
directories -- they will all use the registry keys of the last version
installed.  This, in turn, means that someone who writes a Python
application that has a dependency on a particular Python version (and
which application worth distributing doesn't :-) cannot trust that if
a Python installation is present, it is the right one.  But they also
cannot simply bundle the standard installer for the correct Python
version with their program, because its installation would overwrite
an existing Python installation, thus breaking some *other* Python apps
that the user might already have installed.

(There's a solution for app builders who are willing to do a lot of
work -- you can change the registry key resource in the DLL.  For
example, Alice comes with its own version of Python 1.5.1 and it uses
"1.5.1-alice" as its registry key.  The Alice installer installs
Python in a subdirectory of the Alice installation directory and
points the 1.5.1-alice registry entries there.  The problem is that
this is a lot of work for the average app builder.)
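
As a small illustration of the naming scheme, here is a sketch of how a
key could be made distinct per version and, optionally, per app.  The
"Software\Python\PythonCore" prefix is an assumption about the key
layout and the helper name is hypothetical; the sketch only builds the
string and does not touch the registry.

```python
def registry_key(version, app_tag=None):
    """Return a registry subkey name unique to a Python version (and app)."""
    # "1.5.1" alone would be shared by every 1.5.1 install; adding an app
    # tag such as "alice" gives that app a private set of entries.
    name = version if app_tag is None else "%s-%s" % (version, app_tag)
    return "Software\\Python\\PythonCore\\" + name

shared_151 = registry_key("1.5.1")
shared_152 = registry_key("1.5.2")
alice_key = registry_key("1.5.1", "alice")
```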

I thought a bit about how VB solves this.  I think that when you wrap
up a VB app, all the support code (mostly a big DLL) is wrapped
with it.  When the user runs the installer, the DLL is installed
(probably in the WINDOWS directory).  If a user installs several VB
apps built with the same VB version, they all attempt to install the
exact same DLL; of course the installers notice this and optimize it
away, keeping a reference count.  (Ignoring for now the fact that
those reference counts don't always work!)  If an app built with a
different VB version is installed, it has a DLL with a different name,
and that is installed separately.  Other support files, I presume, are
dealt with in much the same way.  Voila, there's the theory.

How can we do something similar for Python?

An app written in Python should need to install only three or four
files:

- a driver EXE to start the app
- a copy of the Python DLL
- the Python library in an archive
- the app code in an archive

The latter two could be combined into a single archive, but I propose
that we use two archives so that the DLL and the Python library
archive can be shared between installations of independent Python apps
as long as they use the exact same Python version and don't need
additional 3rd party packages.  (I believe that Jim A's proposal
combines the archives with the EXE and the DLL, reducing the number of
files to two.  That's fine too.)
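
A minimal sketch of the two-archive layout, under two loud assumptions:
that the interpreter can import straight from zip archives placed on
sys.path, and that the archives hold .py source rather than PYC files
(to keep the demo short).  All file and module names are hypothetical.

```python
import os
import sys
import tempfile
import zipfile

workdir = tempfile.mkdtemp()
lib_archive = os.path.join(workdir, "pylib.zip")  # shareable library archive
app_archive = os.path.join(workdir, "app.zip")    # per-app code archive

with zipfile.ZipFile(lib_archive, "w") as zf:
    zf.writestr("sharedlib_demo.py",
                "def greet():\n    return 'hello from the library archive'\n")
with zipfile.ZipFile(app_archive, "w") as zf:
    zf.writestr("app_demo.py",
                "import sharedlib_demo\n\n"
                "def main():\n    return sharedlib_demo.greet()\n")

# The driver's whole job: put both archives on the path, then run the app.
sys.path[:0] = [app_archive, lib_archive]
import app_demo
result = app_demo.main()
```

Keeping the library in its own archive is what lets a second app with
the same Python version reuse it unchanged.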

Is there a use for the registry here at all?  Maybe not.  (I notice
that VB seems to have a single registry entry, pointing to a DLL; all
other VB files also seem to live there.)

Complications:

- Some apps may need a custom extension module, which has to be
  installed as a PYD file.  So it seems that there needs to be a
  directory per app, and perhaps per version of the app (if the app
  distributor cares).

- Some apps need other, non-pyc files (e.g. data tables or help
  files); it would be handy if these could be stored in the archives as
  well.

- Some standard extension modules are in their own PYD files; these
  also need to be installed.  They aren't typically marked with a
  version, so perhaps a path directory per version of Python (if not per
  installed app) is wise.

- How to distribute an app that needs 3rd party stuff, e.g. Tcl/Tk, or
  PIL, or NumPy?  Their Python code can easily be wrapped up in another
  archive with a standard name incorporating a version number; but the
  required PYD and DLL files are a separate story.  (E.g. for Tkinter,
  you need _tkinter.pyd which links against tcl80.dll.)  Basically the
  same solution as for standard PYD files can work; the needed DLL files
  can be installed either systemwide (if they have a reliable version
  number in their name, like tcl80.dll) or in the per-app or per-package
  directory (like NumPy).

- Presumably, the archives will contain PYC files only.  This means
  that tracebacks will not show source code, only line numbers.  For Jim
  A, this is probably exactly what he wants (if the user gets a
  traceback, his "robust app" has miserably failed, and he takes it in
  pride that this doesn't happen).  But for some others, access to the
  sources could be essential.

  For example, I might want to distribute IDLE using this mechanism;
  users of IDLE who are curious about the standard library (or about
  IDLE itself) should be able to open the source for an arbitrary module
  (and maybe even edit it, although that's not a priority and perhaps
  should even be discouraged).  Library source access is an important
  feature of the IDLE debugger as well.

  A way out for IDLE is to install a classic distribution of the Python
  library sources, into the filesystem at an IDLE specific location.
  Other apps, with only the need for source code in tracebacks, might
  choose to have the PY files in the archives sitting next to the PYC
  files, and somehow the traceback mechanism should be accessing the
  archive to get a hold of the source.
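
  One way this could look, sketched under the assumption that the
  archive is a zip file read through the zipimport module; the archive
  and module names are made up.  The sketch only shows that the source
  text for a given traceback line number can be recovered from the
  archive.

```python
import os
import tempfile
import zipfile
import zipimport

archive = os.path.join(tempfile.mkdtemp(), "withsource.zip")
with zipfile.ZipFile(archive, "w") as zf:
    # The PY file is stored in the archive itself.
    zf.writestr("fragile_demo.py",
                "def boom():\n    raise ValueError('bad input')\n")

importer = zipimport.zipimporter(archive)
source = importer.get_source("fragile_demo")   # text of the stored PY file
source_lines = source.splitlines()
# A traceback that reports line 2 of fragile_demo can be matched to its text.
line_two = source_lines[1]
```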

And yes, I realize that Jim A's latest offering solves most of these
problems to a large extent -- well done.  (Jim, would you care to
comment on the issues that you don't address?  Will you address them
in a future version?)

Final notes:

There are two different problems here.  One is how to distribute
Python apps robustly to end users who don't particularly care about
Python.  This is Jim A's problem (and he has a solution that works for
him).  In general the solutions here try to isolate the installed app
from other Python installations.  I'm proposing that at least the DLL
and the Python library archive can probably be shared between apps
without reducing robustness if we keep track more carefully of version
numbers.

The other problem is how to distribute packages of Python and
extension modules for use by Python users.  These typically need to
drop into some existing Python installation.  This is Paul Dubois'
problem with NumPy (amongst others) and is the current focus of the
Distutils SIG.

However I believe that there could be a lot of common infrastructure
that would help us create better solutions for both problems.  For
package distribution, common infrastructure (a.k.a. standards) is
essential.  For app distribution, common infrastructure isn't so
important (since the solutions strive for total isolation, there's no
problem if different apps use different solutions).  However, this changes when
app creators want to distribute robust self-sufficient apps that use
3rd party packages -- then the 3rd party packages must allow being
packaged up using the app distribution creator of choice.

Solving this compound problem (creating package distributions that can
be redistributed easily as part of robust Python app distributions)
should be an important goal for the infrastructure we're building
here.  The Big Import Rewrite ought to add this to its list of
objectives if it isn't already on it.  My guess is that the solution
for this compound problem will increase the dependency of app
distribution tools on the package distribution infrastructure; which
to me seems like a Good Thing because it would lead to more code
sharing.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim@interet.com  Thu Dec  9 16:24:40 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Thu, 09 Dec 1999 11:24:40 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000701bf4210$10925a40$36a2143f@tim>
Message-ID: <384FD7C8.12832BF1@interet.com>

Tim Peters wrote:

> Something just occurred to me:  MS's guidelines aren't arbitrary, they
> actually have very good reasons.  In the case of putting all an app's
> crucial info in the Registry, it's the only way to allow a site
> administrator to set policy and site options remotely (an admin can fiddle
> other machines' registries remotely).  This works very well indeed when
> there's only "one copy" of an app on a machine (or at most one copy "per
> user").

The registry is still a bad idea because it lumps critical and app data
into single files and brings up the ugly problem of protecting
individual registry entries instead of just files.  Microsoft
should have put all app config into the app directory and provided
for remote admin of that.  But that is not really your point (just
ranting about the registry again).

> IOW, the three of us find getting path info out of the registry intolerable
> because we are in fact trying to do the opposite of what the registry
> mechanism was *designed* for:  we want perfect isolation, not perfect
> sharing.
> 
> This has come up on Python-Help a few times too, in the guise of someone
> installing a product that in turn installs an older version of Python, which
> in turn confuses another product that relies on features in a newer version
> of Python.

Or, in other words, no isolation is possible if critical info
depends on global data like PYTHONPATH or a _common_ registry
entry.  We could have different registry entries, but this is
confusing and not documented.

I think we can solve this with archive files in a way compatible
with Unix without going off on a Windows-only wavelength.  If the
archive file contains everything, and it is in the dir of the app,
and the app looks there and finds it, then it Just Works.

See also my reply to Skip.

JimA


From akuchlin@mems-exchange.org  Thu Dec  9 16:32:08 1999
From: akuchlin@mems-exchange.org (Andrew M. Kuchling)
Date: Thu, 9 Dec 1999 11:32:08 -0500 (EST)
Subject: [Python-Dev] Missing POSIX functions: the list
Message-ID: <199912091632.LAA09236@amarok.cnri.reston.va.us>

After poking around in the O'Reilly POSIX book, here's a list of POSIX
functions that don't seem to be available in Python.  Not all of them
seem worth supporting.   Ironically, Greg Ward's daemonize() Perl
subroutine, which started me on this, doesn't actually seem to need
anything that Python doesn't have.

I'm looking for corrections to the list; are there other POSIX
functions I've missed, or are some of them actually in Python?

I think implementing most of these functions is straightforward, with
the exception of opendir/readdir/closedir.

Worth adding?
=============
opendir(), readdir(), closedir() -- 
	   most of their functionality is available through
	   os.listdir(), but it might be useful to have a direct
	   interface.  Downside is that this would require a new
	   extension type for the C DIR struct.  My (lazy) inclination
	   is to not bother.

Worth adding:
=============

abort() -- used in Py_FatalError(), but not accessible to Python code

ctermid(), ctermid_r() -- returns the terminal pathname
	   -- probably just add ctermid(), but use ctermid_r() for
	   thread-safety
            
fpathconf(fd, name) -- Get configuration limit for a file
	    -- would need constants from unistd.h

getlogin() -- returns user's login name
	 -- could do something similar with pwd.getpwuid( os.getuid() )[0], but
	 getlogin() apparently looks in utmp

getgroups(gidsetsize, grouplist) -- Gets supplementary group IDs

pathconf(path, name) -- Gets config variables for a path
	    -- would need constants from unistd.h

sysconf(int name) -- Gets system configuration information
	    -- would need constants from unistd.h
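
For the getlogin() entry, a sketch of how the two approaches might be
combined.  It assumes an os module that exposes getlogin() and
getgroups(), which is exactly what the list above says is missing, so
both calls are guarded; login_name() is a hypothetical helper.

```python
import os

def login_name():
    """Best-effort login name: getlogin() first, then the passwd database."""
    try:
        return os.getlogin()            # may fail without a controlling tty
    except (OSError, AttributeError):
        pass
    try:
        import pwd                      # Unix only
        return pwd.getpwuid(os.getuid())[0]
    except (ImportError, KeyError, AttributeError):
        return None                     # no way to tell on this platform

name = login_name()
# Supplementary group IDs, when the platform provides them.
groups = os.getgroups() if hasattr(os, "getgroups") else []
```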

Not worth adding:
=================
clearerr() -- looks like fileobjects call clearerr() before raising errors

cuserid() -- returns user's login name
	  -- ORA book says "Do not use this function" -- removed in 1990 POSIX

difftime()
	  -- seems only required in C "because no addition properties
	  are defined for time_t" (Solaris man page)

tmpfile(), tmpnam() -- Create temp file, generate temp filename
		    -- Similar functionality available in tempfile.py

mblen(), mbstowcs(), mbtowc(), wcstombs(),  wctomb()
	 -- Multi-byte character functions: 
	 -- Don't bother; wait for the Unicode type.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
I'm sorry I became abusive just now ... calling you worms... I was just
speaking relatively, you understand.
    -- Dekko, in ZOT! #3


From jcw@equi4.com  Thu Dec  9 16:38:13 1999
From: jcw@equi4.com (Jean-Claude Wippler)
Date: Thu, 09 Dec 1999 17:38:13 +0100
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com>
Message-ID: <384FDAF5.C25C447C@equi4.com>

"James C. Ahlstrom" wrote:

[...]
>     ftp://ftp.interet.com/pub/pylib.html

Ouch - what's wrong with zip archives?

There are utilities to convert to/from zip, to re-pack, to mount zip
transparently so its entries look like regular files, FTP servers, etc.

Both Java (jar) and Tcl (Jan Nijtman's "Wrap") have adopted this format.

Zips would seem natural with JPython.  And suppose that scripting ever
starts to consolidate to a common scripting kernel (yah, well), do you
really want a system which is closing all doors to cross-fertilization?

Zip has an advantage over .tar.gz in that its table of contents is
available without having to decompress the whole kaboodle.

Your format has no checksum, which for deployment and long-term storage
can be important.

If you want a marshalled TOC, then why not add a manifest entry for it,
sort of like what ranlib does with ar?

You designed the format so archives can be concatenated without any tool
(other than "cat"), but this works just as well with zip files, as the
Tcl Wrap approach demonstrates.
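
To illustrate the last three points (table of contents, checksums,
concatenation), here is a small sketch using Python's zipfile module as
an assumed stand-in for whatever zip code ends up being used.  The stub
bytes and member names are made up.

```python
import io
import zipfile

payload = io.BytesIO()
with zipfile.ZipFile(payload, "w") as zf:
    zf.writestr("lib/spam.py", "x = 1\n")
    zf.writestr("lib/eggs.py", "y = 2\n")

# Simulate `cat stub archive.zip > combined`: arbitrary bytes, then the zip.
combined = io.BytesIO(b"#!stub executable bytes\n" + payload.getvalue())

with zipfile.ZipFile(combined) as zf:
    names = zf.namelist()       # comes from the central directory alone
    first_bad = zf.testzip()    # checks each member's CRC; None if all pass
    spam_source = zf.read("lib/spam.py")
```

The archive still opens with a stub glued on the front because the zip
directory is located from the end of the file.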

Allow me to very, very loosely paraphrase Guido here: sure, everyone can
design an archive format, but they are likely to make the same mistakes
all over again - so why not adopt a format which is tried and tested?

With all due respect - I sincerely hope you will reconsider and alter
your code to work with zip files.  It's probably a small adjustment?

Unless your *intent* is to create a diverging standard, of course...

-- Jean-Claude


From jim@interet.com  Thu Dec  9 16:46:35 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Thu, 09 Dec 1999 11:46:35 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <199912081707.MAA04242@eric.cnri.reston.va.us>
 <000701bf4210$10925a40$36a2143f@tim> <14415.23676.775163.786028@dolphin.mojam.com>
Message-ID: <384FDCEB.2226C1C1@interet.com>

Skip Montanaro wrote:

> MS believes that Python is the application.  Those of us writing
> Python programs view those programs as the applications, not the Python
> interpreter per se.

I think this is a good point.  Windows app programmers (mostly)
view Python as part of their app and try to install it in their
app directory.  Unix installs Python as a system app in multiple
versions and users use PATH to pick a version.  Unix users view
the Python interpreter as a system service which is needed for
running their app.

I think this is because a Windows app is a visual program,
and the Python release compiles to a console app (not really
a visual program).  So all (most?) Windows Python apps are
custom mains with Python as a component, but the stock
python.exe is not the main.
This makes it difficult to document a way to install Python
in the Unix fashion, since all apps need their own binary main
and python15.dll is the only thing in common.

IMHO archive files can solve this a lot more simply.

JimA


From guido@CNRI.Reston.VA.US  Thu Dec  9 16:55:40 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 11:55:40 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Thu, 09 Dec 1999 17:38:13 +0100."
 <384FDAF5.C25C447C@equi4.com>
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com>
 <384FDAF5.C25C447C@equi4.com>
Message-ID: <199912091655.LAA05928@eric.cnri.reston.va.us>

> "James C. Ahlstrom" wrote:
> 
> [...]
> >     ftp://ftp.interet.com/pub/pylib.html

Jean-Claude Wippler replied:

> Ouch - what's wrong with zip archives?
> 
> There are utilities to convert to/from zip, to re-pack, to mount zip
> transparently so its entries look like regular files, FTP servers, etc.
> 
> Both Java (jar) and Tcl (Jan Nijtman's "Wrap") have adopted this format.
> 
> Zips would seem natural with JPython.  And suppose that scripting ever
> starts to consolidate to a common scripting kernel (yah, well), do you
> really want a system which is closing all doors to cross-fertilization?
> 
> Zip has an advantage over .tar.gz in that its table of contents is
> available without having to decompress the whole kaboodle.
> 
> Your format has no checksum, which for deployment and long-term storage
> can be important.
> 
> If you want a marshalled TOC, then why not add a manifest entry for it,
> sort of like what ranlib does with ar?
> 
> You designed the format so archives can be concatenated without any tool
> (other than "cat"), but this works just as well with zip files, as the
> Tcl Wrap approach demonstrates.
> 
> Allow me to very, very loosely paraphrase Guido here: sure, everyone can
> design an archive format, but they are likely to make the same mistakes
> all over again - so why not adopt a format which is tried and tested?
> 
> With all due respect - I sincerely hope you will reconsider and alter
> your code to work with zip files.  It's probably a small adjustment?
> 
> Unless your *intent* is to create a diverging standard, of course...

Exactly my sentiments.  We have rough Python code to deal with zip
files; it's very rough because we got kind of carried away adding
features and ended up with spaghetti code :-(  But it's working code
nevertheless and we're offering it up for anyone in this group to
clean up (we could do that ourselves but it's not high on our current
priority list).

I don't know anything about Tcl Wrap.  I do know a great deal about
the ZIP format, but apparently I missed the concatenation feature.
How does this work?  Does that work for all zip tools, or just for the
ZIP reader in Wrap?  (I looked up how Jim A does it -- his central
directory at the end of the file contains the total size of the data
covered by that directory, so he seeks back to the beginning of it and
sees if another magic number precedes it; and so on.  Very simple.)

I quickly looked at the Wrap page; it shows how to access data files
stored in the archive.  Question: does the wrap::open code go out to
the regular filesystem if it finds there's no wrap archive?  That
would be handy so you can test the code in its unwrapped form without
change.  Python needs this too.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gward@cnri.reston.va.us  Thu Dec  9 17:12:00 1999
From: gward@cnri.reston.va.us (Greg Ward)
Date: Thu, 9 Dec 1999 12:12:00 -0500
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Thu, Dec 09, 1999 at 11:32:08AM -0500
References: <199912091632.LAA09236@amarok.cnri.reston.va.us>
Message-ID: <19991209121159.B20179@cnri.reston.va.us>

On 09 December 1999, Andrew M. Kuchling said:
> After poking around in the O'Reilly POSIX book, here's a list of POSIX
> functions that don't seem to be available in Python.  Not all of them
> seem worth supporting.   Ironically, Greg Ward's daemonize() Perl
> subroutine, which started me on this, doesn't actually seem to need
> anything that Python doesn't have.

I think I already pointed this your way, but don't forget the man page
for Perl's POSIX module: "perldoc POSIX".  I suspect POSIX functions
that don't make sense in Perl also don't make sense in Python.

I agree with all your assessments about what's worth adding and what's
not, and that {close,read,open}dir() are questionable and probably not
worth the bother.  Random thoughts:

> abort() -- used in Py_FatalError(), but not accessible to Python code

Would this do the same as in C, ie. terminate the process and dump core?

> getlogin() -- returns user's login name
> 	 -- could do something similar with pwd.getpwuid( os.getuid() )[0], but
> 	 getlogin() apparently looks in utmp

With a documentation proviso that utmp is very old-fashioned, and you
really should do the getuid() thing unless you definitely want to get
the login ID from utmp.  Perhaps an alternate "getlogin" (different
name?) that does the getuid() thing could be provided.

        Greg


From guido@CNRI.Reston.VA.US  Thu Dec  9 17:16:03 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 12:16:03 -0500
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: Your message of "Thu, 09 Dec 1999 12:12:00 EST."
 <19991209121159.B20179@cnri.reston.va.us>
References: <199912091632.LAA09236@amarok.cnri.reston.va.us>
 <19991209121159.B20179@cnri.reston.va.us>
Message-ID: <199912091716.MAA06063@eric.cnri.reston.va.us>

> > getlogin() -- returns user's login name
> > 	 -- could do something similar with pwd.getpwuid( os.getuid() )[0], but
> > 	 getlogin() apparently looks in utmp
> 
> With a documentation proviso that utmp is very old-fashioned, and you
> really should do the getuid() thing unless you definitely want to get
> the login ID from utmp.  Perhaps an alternate "getlogin" (different
> name?) that does the getuid() thing could be provided.

There's the getpass module which has a getuser() function that looks
in various env vars and if all else fails uses getuid() and pwd.

If the goal is to get the user ID without being fooled, using
os.getuid() or os.geteuid() directly seems to be the right thing to
do; I don't see the need for a shorthand for
pwd.getpwuid(os.getuid())[0] (which is what getuser() uses).
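
For readers comparing the two routes discussed here, a minimal sketch
follows.  It assumes a POSIX system where the pwd module is available;
the helper name login_name_from_uid is made up for this illustration
and is not part of any library.

```python
import getpass
import os

try:
    import pwd  # POSIX only; not available on Windows
except ImportError:
    pwd = None


def login_name_from_uid():
    """Return the password-database name for the real uid, or None.

    This is the pwd.getpwuid(os.getuid())[0] idiom quoted in the thread.
    """
    if pwd is None:
        return None
    try:
        return pwd.getpwuid(os.getuid())[0]
    except KeyError:
        # The uid has no entry in the password database.
        return None


if __name__ == "__main__":
    print("from uid:    ", login_name_from_uid())
    try:
        # getpass.getuser() consults environment variables first,
        # so it can be influenced by the caller's environment.
        print("getpass says:", getpass.getuser())
    except Exception as exc:
        print("getpass.getuser() failed:", exc)
```

The two answers can differ when the environment variables have been
changed, which is the "being fooled" case mentioned above.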

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@CNRI.Reston.VA.US  Thu Dec  9 17:18:10 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 12:18:10 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Thu, 09 Dec 1999 10:02:18 EST."
 <384FC47A.BB4DA517@interet.com>
References: <000301bf4206$b39e5b80$36a2143f@tim>
 <384FC47A.BB4DA517@interet.com>
Message-ID: <199912091718.MAA06087@eric.cnri.reston.va.us>

[Jim A]
> Lately I have been more active in distribution via archive files.
> Part of the solution is an archive file format which is identical on
> Unix and Windows, and which can hold the Python library and packages
> as single files.  For my own efforts on this see:
> 
>     ftp://ftp.interet.com/pub/pylib.html

Apart from agreeing with Jean-Claude's rant about inventing a new
archive format, I think this is a good proposal because it is very
clear about the problem it tries to solve and doesn't get distracted
by other issues.  I also commend Jim for building upon Greg Stein's
imputil (like Gordon did).  I wish I could present a solution this
simple as The Standard Way, but (as explained in my long post earlier
today) there just are so many wrinkles that I'd rather hold out for
the Right Solution...  But I've taken good notice of Jim's solution.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From beazley@cs.uchicago.edu  Thu Dec  9 17:16:57 1999
From: beazley@cs.uchicago.edu (David Beazley)
Date: Thu, 9 Dec 1999 11:16:57 -0600 (CST)
Subject: [Python-Dev] Missing POSIX functions: the list
References: <199912091632.LAA09236@amarok.cnri.reston.va.us>
 <19991209121159.B20179@cnri.reston.va.us>
Message-ID: <199912091716.LAA15624@gargoyle.cs.uchicago.edu>

Greg Ward writes:
> 
> I think I already pointed this your way, but don't forget the man page
> for Perl's POSIX module: "perldoc POSIX".  I suspect POSIX functions
> that don't make sense in Perl also don't make sense in Python.
> 
> I agree with all your assessments about what's worth adding and what's
> not, and that {close,read,open}dir() are questionable and probably not
> worth the bother.  Random thoughts:
> 

I disagree.  I think that the POSIX module should strive to be as
complete as possible--even if certain functions are closely related
to other functionality in the library (tmpfile for instance).  I suspect
that this sort of thing is probably the cause of the missing
functionality in the current library (as in, "why would anyone want to
do that?" when in fact there may be a perfectly good reason in certain
situations).  

> > abort() -- used in Py_FatalError(), but not accessible to Python code
> 
> Would this do the same as in C, ie. terminate the process and dump core?
> 

Sure, why not?  This might be a useful thing to do every so
often---when trying to figure out what's wrong with a C extension
module for instance.
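
As a rough illustration of what such a call does, here is a sketch
using os.abort() as it exists in present-day Python.  It runs the
abort in a child interpreter so the calling process survives; the
helper name run_child_that_aborts is an assumption of this example.

```python
import signal
import subprocess
import sys


def run_child_that_aborts():
    """Run a child interpreter that calls os.abort(); return its status."""
    proc = subprocess.run(
        [sys.executable, "-c", "import os; os.abort()"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return proc.returncode


if __name__ == "__main__":
    status = run_child_that_aborts()
    if status < 0:
        # On POSIX a negative status is the number of the fatal signal.
        print("child killed by", signal.Signals(-status).name)
    else:
        print("child exited with status", status)
```

Whether a core file is actually written depends on the platform and on
resource limits, so the sketch only inspects the exit status.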

Cheers,

Dave


From jim@interet.com  Thu Dec  9 17:43:57 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Thu, 09 Dec 1999 12:43:57 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com>
Message-ID: <384FEA5D.A07F23EC@interet.com>

Jean-Claude Wippler wrote:

> Ouch - what's wrong with zip archives?

Thanks very much for looking over the format.

In general Zip archives store whole branches of a file
system.  A Python ./Lib zip archive would contain:

  N:/python/Python-1.5.2/Lib/string.pyc
  N:/python/Python-1.5.2/Lib/os.pyc
  N:/python/Python-1.5.2/Lib/copy.pyc
  N:/python/Python-1.5.2/Lib/test/testall.pyc

Zip archives are isomorphic to branches of a file system.
That means there must be a sys.path for each zip archive file.
How would this be specified?

The archive format stores modules as dotted names, just as they
appear in the import statement.  The search path is "." in every
archive file by definition.  The import statement "import foo"
just results in a dictionary lookup for key "foo", not a search
through a zip directory along a local search path for "foo.something"
where "something" can be pyc, pyo, py, etc.

The intent was to link the archives to the import statement, not
re-create a directory tree.  It borrowed this feature from
the archive formats of Greg and Gordon.
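
To make the "dictionary lookup" idea concrete, here is a toy sketch.
It is NOT the pylib format: the layout (blobs, then a marshalled table
of contents, then an 8-byte trailer), the magic value and all function
names are assumptions invented for this illustration.

```python
import marshal
import struct

MAGIC = b"TOYA"      # made-up magic number for this sketch
TRAILER = "<4sL"     # magic + length of the marshalled TOC (8 bytes)


def build_archive(modules):
    """Pack {dotted_name: bytes} as blobs + marshalled TOC + trailer."""
    body = b""
    toc = {}
    for name, blob in modules.items():
        toc[name] = (len(body), len(blob))   # (offset, length)
        body += blob
    toc_bytes = marshal.dumps(toc)
    return body + toc_bytes + struct.pack(TRAILER, MAGIC, len(toc_bytes))


def load_toc(data):
    """Read the TOC by looking only at the end of the data."""
    magic, toc_len = struct.unpack(TRAILER, data[-8:])
    if magic != MAGIC:
        raise ValueError("not a toy archive")
    toc_start = len(data) - 8 - toc_len
    return marshal.loads(data[toc_start:toc_start + toc_len])


def find_module(data, toc, dotted_name):
    """One dictionary lookup; None means "not mine"."""
    entry = toc.get(dotted_name)
    if entry is None:
        return None
    offset, length = entry
    return data[offset:offset + length]
```

An importer built this way can decline a name with a single failed
dictionary lookup, without probing for .pyc, .pyo or .py variants.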

> There are utilities to convert to/from zip, to re-pack, to mount zip
> transparently so its entries look like regular files, FTP servers, etc.

Basic operations (to, from, repack) are easy in Python.

> Both Java (jar) and Tcl (Jan Nijtman's "Wrap") have adopted this format.

Hmmm....
 
> Your format has no checksum, which for deployment and long-term storage
> can be important.

Actually the pylib.py "dir()" method reads all *.pyc with marshal,
and I am depending on marshal to object to bad data and also
out-of-date magic numbers.  But this is a good point.

> If you want a marshalled TOC, then why not add a manifest entry for it,
> sort of like what ranlib does with ar?

Sorry, I don't understand.  Please explain.

> You designed the format so archives can be concatenated without any tool
> (other than "cat"), but this works just as well with zip files, as the
> Tcl Wrap approach demonstrates.

Are you saying that cat zip1.zip zip2.zip > myzip.zip works?

An important feature is the ability to concatenate to a binary:
  cat python.exe zip1.zip > myapp.exe
Searching for this isn't fast unless magic numbers are at the
end.  Are zip files recognizable from the end (I don't know)?
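
One way to explore that question with today's standard library is
sketched below.  It relies on the zip "end of central directory"
record, which sits at the end of the file and starts with the
signature PK\x05\x06.  The helper names and the fake executable stub
are assumptions of this example, and the backwards search is naive (it
ignores zip64 and could be misled by an archive comment containing the
signature).

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"
EOCD_FORMAT = "<4s4H2LH"                   # fixed part of the end record
EOCD_SIZE = struct.calcsize(EOCD_FORMAT)   # 22 bytes


def find_end_record(data):
    """Return (position, entry_count, dir_size, dir_offset) or None."""
    pos = data.rfind(EOCD_SIGNATURE)
    if pos < 0 or pos + EOCD_SIZE > len(data):
        return None
    fields = struct.unpack(EOCD_FORMAT, data[pos:pos + EOCD_SIZE])
    return pos, fields[4], fields[5], fields[6]


def make_zip(entries):
    """Build an in-memory zip from {name: bytes}."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, payload in entries.items():
            zf.writestr(name, payload)
    return buf.getvalue()


if __name__ == "__main__":
    stub = b"\x7fFAKE-EXECUTABLE" * 100    # stands in for python.exe
    combined = stub + make_zip({"string.pyc": b"not real bytecode"})
    print(find_end_record(combined))
    # zipfile itself copes with data prepended to the archive.
    print(zipfile.ZipFile(io.BytesIO(combined)).namelist())
```

Subtracting the directory size and directory offset from the record's
position gives the length of whatever was prepended, which is how a
reader can find the start of the archive inside the combined file.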

> Allow me to very, very loosely paraphrase Guido here: sure, everyone can
> design an archive format, but they are likely to make the same mistakes
> all over again - so why not adopt a format which is tried and tested?
> 
> With all due respect - I sincerely hope you will reconsider and alter
> your code to work with zip files.  It's probably a small adjustment?
> 
> Unless your *intent* is to create a diverging standard, of course...

The intent is to create a standard but not a diverging standard.

Are there any zip experts out there?  Can zip files satisfy all the
design requirements I listed in pylib.html?  Is there zip code
available?  All my code is in Python.

JimA


From jcw@equi4.com  Thu Dec  9 17:57:33 1999
From: jcw@equi4.com (Jean-Claude Wippler)
Date: Thu, 09 Dec 1999 18:57:33 +0100
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com>
 <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us>
Message-ID: <384FED8D.3C535D38@equi4.com>

Guido van Rossum wrote:
> 
> [... my not-really-meant-as-rant about adopting zip as format ...]
>
[zip concatenation feature]

> How does this work?  Does that work for all zip tools, or just for the
> ZIP reader in Wrap?  (I looked up how Jim A does it -- his central
> directory at the end of the file contains the total size of the data
> covered by that directory, so he seeks back to the beginning of it and
> sees if another magic number precedes it; and so on.  Very simple.)

Same for Wrap.  Standard tools would not see the preceding ZIP groups.

In terms of maintenance, I'd avoid this trick.  I merely wanted to point
out that zip archives can be stacked, if the reader is set up for it.
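
A reader "set up for it" might work roughly like the sketch below.
This is not Wrap's code; it is a guess at the general technique using
the standard zip end record, and it assumes each stacked archive has
its central directory immediately before its end record and no zip64
extensions.  All names are invented for the example.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"
EOCD_FORMAT = "<4s4H2LH"
EOCD_SIZE = struct.calcsize(EOCD_FORMAT)   # 22 bytes


def make_zip(entries):
    """Build an in-memory zip from {name: bytes}."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, payload in entries.items():
            zf.writestr(name, payload)
    return buf.getvalue()


def split_stacked(data):
    """Return the stacked archives as byte strings, first one first."""
    archives = []
    end = len(data)
    while end > 0:
        pos = data.rfind(EOCD_SIGNATURE, 0, end)
        if pos < 0 or pos + EOCD_SIZE > end:
            break
        fields = struct.unpack(EOCD_FORMAT, data[pos:pos + EOCD_SIZE])
        dir_size, dir_offset, comment_len = fields[5], fields[6], fields[7]
        if pos + EOCD_SIZE + comment_len != end:
            break          # trailing bytes this sketch does not understand
        start = pos - dir_size - dir_offset
        if start < 0:
            break
        archives.append(data[start:end])
        end = start        # look for another archive just before this one
    archives.reverse()
    return archives


if __name__ == "__main__":
    stacked = make_zip({"a.py": b"A"}) + make_zip({"b.py": b"B"})
    for part in split_stacked(stacked):
        print(zipfile.ZipFile(io.BytesIO(part)).namelist())
```

Each slice is a complete archive on its own, so ordinary zip code can
open it once the reader has found the boundaries.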

> Question: does the wrap::open code go out to the regular filesystem
> if it finds there's no wrap archive?  That would be handy so you can
> test the code in its unwrapped form without change.

IIRC, Wrap overrides "open" for embedded entries as "file.zip/abc.py".
There's more being developed in this area: a "virtual file system" which
lets you mount archives and such (VFS by Matt Newman, mentioned with his
permission), so that the file-system model can be extended to navigate
into a lot more things than real file systems.

Andrew Kuchling's post hints at another tangent: opendir/readdir is of
course simply an enumeration.  There's a lot of "genericity" lurking in
scanning across file systems, trees, networks, and resources in general.

<minirant> The filesystem <-> OO dichotomy needs a review. </minirant>

> Python needs this too.

<voice location=in-the-desert level=timid>
Concepts like these have a lot to offer - and would make even more sense
if they were done in a way which benefits multiple scripting languages.
Feel free to reply by email if you ever want to further discuss this.
</voice>

-- Jean-Claude


From fdrake@acm.org  Thu Dec  9 18:10:44 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 9 Dec 1999 13:10:44 -0500 (EST)
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us>
References: <199912091632.LAA09236@amarok.cnri.reston.va.us>
Message-ID: <14415.61604.415084.520092@weyr.cnri.reston.va.us>

Andrew M. Kuchling writes:
 > After poking around in the O'Reilly POSIX book, here's a list of POSIX
 > functions that don't seem to be available in Python.  Not all of them
 > seem worth supporting.   Ironically, Greg Ward's daemonize() Perl

  I think your assessment is reasonable.  I looked at posixmodule.c
and note also that the functions use PyArg_Parse() and PyArg_NoArgs()
instead of using PyArg_ParseTuple().  The advantage of
PyArg_ParseTuple() is that the name of the function can be specified
for inclusion in TypeError messages when the arguments are not of the
right type.
  I'm doing some work to correct this now.  I've also added ctermid(), 
and will try to add at least a few more before I check in the changes.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From bwarsaw@cnri.reston.va.us (Barry A. Warsaw)  Thu Dec  9 18:17:35 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw)
Date: Thu, 9 Dec 1999 13:17:35 -0500 (EST)
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim>
 <384FC47A.BB4DA517@interet.com>
 <384FDAF5.C25C447C@equi4.com>
 <199912091655.LAA05928@eric.cnri.reston.va.us>
 <384FED8D.3C535D38@equi4.com>
Message-ID: <14415.62015.856931.750279@anthem.cnri.reston.va.us>

>>>>> "JW" == Jean-Claude Wippler <jcw@equi4.com> writes:

    JW> Same for Wrap.  Standard tools would not see the preceding ZIP
    JW> groups.

    JW> In terms of maintenance, I'd avoid this trick.  I merely
    JW> wanted to point out that zip archives can be stacked, if the
    JW> reader is set up for it.

I agree.  I can't recall the details now, but I had a lot of problems
with zip concatenation in JPython.  I think at least some of the older
Java tools for grokking zips don't work with concatenation.

-Barry


From guido@CNRI.Reston.VA.US  Thu Dec  9 18:21:42 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 13:21:42 -0500
Subject: [Python-Dev] Virtual filesystem APIs
In-Reply-To: Your message of "Thu, 09 Dec 1999 18:57:33 +0100."
 <384FED8D.3C535D38@equi4.com>
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us>
 <384FED8D.3C535D38@equi4.com>
Message-ID: <199912091821.NAA06209@eric.cnri.reston.va.us>

Jean-Claude Wippler:
> There's more being developed in this area: a "virtual file system" which
> lets you mount archives and such (VFS by Matt Newman, mentioned with his
> permission), so that the file-system model can be extended to navigate
> into a lot more things than real file systems.

I agree.  We have experimented with this a bunch in the Knowbot
software, where we have some code that wants to look at a "filesystem"
but could be talking to some kind of filesystem emulation across an
RPC connection or alternatively could be accessing a zip file.  Our
conclusion is that a convenient interface is modeled after (a subset
of) the os and os.path functionality.  In fact, the only thing you
would need to add to the os module would be a function to open a file
object; I've proposed to add os.fopen() as an alias for the built-in
open().
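
A minimal sketch of such an interface is shown below.  It is not the
Knowbot code; the class names RealFS and ZipFS, the helper read_all,
and the choice of just listdir() and open() are assumptions made for
this illustration.

```python
import io
import os
import zipfile


class RealFS:
    """Filesystem object backed by a directory on disk."""

    def __init__(self, root):
        self.root = root

    def listdir(self, path=""):
        return sorted(os.listdir(os.path.join(self.root, path)))

    def open(self, path):
        return open(os.path.join(self.root, path), "rb")


class ZipFS:
    """Same two operations, backed by an open zipfile.ZipFile."""

    def __init__(self, zf):
        self.zf = zf

    def listdir(self, path=""):
        prefix = path.strip("/")
        prefix = prefix + "/" if prefix else ""
        children = set()
        for name in self.zf.namelist():
            if name.startswith(prefix) and name != prefix:
                # Keep only the first component below the prefix.
                children.add(name[len(prefix):].split("/", 1)[0])
        return sorted(children)

    def open(self, path):
        return self.zf.open(path.strip("/"))


def read_all(fs, path):
    """Code written against the interface does not care which one it gets."""
    with fs.open(path) as f:
        return f.read()
```

Because both classes answer the same calls, code can be tested against
a plain directory and later pointed at an archive without change.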

The idea that you could mount one VFS inside another is nice, although
I'm not sure how practical it is.  For one thing, in our fs code,
os.path.sep and friends (e.g. os.path.normcase behavior) were set per
filesystem; what would happen if you mounted a Unix filesystem in an
NT tree?  Doing the translations is hard too; e.g. on a Mac fs, the
separator is ':' and a '/' can be part of a filename -- do you simply
swap them?  What if a Mac file has both '/' and '\'  and you mount it
on a Windows FS?  I'd rather stay away from this.
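
The difficulty can be seen in a small sketch using the posixpath and
ntpath modules from the standard library.  The helper names
naive_convert and components are made up for the example, and swapping
separators is shown only to demonstrate why it is lossy.

```python
import ntpath
import posixpath


def naive_convert(path, src, dst):
    """Swap separators only: split on src.sep, join with dst.sep."""
    return dst.sep.join(path.split(src.sep))


def components(path, flavour):
    """Split a path into components under one flavour's rules."""
    if flavour.altsep:
        path = path.replace(flavour.altsep, flavour.sep)
    return [part for part in path.split(flavour.sep) if part]


if __name__ == "__main__":
    # On a Unix filesystem a backslash is an ordinary filename character.
    unix_path = "docs/we\\ird.txt"
    print(components(unix_path, posixpath))   # two components
    as_windows = naive_convert(unix_path, posixpath, ntpath)
    print(components(as_windows, ntpath))     # now three components
```

The filename changes meaning after conversion, so a round trip back to
the original path is no longer possible without extra escaping rules.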

On the other hand the VFS concept could be used as a totally different
solution to the sys.importers vs. sys.path 

> Andrew Kuchling's post hints at another tangent: opendir/readdir is of
> course simply an enumeration.  There's a lot of "genericity" lurking in
> scanning across file systems, trees, networks, and resources in general.

I'd still rather see listdir() (which our sample virtual FS API
supported).  I don't think it necessarily makes sense to do this on a
more generic basis -- other trees and graphs have sufficiently
different semantics that using a FS like API doesn't necessarily cut
it.  Take for example the Windows registry -- looks a lot like a
filesystem, doesn't it?  Yet it has one fundamental property that a
typical FS doesn't: directory nodes can have data *and* children...

I've written a tree widget and found that it's remarkably hard to come
up with a workable API to talk to trees *in general*.  Trees are a
universal concept, but code sharing is still elusive...  Perhaps
because the concept is so simple?

> <minirant> The filesystem <-> OO dichotomy needs a review. </minirant>

I think that my proposal above should cover this.  (We looked briefly
at doing a similar thing for Java, and found that it's actually harder
there -- they have all these nice objects representing paths, but it's
not easily subclassable to represent paths in some virtual
filesystem.)

> Concepts like these have a lot to offer - and would make even more sense
> if they were done in a way which benefits multiple scripting languages.
> Feel free to reply by email if you ever want to further discuss this.

I see only very little hope for this point of view, but I will refrain
from commenting further.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm@hypernet.com  Thu Dec  9 18:23:14 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Thu, 9 Dec 1999 13:23:14 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <384FEA5D.A07F23EC@interet.com>
Message-ID: <1267359311-37934097@hypernet.com>

James C. Ahlstrom wrote:

> Jean-Claude Wippler wrote:
> 
> > Ouch - what's wrong with zip archives?

> In general Zip archives store whole branches of a file
> system.  

> The archive format stores modules as dotted names, just as they
> appear in the import statement.  The search path is "." in every
> archive file by definition.  The import statement "import foo"
> just results in a dictionary lookup for key "foo", not a search
> through a zip directory along a local search path for
> "foo.something" where "something" can be pyc, pyo, py, etc.
> 
> The intent was to link the archives to the import statement, not
> re-create a directory tree.  It borrowed this feature from the
> archive formats of Greg and Gordon.

As I've stated before, I have 2 archive formats. This may seem 
a needless complication, but my suspicion is that sooner or 
later, people will want 2 different kinds.

One is a .pyz format, which corresponds closely to Jim's .pyl 
format (with a number of minor differences: it's compressed, 
the archive as a whole has the Python magic number, instead 
of each entry, and it's not designed for concatenation).
 
The other is like a zip, and probably should be zip format.  It's 
designed to hold _anything_, and can be manipulated from C 
and from Python. It can be concatenated and / or embedded 
(and the inner one opened without extraction). Its table of 
contents is more file-system like. Importing from one is 
slower, but that's not really what it's for. It's for packaging up 
arbitrary resources. Like .pyz's, or Tcl/Tk for Tkinter apps, or 
configuration files.

Jim is correct that a good importer (which can say "No, it's not 
mine" as quickly as possible) is better satisfied by a simple 
dictionary lookup than fooling with file extensions and 
directories (virtual or real).

> > If you want a marshalled TOC, then why not add a manifest entry
> > for it, sort of like what ranlib does with ar?
> 
> Sorry, I don't understand.  Please explain.

The table of contents is just another entry.
 
> An important feature is the ability to concatenate to a binary:
>   cat python.exe zip1.zip > myapp.exe
> Searching for this isn't fast unless magic numbers are at the
> end.  Are zip files recognizable from the end (I don't know)?

Where do you think we got this idea?

> Are there any zip experts out there?  Can zip files satisfy all
> the design requirements I listed in pylib.html?  Is there zip
> code available?  All my code is in Python.

Hmm. My bookmark appears to be dead (I was there not long 
ago):
http://www.cubic.org/source/archive/fileform/packers/appnote.txt

There have been several references on this list to Guido et al 
having some Python / zip code.


- Gordon


From guido@CNRI.Reston.VA.US  Thu Dec  9 18:23:27 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 13:23:27 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Thu, 09 Dec 1999 13:17:35 EST."
 <14415.62015.856931.750279@anthem.cnri.reston.va.us>
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us> <384FED8D.3C535D38@equi4.com>
 <14415.62015.856931.750279@anthem.cnri.reston.va.us>
Message-ID: <199912091823.NAA06243@eric.cnri.reston.va.us>

> I agree.  I can't recall the details now, but I had a lot of problems
> with zip concatenation in JPython.  I think at least some of the older
> Java tools for grokking zips don't work with concatenation.

The Java "jar" tool mostly ignores the central directory -- it seems
to read the archive from the front, using the local header records,
and ignoring the central directory (of course it writes one when it
creates an archive).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@CNRI.Reston.VA.US  Thu Dec  9 18:32:15 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 13:32:15 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Thu, 09 Dec 1999 12:43:57 EST."
 <384FEA5D.A07F23EC@interet.com>
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com>
 <384FEA5D.A07F23EC@interet.com>
Message-ID: <199912091832.NAA06287@eric.cnri.reston.va.us>

> In general Zip archives store whole branches of a file
> system.  A Python ./Lib zip archive would contain:
> 
>   N:/python/Python-1.5.2/Lib/string.pyc
>   N:/python/Python-1.5.2/Lib/os.pyc
>   N:/python/Python-1.5.2/Lib/copy.pyc
>   N:/python/Python-1.5.2/Lib/test/testall.pyc
> 
> Zip archives are isomorphic to branches of a file system.
> That means there must be a sys.path for each zip archive file.
> How would this be specified?

Not true.  It's easy (using the proper Zip tools) to create an archive
containing this instead:

  string.pyc
  os.pyc
  copy.pyc
  testall.pyc

Thus the entire archive is considered the directory.  The Java "jar"
tool uses this approach.  It's also easy to have packages in there
(again this is what Java does):

  test/
  test/__init__.pyc
  test/pystone.pyc
  test_support.pyc
  (etc.)

> The archive format stores modules as dotted names, just as they
> appear in the import statement.  The search path is "." in every
> archive file by definition.  The import statement "import foo"
> just results in a dictionary lookup for key "foo", not a search
> through a zip directory along a local search path for "foo.something"
> where "something" can be pyc, pyo, py, etc.
> 
> The intent was to link the archives to the import statement, not
> re-create a directory tree.  It borrowed this feature from
> the archive formats of Greg and Gordon.

Maybe you've gone overboard.  The time it takes to translate the dots
into slashes really isn't the big deal.
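
For illustration, the translation could look like the sketch below.
The function name module_to_entry and the rule of trying "name.pyc"
before "name/__init__.pyc" are assumptions of this example, not a
description of any existing importer.

```python
import io
import zipfile


def module_to_entry(names, dotted_name):
    """Map a dotted module name to an archive entry, or None."""
    base = dotted_name.replace(".", "/")
    for candidate in (base + ".pyc", base + "/__init__.pyc"):
        if candidate in names:
            return candidate
    return None


def make_flat_archive():
    """Build an archive whose root plays the role of a sys.path entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in ("string.pyc", "test/__init__.pyc", "test/pystone.pyc"):
            zf.writestr(name, b"placeholder, not real bytecode")
    return zipfile.ZipFile(io.BytesIO(buf.getvalue()))


if __name__ == "__main__":
    names = set(make_flat_archive().namelist())
    for mod in ("string", "test", "test.pystone", "nosuch"):
        print(mod, "->", module_to_entry(names, mod))
```

Keeping the entry names in a set makes each probe a hash lookup, so
the cost stays close to the dictionary approach described earlier.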

> Are there any zip experts out there?  Can zip files satisfy all the
> design requirements I listed in pylib.html?  Is there zip code
> available?  All my code is in Python.

Yes (all of us here at CNRI), yes, yes (we have the spaghetti code).
While zip files support compression, they support uncompressed files
as well and we could go either way.  Their most popular compression
format is gzip compatible and can be read and written with the zlib
module, which is in the standard Python distribution (even on Windows)
-- though to build it you need the zlib C library which is of course
external (but solid open source).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Thu Dec  9 18:41:22 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 9 Dec 1999 13:41:22 -0500 (EST)
Subject: [Python-Dev] Virtual filesystem APIs
In-Reply-To: <199912091821.NAA06209@eric.cnri.reston.va.us>
References: <000301bf4206$b39e5b80$36a2143f@tim>
 <384FC47A.BB4DA517@interet.com>
 <384FDAF5.C25C447C@equi4.com>
 <199912091655.LAA05928@eric.cnri.reston.va.us>
 <384FED8D.3C535D38@equi4.com>
 <199912091821.NAA06209@eric.cnri.reston.va.us>
Message-ID: <14415.63442.92911.748132@weyr.cnri.reston.va.us>

Guido van Rossum writes:
 > os.path.sep and friends (e.g. os.path.normcase behavior) were set per

  Hah!  Caught you in public!  "sep" & friends are defined in the os
module; this is where the separation breaks down.
  I think these should be located in os.path, and os can just pick
them up from there to be backward compatible.
  os.pathsep is a problem, somewhat; it is related to os.sep, but is
very different in many ways.  I don't think there's a good way to deal 
with it.

 > filesystem; what would happen if you mounted a Unix filesystem in an
 > NT tree?  Doing the translations is hard too; e.g. on a Mac fs, the
 > separator is ':' and a '/' can be part of a filename -- do you simply
 > swap them?  What if a Mac file has both '/' and '\'  and you mount it
 > on a Windows FS?  I'd rather stay away from this.

  And this is tightly related to the sep/pathsep problem as well.  I
agree, we should stay away from it.

 > I think that my proposal above should cover this.  (We looked briefly
 > at doing a similar thing for Java, and found that it's actually harder
 > there -- they have all these nice objects representing paths, but it's
 > not easily subclassable to represent paths in some virtual

  But it was easy to create a set of interfaces with a reasonable API; 
getting back to the "typical" Java classes was what really changed the 
most.
  For those of us not working on the KOE:  I set up Filesystem and
FSFile interfaces; the Filesystem represented the entire filesystem
and the FSFile was very similar to the java.io.File class, but had
additional methods to get input and output stream objects (of the
standard Java flavor); all the buffering and such could be wrapped on
top of that just like any other Java I/O.
  The specific application was to provide access to an isolated
directory structure which untrusted code "owned", but ensured that
parent directories were unreachable.  Additional security checks can
be worked into such a structure as applicable.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From fdrake@acm.org  Thu Dec  9 19:06:32 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 9 Dec 1999 14:06:32 -0500 (EST)
Subject: [Python-Dev] posix module test suite
Message-ID: <14415.64952.780974.8124@weyr.cnri.reston.va.us>

  There's not a test for the posix or os modules; if anyone would like 
to contribute one, this would be a good time!  ;-)


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From jcw@equi4.com  Thu Dec  9 20:51:11 1999
From: jcw@equi4.com (Jean-Claude Wippler)
Date: Thu, 09 Dec 1999 21:51:11 +0100
Subject: [Python-Dev] Virtual filesystem APIs
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us>
 <384FED8D.3C535D38@equi4.com> <199912091821.NAA06209@eric.cnri.reston.va.us>
Message-ID: <3850163F.80BDCB75@equi4.com>

Guido van Rossum wrote:
>
[... horrors of cross-OS mounts and ":\/" separators ...]

I agree, this has some very hairy sides to it.  But VFS is really more
about mounting non-FS things in a "root" FS (presumably the real one).

> On the other hand the VFS concept could be used as a totally different
> solution to the sys.importers vs. sys.path

Heck, I'll be the "enfant terrible" once more: yes, and this stuff could
well be implemented generically across scripting languages.  Of course
the act of "importing" is a very Pythonic issue - but FS/VFS traversal
and the actual shared library load need not be.  Anyway, enough of that.

> Take for example the Windows registry -- looks a lot like a 
> filesystem, doesn't it?  Yet it has one fundamental property that a
> typical FS doesn't: directory nodes can have data *and* children...

What you're saying is that dir = set-of-subdirs + set-of-files, and that
this is a more general requirement than plain FS's.  Doesn't that simply
mean that the more general model is needed as basis to handle both?

> Trees are a universal concept, but code sharing is still elusive...

Ah, but think of the implications: archives, networks, XML, the world!

-- Jean-Claude


From fdrake@acm.org  Thu Dec  9 21:16:00 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 9 Dec 1999 16:16:00 -0500 (EST)
Subject: [Python-Dev] forwarded message from Fred L. Drake
Message-ID: <14416.7184.255000.342231@weyr.cnri.reston.va.us>

--KHBYcjBZ+r
Content-Type: text/plain; charset=us-ascii
Content-Description: message body text
Content-Transfer-Encoding: 7bit


  OK, I've checked in some changes to the posix module to add support
for a few of the POSIX interfaces Andrew expressed interest in seeing
(and some he said weren't such a good idea, or at least not necessary,
but about which I decided I disagreed after all).
  For those of you who aren't on the checkins list (??), I've attached 
the message so you'll know what functions were added.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


--KHBYcjBZ+r
Content-Type: message/rfc822
Content-Description: forwarded message

Received: from cnri.reston.va.us (ns.CNRI.Reston.VA.US [132.151.1.1])
	by weyr.cnri.reston.va.us (8.9.1b+Sun/8.9.1) with SMTP id QAA22917
	for <fdrake@weyr.cnri.reston.va.us>; Thu, 9 Dec 1999 16:13:16 -0500 (EST)
Received: from dinsdale.python.org (dinsdale [132.151.1.21])
	by cnri.reston.va.us (8.9.1a/8.9.1) with ESMTP id QAA01352;
	Thu, 9 Dec 1999 16:12:41 -0500 (EST)
Received: from dinsdale.python.org (dinsdale.python.org [132.151.1.21])
	by dinsdale.python.org (Postfix) with ESMTP
	id 710BB1CE73; Thu,  9 Dec 1999 16:12:39 -0500 (EST)
Delivered-To: python-checkins@dinsdale.python.org
Received: from python.org (parrot.python.org [132.151.1.90])
	by dinsdale.python.org (Postfix) with ESMTP id EA9681CE71
	for <python-checkins@dinsdale.python.org>; Thu,  9 Dec 1999 16:12:37 -0500 (EST)
Received: from cnri.reston.va.us (ns.CNRI.Reston.VA.US [132.151.1.1] (may be forged))
	by python.org (8.9.1a/8.9.1) with ESMTP id QAA14229
	for <python-checkins@python.org>; Thu, 9 Dec 1999 16:12:38 -0500 (EST)
Received: from weyr.cnri.reston.va.us (weyr [132.151.1.174])
	by cnri.reston.va.us (8.9.1a/8.9.1) with ESMTP id QAA01348
	for <python-checkins@python.org>; Thu, 9 Dec 1999 16:12:37 -0500 (EST)
Received: (from fdrake@localhost)
	by weyr.cnri.reston.va.us (8.9.1b+Sun/8.9.1) id QAA22913
	for python-checkins@python.org; Thu, 9 Dec 1999 16:13:10 -0500 (EST)
Message-Id: <199912092113.QAA22913@weyr.cnri.reston.va.us>
Errors-To: python-checkins-admin@python.org
X-BeenThere: python-checkins@python.org
X-Mailman-Version: 1.2 (experimental)
Precedence: bulk
List-Id: Check-in messages from the Python maintainers <python-checkins.python.org>
Content-Length: 1821
From: "Fred L. Drake" <fdrake@weyr.cnri.reston.va.us>
Sender: python-checkins-admin@python.org
To: python-checkins@python.org
Subject: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.115,2.116
Date: Thu, 9 Dec 1999 16:13:10 -0500 (EST)
MIME-Version: 1.0

Update of /projects/cvsroot/python/dist/src/Modules
In directory weyr:/home/fdrake/projects/python/Modules

Modified Files:
	posixmodule.c 
Log Message:

Added support for abort(), ctermid(), tmpfile(), tempnam(), tmpnam(),
and TMP_MAX.

Converted all functions that used PyArg_Parse() or PyArg_NoArgs() to
use PyArg_ParseTuple() and specified all function names using the
:name syntax in the format strings, to allow better error messages
when TypeError is raised for parameter type mismatches.


Index: posixmodule.c
===================================================================
RCS file: /projects/cvsroot/python/dist/src/Modules/posixmodule.c,v
retrieving revision 2.115
retrieving revision 2.116
diff -u -C2 -r2.115 -r2.116
*** posixmodule.c	1999/10/19 13:29:23	2.115
--- posixmodule.c	1999/12/09 21:13:07	2.116
***************
*** 432,442 ****
  
  static PyObject *
! posix_int(args, func)
          PyObject *args;
  	int (*func) Py_FPROTO((int));
  {
  	int fd;
  	int res;
! 	if (!PyArg_Parse(args,  "i", &fd))
  		return NULL;
[...1720 lines suppressed...]
  #endif
+ #ifdef HAVE_TEMPNAM
+ 	{"tempnam",	posix_tempnam, METH_VARARGS, posix_tempnam__doc__},
+ #endif
+ #ifdef HAVE_TMPNAM
+ 	{"tmpnam",	posix_tmpnam, METH_VARARGS, posix_tmpnam__doc__},
+ #endif
+ 	{"abort",	posix_abort, METH_VARARGS, posix_abort__doc__},
  	{NULL,		NULL}		 /* Sentinel */
  };
***************
*** 3426,3429 ****
--- 3586,3592 ----
          if (ins(d, "X_OK", (long)X_OK)) return -1;
  #endif        
+ #ifdef TMP_MAX
+         if (ins(d, "TMP_MAX", (long)TMP_MAX)) return -1;
+ #endif
  #ifdef WNOHANG
          if (ins(d, "WNOHANG", (long)WNOHANG)) return -1;


_______________________________________________
Python-checkins mailing list
Python-checkins@python.org
http://www.python.org/mailman/listinfo/python-checkins


--KHBYcjBZ+r--


From guido@CNRI.Reston.VA.US  Thu Dec  9 21:19:57 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 16:19:57 -0500
Subject: [Python-Dev] forwarded message from Fred L. Drake
In-Reply-To: Your message of "Thu, 09 Dec 1999 16:16:00 EST."
 <14416.7184.255000.342231@weyr.cnri.reston.va.us>
References: <14416.7184.255000.342231@weyr.cnri.reston.va.us>
Message-ID: <199912092119.QAA06731@eric.cnri.reston.va.us>

>   OK, I've checked in some changes to the posix module to add support
> for a few of the POSIX interfaces Andrew expressed interest in seeing
> (and some he said weren't such a good idea, or at least not necessary,
> but about which I decided I disagreed after all).

I wish you'd made your disagreement public before checking it in...
But it's not too late...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin@mems-exchange.org  Thu Dec  9 21:32:26 1999
From: akuchlin@mems-exchange.org (Andrew M. Kuchling)
Date: Thu, 9 Dec 1999 16:32:26 -0500 (EST)
Subject: [Python-Dev] forwarded message from Fred L. Drake
In-Reply-To: <14416.7184.255000.342231@weyr.cnri.reston.va.us>
References: <14416.7184.255000.342231@weyr.cnri.reston.va.us>
Message-ID: <14416.8170.18298.33796@amarok.cnri.reston.va.us>

Fred L. Drake, Jr. writes (in a CVS checkin):
>Added support for abort(), ctermid(), tmpfile(), tempnam(), tmpnam(),
>and TMP_MAX.

For those of you following along, the tmpfile(), tempnam(), tmpnam()
functions were ones I listed as probably not worth adding.  On the
other hand, David Beazley wrote:

>  I think that the POSIX module should strive to be as
>complete as possible--even if certain functions are closely related to
>other functionality in the library (tmpfile for instance).  I suspect

... and that's a good point, too.  The POSIX functions may provide
adaptability that a Python analog doesn't; for example, you could read
/etc/passwd in pure Python, but that wouldn't handle NIS or shadow
passwords.  So I guess I'll vote for completeness over lack of
overlap; leave tmpfile() & friends in.
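
For comparison, the Python-level analog of tmpfile() referred to above
is the tempfile module.  A minimal sketch, assuming a current Python 3
interpreter; the helper name scratch_roundtrip is hypothetical and is
not part of the checkin:

```python
import tempfile


def scratch_roundtrip(payload: bytes) -> bytes:
    """Write payload to an anonymous temporary file and read it back."""
    # TemporaryFile plays the role of C tmpfile(): the file is removed
    # automatically once it is closed.
    with tempfile.TemporaryFile() as tmp:
        tmp.write(payload)
        tmp.seek(0)
        return tmp.read()
```

The overlap is real, which is why the argument for keeping the POSIX
wrappers rests on completeness rather than on new capability.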

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
This supports reflection, which is the 90s way of writing self-modifying code.
    -- John Aycock at IPC7, during his parsing talk


From guido@CNRI.Reston.VA.US  Thu Dec  9 21:38:42 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 16:38:42 -0500
Subject: [Python-Dev] forwarded message from Fred L. Drake
In-Reply-To: Your message of "Thu, 09 Dec 1999 16:32:26 EST."
 <14416.8170.18298.33796@amarok.cnri.reston.va.us>
References: <14416.7184.255000.342231@weyr.cnri.reston.va.us>
 <14416.8170.18298.33796@amarok.cnri.reston.va.us>
Message-ID: <199912092138.QAA06790@eric.cnri.reston.va.us>

> ... and that's a good point, too.  The POSIX functions may provide
> adaptability that a Python analog doesn't; for example, you could read
> /etc/passwd in pure Python, but that wouldn't handle NIS or shadow
> passwords.  So I guess I'll vote for completeness over lack of
> overlap; leave tmpfile() & friends in.

OK, I agree now.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Thu Dec  9 22:30:52 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 9 Dec 1999 17:30:52 -0500 (EST)
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us>
References: <199912091632.LAA09236@amarok.cnri.reston.va.us>
Message-ID: <14416.11676.888918.511932@weyr.cnri.reston.va.us>

Andrew M. Kuchling writes:
 > After poking around in the O'Reilly POSIX book, here's a list of POSIX

  OK, here are my comments on the remainder of these.

 > Worth adding?
 > =============
 > opendir(), readdir(), closedir() -- 
 > 	   most of their functionality is available through
 > 	   os.listdir(), but it might be useful to have a direct
 > 	   interface.  Downside is that this would require a new
 > 	   extension type for the C DIR struct.  My (lazy) inclination
 > 	   is to not bother.

  [rewinddir() and seekdir() should be considered as well, where
supported.]

  There's more tedium than anything in implementing a new C type.  I'm 
a little concerned that there might not be any real value here, but
it's hard to be sure about that.  Is there any real reason not to use
os.listdir()?
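
To make the trade-off concrete, here is a sketch contrasting the
all-at-once os.listdir() with an entry-at-a-time traversal.  It assumes
a current Python 3, where os.scandir() (added long after this thread)
offers iteration in the spirit of opendir()/readdir(); both function
names below are hypothetical:

```python
import os


def names_all_at_once(path="."):
    """Return every entry name in one call, as os.listdir() does."""
    return sorted(os.listdir(path))


def names_one_at_a_time(path="."):
    """Walk the directory entry by entry, readdir()-style."""
    names = []
    # The context manager closes the underlying directory handle,
    # much as closedir() would.
    with os.scandir(path) as entries:
        for entry in entries:
            names.append(entry.name)
    return sorted(names)
```

Both return the same names; the second form only pays off when the
directory is large or the caller wants to stop early.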

 > Worth adding:
 > =============
...
 > fpathconf(fd, name) -- Get configuration limit for a file
 > 	    -- would need constants from unistd.h

  This is mostly a matter of setting up the constants; not hard, just
more distracting than I want to deal with right now.

 > getlogin() -- returns user's login name
 > 	 -- could do something similar with pwd.getpwuid( os.getuid() )[0], but
 > 	 getlogin() apparently looks in utmp

  Per Guido's comments, I'm not sure how valuable it is.  It may make
sense strictly for completeness, but I've never heard of utmp being
considered reliable in any way.  Maybe I'm too new at all this.
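
The two lookups being compared can be sketched as follows, assuming a
Unix system and a current Python 3 (where os.getlogin() is available).
The function names are hypothetical, and both return None rather than
guessing when the lookup fails:

```python
import os
import pwd  # Unix only


def login_from_passwd_db():
    """Login name for the real UID according to the password database."""
    try:
        return pwd.getpwuid(os.getuid())[0]
    except KeyError:
        # The UID has no entry (common in minimal containers).
        return None


def login_from_terminal():
    """Login name as reported by getlogin(), if one can be determined."""
    try:
        return os.getlogin()
    except OSError:
        # Typically raised when there is no controlling terminal.
        return None
```

The two can legitimately disagree, for instance after switching user
IDs, so neither is a drop-in replacement for the other.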

 > getgroups(gidsetsize, grouplist) -- Gets supplementary group IDs

  This should be easy enough.

 > pathconf(path, name) -- Gets config variables for a path
 > 	    -- would need constants from unistd.h

  (Same as for fpathconf().)
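
A sketch of how the constants end up being exposed, assuming a current
Python 3 on Unix, where os.pathconf() and the related os.sysconf()
accept symbolic names listed in os.pathconf_names and os.sysconf_names.
The wrapper names are hypothetical, and which names exist varies by
platform:

```python
import os


def config_limit(path, name):
    """Return the pathconf() value for path, or None if unavailable."""
    if name not in os.pathconf_names:
        return None  # this platform does not define the constant
    try:
        return os.pathconf(path, name)
    except OSError:
        return None


def system_value(name):
    """Return the sysconf() value for name, or None if unavailable."""
    if name not in os.sysconf_names:
        return None
    try:
        return os.sysconf(name)
    except (OSError, ValueError):
        return None
```

For example, config_limit("/", "PC_NAME_MAX") asks for the longest file
name the root filesystem accepts, where that constant is defined.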

 > sysconf(int name) -- Gets system configuration information
 > 	    -- would need constants from unistd.h
 > 
 > Not worth adding:
 > =================

  Aside from the ones I've already added, I agree.  ;)


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From jim@digicool.com  Thu Dec  9 23:31:40 1999
From: jim@digicool.com (Jim Fulton)
Date: Thu, 09 Dec 1999 18:31:40 -0500
Subject: [Python-Dev] Thankyou for fsync :)
Message-ID: <38503BDC.CB91FB29@digicool.com>

I found recently that I needed fsync and was pleasantly surprised
to find that it is provided in the posix module, where available.

Can I count on it staying in the posix module, when available,
for the foreseeable future?
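
For readers who have not used it, a minimal sketch of the usual fsync
pattern, written against a current Python 3; the function name
write_durably is hypothetical:

```python
import os


def write_durably(path, data: bytes):
    """Write data to path and ask the OS to commit it to the device."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # move Python's buffer into the OS
        os.fsync(f.fileno())  # ask the OS to flush its own buffers
```

The flush() call matters: fsync() only covers what has already reached
the operating system.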

Jim

--
Jim Fulton           mailto:jim@digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission.  Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.


From gstein@lyra.org  Fri Dec 10 00:32:33 1999
From: gstein@lyra.org (Greg Stein)
Date: Thu, 9 Dec 1999 16:32:33 -0800 (PST)
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: <14416.11676.888918.511932@weyr.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912091630540.10472-100000@nebula.lyra.org>

On Thu, 9 Dec 1999, Fred L. Drake, Jr. wrote:
> Andrew M. Kuchling writes:
>...
>  > opendir(), readdir(), closedir() -- 
>  > 	   most of their functionality is available through
>  > 	   os.listdir(), but it might be useful to have a direct
>  > 	   interface.  Downside is that this would require a new
>  > 	   extension type for the C DIR struct.  My (lazy) inclination
>  > 	   is to not bother.
> 
>   [rewinddir() and seekdir() should be considered as well, where
> supported.]
> 
>   There's more tedium than anything in implementing a new C type.  I'm 
> a little concerned that there might not be any real value here, but
> it's hard to be sure about that.  Is there any real reason not to use
> os.listdir().

No need to do a new type. Just wrap the DIR* into a PyCObject. Add a magic
number if you're worried about mixing CObjects.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From guido@CNRI.Reston.VA.US  Fri Dec 10 02:03:04 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 21:03:04 -0500
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: Your message of "Thu, 09 Dec 1999 18:31:40 EST."
 <38503BDC.CB91FB29@digicool.com>
References: <38503BDC.CB91FB29@digicool.com>
Message-ID: <199912100203.VAA07410@eric.cnri.reston.va.us>

> I found recently that I needed fsync and was pleasantly surprised
> to find that it is provided in the posix module, where available.
>
> Can I count on it staying in the posix module, when available,
> for the foreseeable future?

Since we seem to be on an adding spree, I don't see why not -- as long
as POSIX keeps it available :)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@mojam.com (Skip Montanaro)  Fri Dec 10 06:28:56 1999
From: skip@mojam.com (Skip Montanaro)
Date: Fri, 10 Dec 1999 00:28:56 -0600 (CST)
Subject: [Python-Dev] posix module test suite
In-Reply-To: <14415.64952.780974.8124@weyr.cnri.reston.va.us>
References: <14415.64952.780974.8124@weyr.cnri.reston.va.us>
Message-ID: <14416.40360.611743.143624@dolphin.mojam.com>

    Fred> There's not a test for the posix or os modules; if anyone would
    Fred> like to contribute one, this would be a good time!  ;-)

Not having ever written any tests for the core Python modules, it seems
natural to ask if there are any guidelines for the construction of such
tests or the test equivalent of the Modules/xxmodule.c file.  Are there
standard behaviors expected for passing and failing a test?
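
The question is not answered in this message.  Purely as an
illustration of what a small os-module test could look like, here is a
sketch using the unittest module as it exists in current Python 3; it
does not claim to reflect the test-suite conventions in force at the
time, and the class and function names are hypothetical:

```python
import os
import tempfile
import unittest


class OsModuleSmokeTest(unittest.TestCase):
    def test_getcwd_is_absolute(self):
        self.assertTrue(os.path.isabs(os.getcwd()))

    def test_listdir_sees_new_file(self):
        with tempfile.TemporaryDirectory() as d:
            open(os.path.join(d, "spam"), "w").close()
            self.assertEqual(os.listdir(d), ["spam"])


def run_smoke_tests():
    """Run the tests above and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(OsModuleSmokeTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Passing and failing are then reported by the runner itself:
run_smoke_tests().wasSuccessful() is true only if every test passed.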

Skip Montanaro | http://www.mojam.com/
skip@mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...


From tim_one@email.msn.com  Fri Dec 10 08:48:59 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Fri, 10 Dec 1999 03:48:59 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <14415.23676.775163.786028@dolphin.mojam.com>
Message-ID: <000501bf42eb$66529860$412d153f@tim>

[Skip Montanaro]
> Alright!  Now I understand what all the hubbub is about!  My eyes have
> mostly been glazing over trying to follow all this Windows
> registry/path/ini stuff.  MS believes that Python is the application.
> Those of us writing Python programs view those programs as the
> applications, not the Python interpreter per se.

Eww -- that's a helpful and insightful way to put it, Skip!  Now maybe *I*
can understand what the hubbub is about <wink>.

> Is there some way that people writing applications in Python can set
> up registry entries that are specific to their application (e.g.
> tabnanny.py) instead of only specific to the Python interpreter?

Yes, but they can't get Python to look at those before it's too late.  I
spent a whole evening a month or two ago just trying to figure out where all
the cruft in my Windows sys.path *came* from.  This is out-of-the-box; I
haven't added anything myself:

['',
 'D:\\Python\\win32',
 'D:\\Python\\win32\\lib',
 'D:\\Python',
 'D:\\Python\\Pythonwin',
 'D:\\Python\\Lib\\plat-win',
 'D:\\Python\\Lib',
 'D:\\Python\\DLLs',
 'D:\\Python\\Lib\\lib-tk',
 'D:\\PYTHON\\DLLs',
 'D:\\PYTHON\\lib',
 'D:\\PYTHON\\lib\\plat-win',
 'D:\\PYTHON\\lib\\lib-tk',
 'D:\\PYTHON']

That's bizarre on the face of it, and tracking it all down was draining.
I've forgotten the details.  I do remember concluding that it was impossible
to do what I wanted to do without changing the implementation, though, and
nobody on Python-Dev disputed that at the time.
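
Part of what makes the list above look bizarre is that its last five
entries repeat earlier ones up to letter case, which Windows ignores.
A small sketch that makes this visible, assuming a current Python 3
(ntpath is used so the comparison follows Windows rules on any
platform); REPORTED and unique_entries are hypothetical names:

```python
import ntpath

# The sys.path quoted above, copied verbatim.
REPORTED = [
    '',
    'D:\\Python\\win32',
    'D:\\Python\\win32\\lib',
    'D:\\Python',
    'D:\\Python\\Pythonwin',
    'D:\\Python\\Lib\\plat-win',
    'D:\\Python\\Lib',
    'D:\\Python\\DLLs',
    'D:\\Python\\Lib\\lib-tk',
    'D:\\PYTHON\\DLLs',
    'D:\\PYTHON\\lib',
    'D:\\PYTHON\\lib\\plat-win',
    'D:\\PYTHON\\lib\\lib-tk',
    'D:\\PYTHON',
]


def unique_entries(paths):
    """Drop entries that differ from an earlier one only by case."""
    seen = set()
    result = []
    for p in paths:
        key = ntpath.normcase(p)  # lower-case, backslash-normalized
        if key not in seen:
            seen.add(key)
            result.append(p)
    return result
```

This only shows which entries are redundant; it says nothing about
which mechanism added each one.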

In a pragmatic crunch, I wrote the little app I needed to distribute at the
time in Perl instead, meaning to come back to this.  I haven't had time.

IIRC, the ultimate problem wasn't really that Python looked at the registry
to get *some* path info, it was a combination of

A) It looked at the registry so early that it was impossible to stop it from
executing whatever site.py the registry pointed at (well, I could with
the -S option -- but then there was no way to get it to do the site.py that
was *wanted* instead).

B) No way to override what was in the registry; e.g., I was greatly
surprised to discover that setting a PYTHONPATH envar didn't override
anything, it simply plunked the PYTHONPATH entries into sys.path along with
everything else -- and too late to stop anything anyway.
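
Point B) can still be observed on current Python releases, where
PYTHONPATH entries are added to sys.path alongside the defaults rather
than replacing them.  A sketch, assuming Python 3.7 or later; the
function name child_sys_path is hypothetical:

```python
import json
import os
import subprocess
import sys


def child_sys_path(extra_dir):
    """Start a child interpreter with PYTHONPATH set; return its sys.path."""
    env = dict(os.environ, PYTHONPATH=extra_dir)
    completed = subprocess.run(
        [sys.executable, "-c",
         "import json, sys; print(json.dumps(sys.path))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return json.loads(completed.stdout)
```

The returned list contains extra_dir plus the interpreter's usual
entries, which is the "plunked in along with everything else" behavior
described above.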

In a long msg I haven't yet read all the way thru, Guido at least suggested
associating different registry path info with different Python versions.
That would address a number of otherwise currently intractable problems.

I suspect it still wouldn't help with the problem I was facing, though.
That is, I wanted to be able to tell people to run

\\dragres01\mrec\reduce\python \\dragres01\mrec\reduce\reduce.py

which is just a Windows way of saying "run a Python executable from a shared
network location".  When they tried that, though, the network Python looked
in *their* individual registries for its Python path info, and some of the
hackers with mondo customized Python setups on their own machines watched
things go down in flames.

This certainly can't be a common problem, but it speaks to an unforgiving
rigidity in the current approach.  There seemed to be nothing I could do to
guarantee this would work, short of telling users to edit their registries
before running this tool (that's a non-starter on Windows -- editing the
registry is dangerous) or putting a customized Python on the network
pointing to a bogus registry key (it was faster to write the app in Perl!
Perl doesn't *try* to be so infernally helpful <wink>, so doesn't get in the
way either).

I'm left wondering what purpose putting Python library path info into the
Windows registry serves.  Is there anyone on Windows who *doesn't* have
their Python Lib/ etc as direct subdirectories of the directory containing
python.exe?  Not that I've seen.  Python puts *those* in sys.path too -- but
only after it (in the normal case; see my sys.path above) pulls identically
redundant paths out of the registry first, or (in the cases we're griping
about) pulls irrelevant or downright harmful paths out of the registry first
(paths appropriate to the last Python you *installed*, not to the Python
that's *running*!).

Perhaps all this cruft is needed to support embedded Python, though
(something I've never done).

Regardless, I expect it would have been enough for me if PYTHONPATH simply
worked the way I mistakenly assumed it would (that is, this is sys.path, and
that's *it*; feel free to prepend the current directory when initialization
is complete, but before then looking at any file not reached from PYTHONPATH
is verboten).

the-cleverer-the-code-the-more-vital-that-there-be-a-way-to-
    short-circuit-it-ly y'rs  - tim


From jim@interet.com  Fri Dec 10 12:16:31 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Fri, 10 Dec 1999 07:16:31 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000501bf42eb$66529860$412d153f@tim>
Message-ID: <3850EF1F.158445B6@interet.com>

Tim Peters wrote:
> 
> [Skip Montanaro]
> > Is there some way that people writing applications in Python can set
> 
> Yes, but they can't get Python to look at those before it's too late.  I
> spent a whole evening a month or two ago just trying to figure out where all
> the cruft in my Windows sys.path *came* from.  This is out-of-the-box; I
> .....

Excellent discussion Tim!

> I suspect it still wouldn't help with the problem I was facing, though.
> That is, I wanted to be able to tell people to run
> 
> \\dragres01\mrec\reduce\python \\dragres01\mrec\reduce\reduce.py
> 
> which is just a Windows way of saying "run a Python executable from a shared
> network location".  When they tried that, though, the network Python looked
> in *their* individual registries for its Python path info, and some of the
> hackers with mondo customized Python setups on their own machines watched
> things go down in flames.

I think a sensible way to run little apps is to put everything
in an archive file including the main.py.  On Windows you
concatenate that to python.exe, and it Just Works.
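
A related mechanism exists in current Python: the interpreter can run a
zip archive directly if it contains a __main__.py.  The sketch below
shows that modern mechanism, not the append-to-python.exe scheme
described here, and the names build_and_run and app.zip are
hypothetical:

```python
import os
import subprocess
import sys
import tempfile
import zipfile


def build_and_run(main_source):
    """Pack main_source as __main__.py in a zip and execute the zip."""
    with tempfile.TemporaryDirectory() as d:
        archive = os.path.join(d, "app.zip")
        with zipfile.ZipFile(archive, "w") as zf:
            zf.writestr("__main__.py", main_source)
        # Passing the archive as the script argument makes Python
        # import and run its __main__.py.
        result = subprocess.run([sys.executable, archive],
                                capture_output=True, text=True, check=True)
        return result.stdout
```

Other modules placed in the same archive become importable from
__main__.py, since the archive itself is put on sys.path.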

> Windows registry serves.  Is there anyone on Windows who *doesn't* have
> their Python Lib/ etc as direct subdirectories of the directory containing
> python.exe?  Not that I've seen.

Point on the curve.  We don't.  We freeze everything except the main.py.

JimA


From jim@interet.com  Fri Dec 10 13:38:28 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Fri, 10 Dec 1999 08:38:28 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com>
Message-ID: <38510254.ED15D32B@interet.com>

Jean-Claude Wippler wrote:

> Ouch - what's wrong with zip archives?

> With all due respect - I sincerely hope you will reconsider and alter
> your code to work with zip files.  It's probably a small adjustment?

OK, you talked me into it.  Ya, small adjustment, no problem ;-)

JimA


From jack@oratrix.nl  Fri Dec 10 13:51:10 1999
From: jack@oratrix.nl (Jack Jansen)
Date: Fri, 10 Dec 1999 14:51:10 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: Message by "James C. Ahlstrom" <jim@interet.com> ,
 Thu, 09 Dec 1999 12:43:57 -0500 , <384FEA5D.A07F23EC@interet.com>
Message-ID: <19991210135111.2F83C370CF2@snelboot.oratrix.nl>

Is it possible nowadays to have two files with the same name but different 
paths (i.e. foo/bar.py and foo/spam/bar.py) in the same archive?

That's the one thing that always struck me as very very silly about zipfiles.
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From gmcm@hypernet.com  Fri Dec 10 14:28:51 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Fri, 10 Dec 1999 09:28:51 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: <19991210135111.2F83C370CF2@snelboot.oratrix.nl>
References: Message by "James C. Ahlstrom" <jim@interet.com> ,	     Thu, 09 Dec 1999 12:43:57 -0500 , <384FEA5D.A07F23EC@interet.com>
Message-ID: <1267287023-386248@hypernet.com>

Jack Jansen asks:

> Is it possible nowadays to have two files with the same name but
> different paths (i.e. foo/bar.py and foo/spam/bar.py) in the same
> archive?

Depends on how you do it.

If the user imports foo.spam.bar, an importer will be asked for:
  foo (return foo.__init__)
  foo.spam (return foo.spam.__init__)
  foo.spam.bar (return foo.spam.bar)

But the API allows lots of variations. This is another possible 
interaction:
  foo (return None)
  foo.__init__ (return foo.__init__)
  foo.spam (return None)
  foo.spam.__init__ (return foo.spam.__init__)
  foo.spam.bar (return foo.spam.bar)

Or, by looking at different args to get_code, you could look at 
the requests as:
  foo in context of None
  spam in context of foo
  bar in context of foo.spam
 
With another variation where the request for __init__ becomes 
explicit.

The first way seems the natural way for archives, and makes it 
easy to keep foo.spam.bar distinct from foo.bar.
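
The first interaction, where the importer is asked for each dotted
prefix in turn, can be sketched like this.  Both helpers are
hypothetical illustrations and are not part of the importer API under
discussion; the mapping to archive member names is an assumption about
how an archive importer might store packages:

```python
def import_requests(dotted_name):
    """List the names an importer is asked for, outermost first."""
    parts = dotted_name.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def archive_member(module_name, is_package):
    """Map a fully qualified module name to a path inside an archive."""
    path = module_name.replace(".", "/")
    return path + ("/__init__.py" if is_package else ".py")
```

Because the lookup key is the full dotted name, foo/bar.py and
foo/spam/bar.py map to different members and cannot collide.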

> That's the one thing that always struck me as very very silly
> about zipfiles.

Huh?

- Gordon


From guido@CNRI.Reston.VA.US  Fri Dec 10 14:51:39 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 10 Dec 1999 09:51:39 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: Your message of "Fri, 10 Dec 1999 14:51:10 +0100."
 <19991210135111.2F83C370CF2@snelboot.oratrix.nl>
References: <19991210135111.2F83C370CF2@snelboot.oratrix.nl>
Message-ID: <199912101451.JAA07786@eric.cnri.reston.va.us>

> Is it possible nowadays to have two files with the same name but different 
> paths (i.e. foo/bar.py and foo/spam/bar.py) in the same archive?
> 
> That's the one thing that always struck me as very very silly about zipfiles.

Zip files contain the full path, there's no problem with that.  Was
there ever?
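
This is easy to check with the zipfile module in current Python 3.  A
minimal sketch that stores two members sharing a base name in one
in-memory archive; the function name is hypothetical:

```python
import io
import zipfile


def archive_with_same_basenames():
    """Store foo/bar.py and foo/spam/bar.py in one archive; read back."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("foo/bar.py", "# top-level bar\n")
        zf.writestr("foo/spam/bar.py", "# nested bar\n")
    # Reopen for reading: members are keyed by their full path.
    with zipfile.ZipFile(buf) as zf:
        names = zf.namelist()
        nested = zf.read("foo/spam/bar.py").decode()
    return names, nested
```

Both members survive the round trip, each under its own full path.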

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jack@oratrix.nl  Fri Dec 10 14:52:26 1999
From: jack@oratrix.nl (Jack Jansen)
Date: Fri, 10 Dec 1999 15:52:26 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy
 )
In-Reply-To: Message by "Gordon McMillan" <gmcm@hypernet.com> ,
 Fri, 10 Dec 1999 09:28:51 -0500 , <1267287023-386248@hypernet.com>
Message-ID: <19991210145227.01F99370CF2@snelboot.oratrix.nl>

> Jack Jansen asks:
> 
> > Is it possible nowadays to have two files with the same name but
> > different paths (i.e. foo/bar.py and foo/spam/bar.py) in the same
> > archive?
> 
> Depends on how you do it.

Apparently I mis-phrased my question, I'll try again.

When people suggested to use zip format as the standard Python archive format 
I was a bit worried, because I've had it happen to me various times that I was
unable to create a ZIP archive with two files with the same name but different 
paths (i.e. create an archive of a directory that contains both a foo/bar.py 
and a foo/spam/bar.py).

So, my question was: has this happened to me because the winzip I used was 
braindead, or is there possibly a problem with the ZIP file format that 
disallows two files with the same name in one archive? Most zip programs I've 
seen also seem to present filenames as the primary metaphor, with full
pathnames somewhat "tacked on".

If the latter is the case I wonder whether zip is the right format to use...
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From guido@CNRI.Reston.VA.US  Fri Dec 10 15:00:51 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 10 Dec 1999 10:00:51 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: Your message of "Fri, 10 Dec 1999 15:52:26 +0100."
 <19991210145227.01F99370CF2@snelboot.oratrix.nl>
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl>
Message-ID: <199912101500.KAA07863@eric.cnri.reston.va.us>

Again, the zip format does not have this problem.  Some zip tools may
-- then we don't use those.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Fri Dec 10 15:40:21 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 10 Dec 1999 10:40:21 -0500 (EST)
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: <Pine.LNX.4.10.9912091630540.10472-100000@nebula.lyra.org>
References: <14416.11676.888918.511932@weyr.cnri.reston.va.us>
 <Pine.LNX.4.10.9912091630540.10472-100000@nebula.lyra.org>
Message-ID: <14417.7909.511437.230915@weyr.cnri.reston.va.us>

Greg Stein writes:
 > No need to do a new type. Just wrap the DIR* into a PyCObject. Add a magic
 > number if you're worried about mixing CObjects.

  That's certainly one option, but I would have made readdir(),
seekdir(), rewinddir() and closedir() into the methods read(), seek(), 
rewind() and close().  So it's a question of what interface you
prefer; functions with magically interpreted token parameters (kind of 
like file descriptors, hey!), or something that is more recognizably
object-oriented.
  I know my preference.  ;-)
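
The object-style interface preferred here might look roughly like the
following pure-Python sketch.  It is hypothetical throughout: the class
name Directory is invented, and it works from an os.listdir() snapshot
rather than wrapping a real DIR*, so it only illustrates the shape of
the API:

```python
import os


class Directory:
    """Directory handle with read(), seek(), rewind() and close()."""

    def __init__(self, path):
        self._names = os.listdir(path)  # snapshot, unlike opendir()
        self._pos = 0
        self._closed = False

    def _check_open(self):
        if self._closed:
            raise ValueError("operation on closed directory")

    def read(self):
        """Return the next entry name, or None at the end."""
        self._check_open()
        if self._pos >= len(self._names):
            return None
        name = self._names[self._pos]
        self._pos += 1
        return name

    def seek(self, pos):
        """Move to an absolute position in the listing."""
        self._check_open()
        self._pos = max(0, min(pos, len(self._names)))

    def rewind(self):
        self.seek(0)

    def close(self):
        self._closed = True
```

Compared with passing an opaque token to module-level functions, the
state and the operations on it live in one object.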


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From mal@lemburg.com  Fri Dec 10 15:55:02 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 10 Dec 1999 16:55:02 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl>
Message-ID: <38512256.F9287E24@lemburg.com>

Jack Jansen wrote:
> 
> > Jack Jansen asks:
> >
> > > Is it possible nowadays to have two files with the same name but
> > > different paths (i.e. foo/bar.py and foo/spam/bar.py) in the same
> > > archive?
> >
> > Depends on how you do it.
> 
> Apparently I mis-phrased my question, I'll try again.
> 
> When people suggested to use zip format as the standard Python archive format
> I was a bit worried, because I've had it happen to me various times that I was
> unable to create a ZIP archive with two files with the same name but different
> paths (i.e. create an archive of a directory that contains both a foo/bar.py
> and a foo/spam/bar.py).
> 
> So, my question was: has this happened to me because the winzip I used was
> braindead, or is there possibly a problem with the ZIP file format that
> disallows two files with the same name in one archive? Most zip programs I've
> seen also seem to present filenames as the primary metaphor, with full
> pathnames somewhat "tacked on".
> 
> If the latter is the case I wonder whether zip is the right format to use...

Hmm, I've been doing the above for years now... never had a problem
with it (I use Info-ZIP's tools, BTW), e.g.

/home/lemburg> unzip -l projects/distribution/mxODBC-1.1.1.zip 
Archive:  projects/distribution/mxODBC-1.1.1.zip
  Length     Date   Time    Name
 --------    ----   ----    ----
   131316  06-09-99 14:10   ODBC/EasySoft/mxODBC.c
   131316  06-09-99 14:10   ODBC/Informix/mxODBC.c
   ...

Would be cool if I could use my packages as ZIP files :-) So
here's another vote for using the ZIP format.

BTW, wouldn't it make sense to include the zlib code
in the core distribution much like the pcre stuff is now ?
AFAIK, it is public domain and including it would remedy many of the
compatibility issues with the different zlib versions around.

Ok, that was my wish for Xmas :-) The rest is up to Mr. Clause...

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    21 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido@CNRI.Reston.VA.US  Fri Dec 10 16:04:24 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 10 Dec 1999 11:04:24 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: Your message of "Fri, 10 Dec 1999 16:55:02 +0100."
 <38512256.F9287E24@lemburg.com>
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl>
 <38512256.F9287E24@lemburg.com>
Message-ID: <199912101604.LAA14100@eric.cnri.reston.va.us>

> BTW, wouldn't it make sense to include the zlib code
> in the core distribution much like the pcre stuff is now ?
> AFAIK, it is public domain and including it would remedy many of the
> compatibility issues with the different zlib versions around.

What compatibility issues?  Note that the Win32 distri already comes
with zlib statically linked into zlib.pyd.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Fri Dec 10 16:15:48 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 10 Dec 1999 17:15:48 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl>
 <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us>
Message-ID: <38512734.CF6E4489@lemburg.com>

Guido van Rossum wrote:
> 
> > BTW, wouldn't it make sense to include the zlib code
> > in the core distribution much like the pcre stuff is now ?
> > AFAIK, it is public domain and including it would remedy many of the
> > compatibility issues with the different zlib versions around.
> 
> What compatibility issues?  Note that the Win32 distri already comes
> with zlib statically linked into zlib.pyd.

There were issues with zlib 1.0.4 and later ones. Also, many
Linux distributions don't have the zlib header files installed.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    21 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido@CNRI.Reston.VA.US  Fri Dec 10 16:19:47 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 10 Dec 1999 11:19:47 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: Your message of "Fri, 10 Dec 1999 17:15:48 +0100."
 <38512734.CF6E4489@lemburg.com>
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us>
 <38512734.CF6E4489@lemburg.com>
Message-ID: <199912101619.LAA14174@eric.cnri.reston.va.us>

> There were issues with zlib 1.0.4 and later ones. Also, many
> Linux distributions don't have the zlib header files installed.

Hm.  I don't recall having any problems reported to me.  I'd rather
not include the entire zlib distri in the Python distri -- zlib
is rather big.  Adding only the Unix source would be cheating.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@CNRI.Reston.VA.US  Fri Dec 10 16:25:23 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 10 Dec 1999 11:25:23 -0500
Subject: [Python-Dev] dbm clone with serious specs wanted
Message-ID: <199912101625.LAA14216@eric.cnri.reston.va.us>

Someone has asked me for a dbm clone that can store 16M keys of 350
bytes each, and runs on Linux, HPUX, and NT.  That's 5.6 Gigabyte in
keys alone!  I presume most classic approaches won't cut it since
total file size is typically limited by the seek system call, internal
data structures and/or file index format to 2Gb (signed longs) or 4Gb
(unsigned longs).

Does anyone have an idea where to start looking?  Would a Python
extension already exist?
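
The arithmetic behind those figures can be checked with a short sketch.
It assumes decimal units (1 GB = 10**9 bytes) and 32-bit longs; the
helper name `fits` is made up for illustration.

```python
# Back-of-the-envelope check of the sizes quoted above.
KEYS = 16 * 10**6        # "16M keys"
KEY_SIZE = 350           # bytes per key

total_key_bytes = KEYS * KEY_SIZE

# Largest file offset representable in a 32-bit signed / unsigned long.
SIGNED_32_MAX = 2**31 - 1
UNSIGNED_32_MAX = 2**32 - 1


def fits(total, limit):
    """Return True if a file of `total` bytes is addressable under `limit`."""
    return total <= limit


print(total_key_bytes / 10**9)                 # key data in GB
print(fits(total_key_bytes, SIGNED_32_MAX))    # 2Gb-style limit
print(fits(total_key_bytes, UNSIGNED_32_MAX))  # 4Gb-style limit
```

Both checks come out false, which is why a format with 32-bit offsets
cannot hold even the keys, before any values or index overhead.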

--Guido van Rossum (home page: http://www.python.org/~guido/)


From petrilli@amber.org  Fri Dec 10 16:29:27 1999
From: petrilli@amber.org (Christopher Petrilli)
Date: Fri, 10 Dec 1999 11:29:27 -0500
Subject: [Python-Dev] dbm clone with serious specs wanted
In-Reply-To: <199912101625.LAA14216@eric.cnri.reston.va.us>; from guido@CNRI.Reston.VA.US on Fri, Dec 10, 1999 at 11:25:23AM -0500
References: <199912101625.LAA14216@eric.cnri.reston.va.us>
Message-ID: <19991210112927.A14102@trump.amber.org>

Guido van Rossum [guido@CNRI.Reston.VA.US] wrote:
> Someone has asked me for a dbm clone that can store 16M keys of 350
> bytes each, and runs on Linux, HPUX, and NT.  That's 5.6 Gigabyte in
> keys alone!  I presume most classic approaches won't cut it since
> total file size is typically limited by the seek system call, internal
> data structures and/or file index format to 2Gb (signed longs) or 4Gb
> (unsigned longs).
> 
> Does anyone have an idea where to start looking?  Would a Python
> extension already exist?

Assuming you mean a dbm-style interface, you could easily
use Berkeley DB; I believe it is limited to somewhere in the 4TB range...

Chris
-- 
| Christopher Petrilli
| petrilli@amber.org


From mal@lemburg.com  Fri Dec 10 16:26:10 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 10 Dec 1999 17:26:10 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us>
 <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us>
Message-ID: <385129A2.6FAF4E81@lemburg.com>

Guido van Rossum wrote:
> 
> > There were issues with zlib 1.0.4 and later ones. Also, many
> > Linux distributions don't have the zlib header files installed.
> 
> Hm.  I don't recall having any problems reported to me.  I'd rather
> not include the entire zlib distri in the Python distri -- zlib
> is rather big.  Adding only the Unix source would be cheating.

How about only adding those parts which would be needed to
at least deflate the ZIP archive contents ?

If the ZIP archive format becomes the standard for Python, we'd
have to ensure that all Python users can read them. Well, at
least that's what I would expect from a standard format :-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    21 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido@CNRI.Reston.VA.US  Fri Dec 10 16:29:36 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 10 Dec 1999 11:29:36 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: Your message of "Fri, 10 Dec 1999 17:26:10 +0100."
 <385129A2.6FAF4E81@lemburg.com>
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us>
 <385129A2.6FAF4E81@lemburg.com>
Message-ID: <199912101629.LAA14274@eric.cnri.reston.va.us>

> How about only adding those parts which would be needed to
> at least deflate the ZIP archive contents ?

Ditto -- still lots of portability issues I bet.

> If the ZIP archive format becomes the standard for Python, we'd
> have to ensure that all Python users can read them. Well, at
> least that's what I would expect from a standard format :-)

There's a simple solution: don't use compression.  With current disk
prices it's really not worth it.  Let the installer do the
decompression (installers travel across networks where compression
*is* worth it).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin@mems-exchange.org  Fri Dec 10 16:34:09 1999
From: akuchlin@mems-exchange.org (Andrew M. Kuchling)
Date: Fri, 10 Dec 1999 11:34:09 -0500 (EST)
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: <38512734.CF6E4489@lemburg.com>
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl>
 <38512256.F9287E24@lemburg.com>
 <199912101604.LAA14100@eric.cnri.reston.va.us>
 <38512734.CF6E4489@lemburg.com>
Message-ID: <14417.11137.562474.99270@amarok.cnri.reston.va.us>

M.-A. Lemburg writes:
>There were issues with zlib 1.0.4 and later ones. Also, many
>Linux distributions don't have the zlib header files installed.

For example, on RH6.0, zlib.h and zlib.a are in zlib-devel.XXX.rpm,
and zlib.XXX.rpm only contains libz.so.  On the other hand, anyone
who's compiling Python should really have the various -devel RPMs
installed.  I'd argue against including it, because it might cause odd
versioning problems.  For example, what if I have PIL compiled against
zlib1.1.2 (zlib is used for writing PNGs) and the Python binary
includes zlib1.1.3?  There might be hard-to-debug problems
caused by calling the wrong symbol.

PCRE is a special case, because we've actually hacked the code a lot;
it's not the PCRE code as Philip Hazel distributes it.

Just received Guido's email suggesting skipping compression in
archives; not a bad idea.  You'd use less CPU, but might do
more I/O because you're reading more sectors off disk.  There
probably isn't much need for compression when the archive is on-disk;
Java needed it because of applets.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
The NSA response was, "Well, that was interesting, but there aren't any
ciphers like that."
    -- Gus Simmons, "The History of Subliminal Channels"


From petrilli@amber.org  Fri Dec 10 16:39:44 1999
From: petrilli@amber.org (Christopher Petrilli)
Date: Fri, 10 Dec 1999 11:39:44 -0500
Subject: [Python-Dev] dbm clone with serious specs wanted
In-Reply-To: <19991210112927.A14102@trump.amber.org>; from petrilli@amber.org on Fri, Dec 10, 1999 at 11:29:27AM -0500
References: <199912101625.LAA14216@eric.cnri.reston.va.us> <19991210112927.A14102@trump.amber.org>
Message-ID: <19991210113944.B14102@trump.amber.org>

Christopher Petrilli [petrilli@amber.org] wrote:
> Guido van Rossum [guido@CNRI.Reston.VA.US] wrote:
> > Does anyone have an idea where to start looking?  Would a Python
> > extension already exist?
> 
> Assuming you mean a dbm-style interface, you could easily
> use Berkeley DB; I believe it is limited to somewhere in the 4TB range...

I just did some checking... first Robin Dunn has an interface, but it's not
currently compatible with BerkeleyDB 3.x, which just came out... it shouldn't
be hard to retrofit.  Anyway, the limits are based on page size...

	512b page:	2TB
	64K page:	256TB

It uses 32-bit numbers for pages, so I assume that is also a reflection
of the number of keys allowed... given that I believe one key must use a
minimum of one page.
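
The two limits in the table follow from multiplying the number of
addressable pages by the page size. A minimal sketch, assuming exactly
2**32 pages and binary units (1 TB = 2**40 bytes); the function name is
invented for illustration and is not part of any Berkeley DB API:

```python
# Maximum database size when pages are numbered with a 32-bit integer.
PAGE_NUMBER_BITS = 32
TB = 2 ** 40


def max_db_bytes(page_size):
    """Upper bound on file size: number of pages times bytes per page."""
    return (2 ** PAGE_NUMBER_BITS) * page_size


for page_size in (512, 64 * 1024):
    print(page_size, max_db_bytes(page_size) // TB, "TB")
# 512 2 TB
# 65536 256 TB
```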

I know that I've pushed earlier releases to around 50Gb without trouble,
but you might see issues related to the number of keys.  I'd ask Sleepycat
directly, as they're amazingly responsive.

Chris
-- 
| Christopher Petrilli
| petrilli@amber.org


From mal@lemburg.com  Fri Dec 10 16:37:30 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 10 Dec 1999 17:37:30 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us>
 <385129A2.6FAF4E81@lemburg.com> <199912101629.LAA14274@eric.cnri.reston.va.us>
Message-ID: <38512C4A.ADB63C2B@lemburg.com>

Guido van Rossum wrote:
> 
> > How about only adding those parts which would be needed to
> > at least deflate the ZIP archive contents ?
> 
> Ditto -- still lots of portability issues I bet.

Hmm, not sure: zlib is pretty portable. It's the interface
changes that can break code, not so much the zlib portability.
 
> > If the ZIP archive format becomes the standard for Python, we'd
> > have to ensure that all Python users can read them. Well, at
> > least that's what I would expect from a standard format :-)
> 
> There's a simple solution: don't use compression.  With current disk
> prices it's really not worth it.  Let the installer do the
> decompression (installers travel across networks where compression
> *is* worth it).

That's a possibility, right. It would still let us use the many
ZIP tools while not adding complexity to the core.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    21 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Fri Dec 10 16:43:11 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 10 Dec 1999 17:43:11 +0100
Subject: [Python-Dev] dbm clone with serious specs wanted
References: <199912101625.LAA14216@eric.cnri.reston.va.us>
Message-ID: <38512D9F.2AE9DC8B@lemburg.com>

Guido van Rossum wrote:
> 
> Someone has asked me for a dbm clone that can store 16M keys of 350
> bytes each, and runs on Linux, HPUX, and NT.  That's 5.6 Gigabyte in
> keys alone!  I presume most classic approaches won't cut it since
> total file size is typically limited by the seek system call, internal
> data structures and/or file index format to 2Gb (signed longs) or 4Gb
> (unsigned longs).
> 
> Does anyone have an idea where to start looking?  Would a Python
> extension already exist?

I'd suggest using a dbm style wrapper around the DB-API and then
trying out the many cross-platform databases. IBM DB2 comes to
mind... it can certainly handle these sizes given the right
hardware.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    21 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From fdrake@acm.org  Fri Dec 10 17:35:01 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 10 Dec 1999 12:35:01 -0500 (EST)
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: <199912100203.VAA07410@eric.cnri.reston.va.us>
References: <38503BDC.CB91FB29@digicool.com>
 <199912100203.VAA07410@eric.cnri.reston.va.us>
Message-ID: <14417.14789.306365.439782@weyr.cnri.reston.va.us>

Guido van Rossum writes:
 > Since we seem to be on an adding spree, I don't see why not -- as long
 > as POSIX keeps it available :)

  fsync() isn't listed in O'Reilly's POSIX book, so it's probably not
in the POSIX spec.  Neither is the tempnam() function I added in
yesterday's spree, though tmpfile() and tmpnam() are.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From jim@digicool.com  Fri Dec 10 18:37:53 1999
From: jim@digicool.com (Jim Fulton)
Date: Fri, 10 Dec 1999 18:37:53 +0000
Subject: [Python-Dev] Thankyou for fsync :)
References: <38503BDC.CB91FB29@digicool.com>
 <199912100203.VAA07410@eric.cnri.reston.va.us> <14417.14789.306365.439782@weyr.cnri.reston.va.us>
Message-ID: <38514881.5C124E36@digicool.com>

"Fred L. Drake, Jr." wrote:
> 
> Guido van Rossum writes:
>  > Since we seem to be on an adding spree, I don't see why not -- as long
>  > as POSIX keeps it available :)
> 
>   fsync() isn't listed in O'Reilly's POSIX book, so it's probably not
> in the POSIX spec. 

It's not, it's in XPG3 (sp?), but I wasn't going to bring that up. ;)

I'd still like it to stay, where available. :)
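
For readers wondering what "where available" looks like in practice,
here is a small sketch. It assumes the call is exposed as `os.fsync()`
and guards with `hasattr` for platforms that lack it; `write_durably`
is a made-up helper name.

```python
import os
import tempfile


def write_durably(path, data):
    """Write bytes to path and ask the OS to push them to disk if it can."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()                  # move data from Python's buffer to the OS
        if hasattr(os, "fsync"):   # only where the platform provides it
            os.fsync(f.fileno())


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "journal.dat")
    write_durably(target, b"committed\n")
    with open(target, "rb") as f:
        print(f.read())
```

The flush comes first because fsync() only sees data that has already
left the process.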

Jim

--
Jim Fulton           mailto:jim@digicool.com
Technical Director   (888) 344-4332              Python Powered!
Digital Creations    http://www.digicool.com     http://www.python.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission.  Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.


From fdrake@acm.org  Fri Dec 10 18:36:44 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 10 Dec 1999 13:36:44 -0500 (EST)
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: <38514881.5C124E36@digicool.com>
References: <38503BDC.CB91FB29@digicool.com>
 <199912100203.VAA07410@eric.cnri.reston.va.us>
 <14417.14789.306365.439782@weyr.cnri.reston.va.us>
 <38514881.5C124E36@digicool.com>
Message-ID: <14417.18492.932392.608912@weyr.cnri.reston.va.us>

Jim Fulton writes:
 > It's not, it's in XPG3 (sp?), but I wasn't going to bring that up. ;)

  I don't have that one, but I certainly don't have any plans on
ripping out fsync().  Not today, at any rate.  ;)


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From jim@interet.com  Fri Dec 10 18:37:50 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Fri, 10 Dec 1999 13:37:50 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210135111.2F83C370CF2@snelboot.oratrix.nl>
Message-ID: <3851487E.F610BE17@interet.com>

Jack Jansen wrote:
> 
> Is it possible nowadays to have two files with the same name but different
> paths (i.e. foo/bar.py and foo/spam/bar.py) in the same archive?

Yes, I just made one with WinZip.

JimA


From gmcm@hypernet.com  Fri Dec 10 18:41:56 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Fri, 10 Dec 1999 13:41:56 -0500
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: <38514881.5C124E36@digicool.com>
Message-ID: <1267271840-1299809@hypernet.com>

Fred L. Drake, Jr. wrote:
> 
> Guido van Rossum writes:
>  > Since we seem to be on an adding spree, I don't see why not
>  > -- as long as POSIX keeps it available :)
> 
>   fsync() isn't listed in O'Reilly's POSIX book, so it's
>   probably not
> in the POSIX spec. 
> 

It's in the other O'Reilly POSIX book, p 348 of POSIX.4.

- Gordon


From fdrake@acm.org  Fri Dec 10 18:43:56 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 10 Dec 1999 13:43:56 -0500 (EST)
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: <1267271840-1299809@hypernet.com>
References: <38514881.5C124E36@digicool.com>
 <1267271840-1299809@hypernet.com>
Message-ID: <14417.18924.461115.906914@weyr.cnri.reston.va.us>

Gordon McMillan writes:
 > It's in the other O'Reilly POSIX book, p 348 of POSIX.4.

  Ah, I don't have that either.  I thought POSIX.4 was real-time
stuff.
  (If anyone wants to send a copy along, I'd be glad to consider
adding reasonable interfaces for Python. ;)


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From jim@interet.com  Fri Dec 10 18:43:18 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Fri, 10 Dec 1999 13:43:18 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl>
Message-ID: <385149C6.DF942F36@interet.com>

Jack Jansen wrote:

> When people suggested to use zip format as the standard Python archive format
> I was a bit worried, because I've had it happen to me various times that I was
> unable to create a ZIP archive with two files with the same name but different
> paths (i.e. create an archive of a directory that contains both a foo/bar.py
> and a foo/spam/bar.py).

No problem.

But most zip tools will create an archive with either no
path (file name is "bar.py") or the full path (file name is "foo/bar.py").
If the paths are different it is OK; I'm not sure about duplicate bare
names.  The difference is an option and has nothing to do with how the
file name is specified to the utility.

JimA


From jim@interet.com  Fri Dec 10 18:48:47 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Fri, 10 Dec 1999 13:48:47 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us>
 <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <385129A2.6FAF4E81@lemburg.com>
Message-ID: <38514B0F.84A546C6@interet.com>

"M.-A. Lemburg" wrote:

> How about only adding those parts which would be needed to
> at least deflate the ZIP archive contents ?
> 
> If the ZIP archive format becomes the standard for Python, we'd
> have to ensure that all Python users can read them. Well, at
> least that's what I would expect from a standard format :-)

I think that for now we will need to create archives with
compression method zero: no compression.  That is a valid
compression method all ZIP utilities support.  The point is that
zlib just isn't part of Python.
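
As an illustration of method zero, the sketch below uses the `zipfile`
module found in later Python releases (it is not something this thread
could rely on). The helper name and the file contents are invented.

```python
import io
import zipfile


def build_stored_archive(files):
    """Return the bytes of a ZIP archive whose members are all uncompressed."""
    buf = io.BytesIO()
    # ZIP_STORED is compression method 0: data is copied in verbatim.
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


archive = build_stored_archive({
    "foo/bar.py": "x = 1\n",
    "foo/spam/bar.py": "x = 2\n",
})

with zipfile.ZipFile(io.BytesIO(archive)) as zf:
    for info in zf.infolist():
        print(info.filename, info.compress_type)
```

Since nothing is compressed, reading such an archive needs no zlib at
all, and two members with the same base name but different paths
coexist without trouble.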

Jim


From jcw@equi4.com  Fri Dec 10 18:57:00 1999
From: jcw@equi4.com (Jean-Claude Wippler)
Date: Fri, 10 Dec 1999 19:57:00 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us>
 <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <385129A2.6FAF4E81@lemburg.com> <38514B0F.84A546C6@interet.com>
Message-ID: <38514CFC.47C8A8E0@equi4.com>

"James C. Ahlstrom" wrote:
[...]
> I think that for now we will need to create archives with
> compression method zero: no compression.  That is a valid
> compression method all ZIP utilities support.

Sounds good.  This is also exactly how Java started out with jar.

-jcw


From gmcm@hypernet.com  Fri Dec 10 19:06:59 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Fri, 10 Dec 1999 14:06:59 -0500
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: <14417.18924.461115.906914@weyr.cnri.reston.va.us>
References: <1267271840-1299809@hypernet.com>
Message-ID: <1267270337-1390160@hypernet.com>

Fred wrote:
 
> Gordon McMillan writes:
>  > It's in the other O'Reilly POSIX book, p 348 of POSIX.4.
> 
>   Ah, I don't have that either.  I thought POSIX.4 was real-time
> stuff.

Well, it says it is, but having done some stuff with automated 
warehouses, I'm always amazed at how people will use the 
term "real-time". I'd say "pretty likely to be responsive" ;-).

>   (If anyone wants to send a copy along, I'd be glad to consider
> adding reasonable interfaces for Python. ;)

Only around 70 documented functions, but many of them 
appear to be tweaks, or redocumenting stuff in view of new 
kernel behaviors.

- Gordon


From fdrake@acm.org  Fri Dec 10 19:18:16 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 10 Dec 1999 14:18:16 -0500 (EST)
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: <1267270337-1390160@hypernet.com>
References: <1267271840-1299809@hypernet.com>
 <1267270337-1390160@hypernet.com>
Message-ID: <14417.20984.151867.630871@weyr.cnri.reston.va.us>

Gordon McMillan writes:
 > Well, it says it is, but having done some stuff with automated 
 > warehouses, I'm always amazed at how people will use the 
 > term "real-time". I'd say "pretty likely to be responsive" ;-).

  Oh, a manager's interpretation of real-time:  "I want this by close
of business next Wednesday!"

 > Only around 70 documented functions, but many of them 
 > appear to be tweaks, or redocumenting stuff in view of new 
 > kernel behaviors.

  Anything that should be added anywhere?  Failing all else, I can
probably read the man pages if I know what to look for.  ;-)


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From fdrake@acm.org  Fri Dec 10 21:40:29 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 10 Dec 1999 16:40:29 -0500 (EST)
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us>
References: <199912091632.LAA09236@amarok.cnri.reston.va.us>
Message-ID: <14417.29517.238124.767279@weyr.cnri.reston.va.us>

Andrew M. Kuchling writes:
 > fpathconf(fd, name) -- Get configuration limit for a file
...
 > pathconf(path, name) -- Gets config variables for a path
...
 > sysconf(int name) -- Gets system configuration information
 > 	    -- would need constants from unistd.h

  I'm almost done with these, and also confstr (from POSIX.2).  I
don't have time to finish them today; I'll check them in next week.
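
A rough idea of how such a wrapper might be used from Python, assuming
it ends up as `os.sysconf()` with a companion `os.sysconf_names`
mapping (as in later releases). The helper name and the choice of
"SC_PAGE_SIZE" are only for illustration, and availability is
platform-dependent.

```python
import os


def sysconf_or_none(name):
    """Return the value of a sysconf variable, or None if it is unavailable."""
    names = getattr(os, "sysconf_names", {})
    if not hasattr(os, "sysconf") or name not in names:
        return None
    try:
        return os.sysconf(name)
    except (ValueError, OSError):
        # The name is known but this system cannot report a value for it.
        return None


print(sysconf_or_none("SC_PAGE_SIZE"))
```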


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From skip@mojam.com (Skip Montanaro)  Fri Dec 10 23:20:21 1999
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Fri, 10 Dec 1999 17:20:21 -0600 (CST)
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: <14417.18924.461115.906914@weyr.cnri.reston.va.us>
References: <38514881.5C124E36@digicool.com>
 <1267271840-1299809@hypernet.com>
 <14417.18924.461115.906914@weyr.cnri.reston.va.us>
Message-ID: <14417.35509.284749.924066@dolphin.mojam.com>

    Fred> I thought POSIX.4 was real-time stuff.

This all seems to be happening in real-time to me... ;-)

Skip Montanaro | http://www.mojam.com/
skip@mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...


From andy@robanal.demon.co.uk  Sat Dec 11 00:11:28 1999
From: andy@robanal.demon.co.uk (Andy Robinson)
Date: Sat, 11 Dec 1999 00:11:28 GMT
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: <199912101619.LAA14174@eric.cnri.reston.va.us>
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us>   <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us>
Message-ID: <38519531.15439641@post.demon.co.uk>

On Fri, 10 Dec 1999 11:19:47 -0500, you wrote:

>> There were issues with zlib 1.0.4 and later ones. Also, many
>> Linux distributions don't have the zlib header files installed.
>
>Hm.  I don't recall having any problems reported to me.  I'd rather
>not include the entire zlib distri in the Python distri -- zlib
>is rather big.  Adding only the Unix source would be cheating.
>
Minor data point on the importance of zlib.  I spent a long time
figuring out what Adobe PDF's "flate filter" was before I discovered
it was the inverse of "deflate" (yes, there were loud sounds of
head-slapping when I clicked) and discovered that zlib.compress() was
EXACTLY what you need to create compressed streams in PDF documents.
Being a Windows person, I naively assumed zlib was in the standard
distribution everywhere, and subsequently discovered Mac and Unix
users were not so happy.  So if you want to make PDFs, having zlib
around is very useful indeed...
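
A minimal sketch of the round trip: `zlib.compress()` produces the
deflated bytes, and `zlib.decompress()` is the "flate" direction a PDF
reader applies (the stream is tagged /FlateDecode in the PDF). The
page-content bytes below are made up purely as sample input.

```python
import zlib

# Invented sample data standing in for a PDF content stream.
raw = b"BT /F1 12 Tf 72 720 Td (Hello) Tj ET\n" * 20

compressed = zlib.compress(raw)        # "deflate"
restored = zlib.decompress(compressed)  # "flate"

print(len(raw), len(compressed), restored == raw)
```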

- Andy


From akuchlin@mems-exchange.org  Sat Dec 11 00:35:58 1999
From: akuchlin@mems-exchange.org (Andrew M. Kuchling)
Date: Fri, 10 Dec 1999 19:35:58 -0500 (EST)
Subject: [Python-Dev] Enabling more modules by default
In-Reply-To: <38519531.15439641@post.demon.co.uk>
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl>
 <38512256.F9287E24@lemburg.com>
 <199912101604.LAA14100@eric.cnri.reston.va.us>
 <38512734.CF6E4489@lemburg.com>
 <199912101619.LAA14174@eric.cnri.reston.va.us>
 <38519531.15439641@post.demon.co.uk>
Message-ID: <14417.40046.850655.491684@amarok.cnri.reston.va.us>

Andy Robinson writes:
>...  So if you want to make PDFs, having zlib
>around is very useful indeed...

This raises a good point, though I still dislike the idea of including
the zlib library.  It would be nice if Setup.in would be autogenerated
to compile all the modules it can -- bsddb if it finds libdb, zlib if
it finds libz.a.  I vaguely recall once working on a Python script that
would generate a customized Setup.in file, though I can't find it at
the moment.  Given that someone has already suggested automatically
enabling threads on those platforms that support it, why not go all
the way?

(But a Python script that generates a Setup.in isn't going to work,
unless we compile a minipython first and then create a more complete
Setup file.)
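
The detection idea could look roughly like the sketch below. Everything
in it is hypothetical: the candidate table, the function names, and the
output lines, which are simplified placeholders rather than real Setup
syntax.

```python
import os
import tempfile

# Hypothetical mapping of optional module -> library file it needs.
CANDIDATES = {"zlib": "libz.a", "bsddb": "libdb.a"}


def detect_modules(candidates, search_dirs):
    """Map each module to the first directory containing its library."""
    found = {}
    for module, libfile in sorted(candidates.items()):
        for directory in search_dirs:
            if os.path.exists(os.path.join(directory, libfile)):
                found[module] = directory
                break
    return found


def setup_lines(candidates, search_dirs):
    """Emit one placeholder line per module, commented out if not found."""
    found = detect_modules(candidates, search_dirs)
    lines = []
    for module in sorted(candidates):
        if module in found:
            lines.append("%s  # library found in %s" % (module, found[module]))
        else:
            lines.append("#%s  # library not found" % module)
    return lines


# Demo against a scratch directory that only provides libz.a.
with tempfile.TemporaryDirectory() as libdir:
    open(os.path.join(libdir, "libz.a"), "w").close()
    for line in setup_lines(CANDIDATES, [libdir]):
        print(line)
```

As the parenthetical above notes, a script like this needs a working
interpreter first, so it only helps in a second build stage.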

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
The most merciful thing in the world... is the inability of the human mind to
correlate all its contents.
    -- H.P. Lovecraft


From petrilli@amber.org  Sat Dec 11 05:54:41 1999
From: petrilli@amber.org (Christopher Petrilli)
Date: Sat, 11 Dec 1999 00:54:41 -0500
Subject: [Python-Dev] Enabling more modules by default
In-Reply-To: <14417.40046.850655.491684@amarok.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Fri, Dec 10, 1999 at 07:35:58PM -0500
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <38519531.15439641@post.demon.co.uk> <14417.40046.850655.491684@amarok.cnri.reston.va.us>
Message-ID: <19991211005441.A20923@trump.amber.org>

Andrew M. Kuchling [akuchlin@mems-exchange.org] wrote:
> Andy Robinson writes:
> >...  So if you want to make PDFs, having zlib
> >around is very useful indeed...
> 
> This raises a good point, though I still dislike the idea of including
> the zlib library.  It would be nice if Setup.in would be autogenerated
> to compile all the modules it can -- bsddb if it finds libdb, zlib if
> it finds libz.a.  I vaguely recall once working on a Python script that
> would generate a customized Setup.in file, though I can't find it at
> the moment.  Given that someone has already suggested automatically
> enabling threads on those platforms that support it, why not go all
> the way?

Well, one warning about BSDdb is that it comes in 3 incarnations that
all might be -ldb :-):

	1.85
	2.x
	3.x

and they are NOT compatible with each other.  1.85 has serious brain damage,
and honestly I'd love to see Robin Dunn's 2.x work rolled in to replace it,
but I'm not sure how viable that is---people might actually want the 1.85
breakage.

Chris
-- 
| Christopher Petrilli
| petrilli@amber.org


From gstein@lyra.org  Sat Dec 11 11:23:30 1999
From: gstein@lyra.org (Greg Stein)
Date: Sat, 11 Dec 1999 03:23:30 -0800 (PST)
Subject: [Python-Dev] Zip format
In-Reply-To: <1267287023-386248@hypernet.com>
Message-ID: <Pine.LNX.4.10.9912110321010.16305-100000@nebula.lyra.org>

On Fri, 10 Dec 1999, Gordon McMillan wrote:
>...
> If the user imports foo.spam.bar, an importer will be asked for:
>   foo (return foo.__init__)
>   foo.spam (return foo.bar.__init__)

                         ^^^ foo.spam.__init__

>   foo.spam.bar (return foo.spam.bar)

The above sequence is what currently happens.

> But the API allows lots of variations. This is another possible 
> interaction:
>   foo (return None)
>   foo.__init__ (return foo.__init__)
>   foo.spam (return None)
>   foo.bar.__init__ (return foo.bar.__init__)
>   foo.spam.bar (return foo.spam.bar)

The core of imputil has no knowledge of the __init__ thingy. That is
specific to the filesystem-based stuff. So in this sense, "possible" means
"imputil could be changed to do this". I would argue against the change,
however :-)

> Or, by looking at different args to get_code, you could look at 
> the requests as:
>   foo in context of None
>   spam in context of foo
>   bar in context of foo.spam

Bing!
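
That last view can be sketched in a few lines. This is only an
illustration of the "name in context of parent" breakdown; the function
is made up and is not imputil's actual interface.

```python
def import_requests(dotted_name):
    """Split a dotted module name into (name, context) lookup requests."""
    parts = dotted_name.split(".")
    requests = []
    for i, part in enumerate(parts):
        # The context is the dotted path of everything before this part,
        # or None for the top-level name.
        context = ".".join(parts[:i]) or None
        requests.append((part, context))
    return requests


print(import_requests("foo.spam.bar"))
# [('foo', None), ('spam', 'foo'), ('bar', 'foo.spam')]
```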

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein@lyra.org  Sat Dec 11 11:26:59 1999
From: gstein@lyra.org (Greg Stein)
Date: Sat, 11 Dec 1999 03:26:59 -0800 (PST)
Subject: [Python-Dev] Zip format
In-Reply-To: <14417.11137.562474.99270@amarok.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912110323510.16305-100000@nebula.lyra.org>

On Fri, 10 Dec 1999, Andrew M. Kuchling wrote:
> M.-A. Lemburg writes:
> >There were issues with zlib 1.0.4 and later ones. Also, many
> >Linux distributions don't have the zlib header files installed.
> 
> For example, on RH6.0, zlib.h and zlib.a are in zlib-devel.XXX.rpm,
> and zlib.XXX.rpm only contains libz.so.  On the other hand, anyone
> who's compiling Python should really have the various -devel RPMs

Exactly. The distro's *have* the headers -- it all depends on what you
installed. I happen to have the headers on my system (because I installed
zlib-devel, as AMK mentions).

> installed.  I'd argue against including it, because it might cause odd
> versioning problems.  For example, what if I have PIL compiled against
> zlib1.1.2 (zlib is used for writing PNGs) and the Python binary
> includes zlib1.1.3?  There might be hard-to-debug problems
> caused by calling the wrong symbol.

I totally agree.

>...
> Just received Guido's email suggesting skipping compression in
> archives; not a bad idea.  You'd use less CPU, but might do
> more I/O because you're reading more sectors off disk.  There
> probably isn't much need for compression when the archive is on-disk;
> Java needed it because of applets.

There are all kinds of things that we can do here. Consider mmap'ing the
archive into a shared memory segment, used by all the Python processes on
the system... woo! :-)

IMO, the standard distro can use zip files, and just bail if they are
compressed, but Python cannot load zlib. Obvious failure with an obvious
remedy. No big deal.

As Guido also mentions, an installer can just bring along zlib if they
want to use a compressed archive. i.e. their choice.
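
The "bail if compressed" behavior might look something like this
sketch, written against the `zipfile` module of later Python releases.
The function name and the `have_zlib` parameter are invented for
illustration.

```python
import io
import zipfile

try:
    import zlib
except ImportError:   # a Python built without zlib
    zlib = None


def read_member(archive_bytes, name, have_zlib=(zlib is not None)):
    """Return a member's bytes, failing loudly if it needs a missing zlib."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        info = zf.getinfo(name)
        if info.compress_type != zipfile.ZIP_STORED and not have_zlib:
            raise ImportError(
                "%s is compressed but zlib is not available" % name)
        return zf.read(name)


# Demo: an uncompressed archive is readable whether or not zlib exists.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("mod.py", "x = 1\n")
print(read_member(buf.getvalue(), "mod.py", have_zlib=False))
```

The error message names the remedy directly, which is the "obvious
failure with an obvious remedy" described above.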

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein@lyra.org  Sat Dec 11 11:33:47 1999
From: gstein@lyra.org (Greg Stein)
Date: Sat, 11 Dec 1999 03:33:47 -0800 (PST)
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: <14417.7909.511437.230915@weyr.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912110332360.16305-100000@nebula.lyra.org>

On Fri, 10 Dec 1999, Fred L. Drake, Jr. wrote:
> Greg Stein writes:
>  > No need to do a new type. Just wrap the DIR* into a PyCObject. Add a magic
>  > number if you're worried about mixing CObjects.
> 
>   That's certainly one option, but I would have made readdir(),
> seekdir(), rewinddir() and closedir() into the methods read(), seek(), 
> rewind() and close().  So it's a question of what interface you
> prefer; functions with magically interpreted token parameters (kind of 
> like file descriptors, hey!), or something that is more recognizably
> object-oriented.
>   I know my preference.  ;-)

Well, I know my preference of those two alternatives, too :-), but if
we're going with the Pythonic minimalism, then I'd think you would expose
the functions "as close as possible."

Would I argue if you went with a method-based approach? No :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From fredrik@pythonware.com  Sat Dec 11 13:07:08 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sat, 11 Dec 1999 14:07:08 +0100
Subject: [Python-Dev] Zip format
References: <Pine.LNX.4.10.9912110323510.16305-100000@nebula.lyra.org>
Message-ID: <005f01bf43d8$a22b0af0$f29b12c2@secret.pythonware.com>

Greg Stein <gstein@lyra.org> wrote:
> There are all kinds of things that we can do here. Consider mmap'ing the
> archive into a shared memory segment, used by all the Python processes on
> the system... woo! :-)

it doesn't really look like this, but I hope we're defining
interfaces here, and not just "one true solution".  I'd be
very annoyed if it turned out that we couldn't use works'
archives with the new standard importer...

> As Guido also mentions, an installer can just bring along zlib if they
> want to use a compressed archive. i.e. their choice.

in the pythonworks universe, the installer and the
application are the same thing...

</F>


From fredrik@pythonware.com  Sat Dec 11 13:12:12 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sat, 11 Dec 1999 14:12:12 +0100
Subject: [Python-Dev] Thankyou for fsync :)
References: <38503BDC.CB91FB29@digicool.com><199912100203.VAA07410@eric.cnri.reston.va.us> <14417.14789.306365.439782@weyr.cnri.reston.va.us>
Message-ID: <006c01bf43d9$57bc0f90$f29b12c2@secret.pythonware.com>

Fred L. Drake, Jr. <fdrake@acm.org> wrote:
>   fsync() isn't listed in O'Reilly's POSIX book, so it's probably not
> in the POSIX spec.  Neither is the tempnam() function I added in
> yesterday's spree, though tmpfile() and tmpnam() are.

instead of guessing, you can get a complete
list from:

http://www.unix-systems.org/apis.html

reading up on the "single unix specification"
should also help:

http://www.unix-systems.org/online.html

(registration required; contains complete man
pages for all functions covered by the UNIX95
and UNIX98 specification)

</F>


From gstein@lyra.org  Sat Dec 11 13:10:00 1999
From: gstein@lyra.org (Greg Stein)
Date: Sat, 11 Dec 1999 05:10:00 -0800 (PST)
Subject: [Python-Dev] Zip format
In-Reply-To: <005f01bf43d8$a22b0af0$f29b12c2@secret.pythonware.com>
Message-ID: <Pine.LNX.4.10.9912110505580.16305-100000@nebula.lyra.org>

On Sat, 11 Dec 1999, Fredrik Lundh wrote:
> Greg Stein <gstein@lyra.org> wrote:
> > There are all kinds of things that we can do here. Consider mmap'ing the
> > archive into a shared memory segment, used by all the Python processes on
> > the system... woo! :-)
> 
> it doesn't really look like this, but I hope we're defining
> interfaces here, and not just "one true solution".  I'd be

Oh, I was just having fun there :-). I don't see "one true solution" at
all. Just some standards.

> very annoyed if it turned out that we couldn't use works'
> archives with the new standard importer...

get_code() and its processing is not going anywhere. Some stuff will
change under the covers, and we'll be using sys.path (typically) rather
than chaining (although chaining will still exist!).

I would think that your Importer subclass would be directly usable, but
the installation could/would be a bit different. Heck, worst case, nothing
is going to invalidate your archive format -- feel free to berate me if I
ever break that!

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From jim@interet.com  Mon Dec 13 14:50:11 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Mon, 13 Dec 1999 09:50:11 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <38510254.ED15D32B@interet.com>
Message-ID: <385507A3.9F6AAF0F@interet.com>

> Jean-Claude Wippler wrote:
> 
> > Ouch - what's wrong with zip archives?
> 
> > With all due respect - I sincerely hope you will reconsider and alter
> > your code to work with zip files.  It's probably a small adjustment?

OK, I now have a new module "zipfile" which reads and
writes ZIP files.  It is written in Python and has been tested
on Windows and Linux.  I tested it with WinZip and found that
the files it creates are read OK with WinZip, and WinZip
files are read OK with zipfile.  So I am withdrawing my
Python archive file format, and re-writing all my stuff
using zipfile.  It should all be done in a week.

Basically everything works fine.  But there are some problems.

Python seems to lack a CRC-32 function, so I wrote one
in Python.  It is slow.  We need to add a CRC-32 function
to some Python built-in module that is always present, like
md5 or binascii.  The zlib module is not necessarily present.

I can't seem to get WinZip to record a partial path.  That is,
I want the ./Lib/test package to have these ZIP paths:
  test/__init__.pyc
  test/testall.pyc
  ...
but WinZip creates files with either no path at all or the
fully specified path.  Am I missing something?  Do all
other ZIP tools do this too?

JimA


From: Guido van Rossum <guido@CNRI.Reston.VA.US>
Date: Mon, 13 Dec 1999 10:22:12 -0500
Subject: Re: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Mon, 13 Dec 1999 09:50:11 EST."
 <385507A3.9F6AAF0F@interet.com>
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <38510254.ED15D32B@interet.com>
 <385507A3.9F6AAF0F@interet.com>
Message-ID: <199912131522.KAA18858@eric.cnri.reston.va.us>

> OK, I now have a new module "zipfile" which reads and
> writes ZIP files.  It is written in Python and has been tested
> on Windows and Linux.  I tested it with WinZip and found that
> the files it creates are read OK with WinZip, and WinZip
> files are read OK with zipfile.  So I am withdrawing my
> Python archive file format, and re-writing all my stuff
> using zipfile.  It should all be done in a week.

Ah, good!  (This saves me the trouble of cleaning up our own zip code :-)

> Basically everything works fine.  But there are some problems.
> 
> Python seems to lack a CRC-32 function, so I wrote one
> in Python.  It is slow.  We need to add a CRC-32 function
> to some Python built-in module that is always present, like
> md5 or binascii.  The zlib module is not necessarily present.
> 
> I can't seem to get WinZip to record a partial path.  That is,
> I want the ./Lib/test package to have these ZIP paths:
>   test/__init__.pyc
>   test/testall.pyc
>   ...
> but WinZip creates files with either no path at all or the
> fully specified path.  Am I missing something?  Do all
> other ZIP tools do this too?

Unclick the "Save Extra Folder Info" and then drag the *parent* folder
into the archive.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Mon Dec 13 17:00:26 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 13 Dec 1999 12:00:26 -0500 (EST)
Subject: [Python-Dev] confstr(), fpathconf(), pathconf(), sysconf()
Message-ID: <14421.9770.623399.673010@weyr.cnri.reston.va.us>

  I've just checked in bindings for these POSIX.1 and POSIX.2
functions, and thought I'd explain the interfaces for those who don't
want to read the diffs.  ;)
  These functions expect a "name" parameter (that's how it's described 
in the man pages and the O'Reilly book).  The value for "name" is an
integer that's defined in the system headers.  The constants all have
the form

    _XX_SOME_NAME

where XX is PC for fpathconf()- and pathconf()-related names, SC for
sysconf()-related names, and CS for confstr()-related names.  Some
names are defined by the standards, but additional names are defined
by implementations (there are a *lot* of sysconf() names under
Solaris!).
  We don't want to expose enormous numbers of constants in the
module's interface, however, as there are already a lot of names in
the posix module.  That would also slow down module initialization.
We also don't want to force callers to use magic numbers in code that 
uses these functions, especially since the values may be
system-specific.
  The best way to call these functions, then, is to use a *string*
that corresponds to the name of the C #define symbol with the leading 
underscore stripped off.  For example, to get the length of the
arguments to exec(), you could say:

    num_args = os.sysconf("SC_ARG_MAX")

  The string will be mapped to the appropriate numeric value defined
in an internal table.  If the name isn't defined for the platform, a
ValueError will be raised.

    >>> num_args = os.sysconf("FOO_BAR")
    Traceback (innermost last):
      File "<stdin>", line 1, in ?
    ValueError: unrecognized configuration name

  To allow retrieval of platform-dependent configuration information, 
integers can also be passed in.  On Solaris, this is equivalent to
using "SC_ARG_MAX":

    num_args = os.sysconf(1)

(Ignoring the portability and readability issues, ha!)
  There are three separate tables used for this; one for confstr(),
one for sysconf(), and one shared by fpathconf() and pathconf().  The
names used to build the tables come from Linux and Solaris; we can add 
other names as needed.  To add names, I'd need the names to add and
how to test for their existence at compile time (#ifdef, etc.).
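
The string-or-integer lookup described above can be sketched as follows.
This assumes a Unix platform and today's os module, where the internal
table is exposed as the dictionary os.sysconf_names; that dictionary is not
mentioned in this message, and lookup_sysconf is a hypothetical helper.

```python
import os


def lookup_sysconf(name):
    """Return os.sysconf(name), or None if the name is not in the table."""
    try:
        return os.sysconf(name)
    except ValueError:
        # Raised for "unrecognized configuration name"
        return None


arg_max = lookup_sysconf("SC_ARG_MAX")   # looked up by string name
bogus = lookup_sysconf("FOO_BAR")        # unknown name, so None

# Passing the platform's integer value gives the same answer as the
# string, at the cost of portability and readability.
arg_max_by_number = os.sysconf(os.sysconf_names["SC_ARG_MAX"])
```

Looking the number up in os.sysconf_names avoids hard-coding a value such
as 1, which is only correct on some systems.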


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From fdrake@acm.org  Mon Dec 13 18:35:49 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 13 Dec 1999 13:35:49 -0500 (EST)
Subject: [Python-Dev] CVS: python/dist/src/Modules posixmodule.c,2.116,2.117
In-Reply-To: <Pine.LNX.4.10.9912131025480.16305-100000@nebula.lyra.org>
References: <199912131637.LAA17318@weyr.cnri.reston.va.us>
 <Pine.LNX.4.10.9912131025480.16305-100000@nebula.lyra.org>
Message-ID: <14421.15493.28263.387680@weyr.cnri.reston.va.us>

Greg Stein writes:
 > I'm not very familiar with these APIs, but should you let go of the
 > interpreter lock when you call them?
 > (and for the other new funcs)

  None of these should be doing any I/O, as far as I can determine.
Whenever I get to getlogin() (which AMK & I decided should be
included, based on the specs that /F pointed us to), I will release
the interpreter lock for the getlogin_r() variant.  I'm not sure I
should release it for the non-reentrant getlogin(), however; the
specification for getlogin*() pretty much requires that it read from
utmp.  ;(


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From gstein@lyra.org  Mon Dec 13 20:31:22 1999
From: gstein@lyra.org (Greg Stein)
Date: Mon, 13 Dec 1999 12:31:22 -0800 (PST)
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <385507A3.9F6AAF0F@interet.com>
Message-ID: <Pine.LNX.4.10.9912131229590.16305-100000@nebula.lyra.org>

On Mon, 13 Dec 1999, James C. Ahlstrom wrote:
>...
> OK, I now have a new module "zipfile" which reads and
> writes ZIP files.  It is written in Python and has been tested
> on Windows and Linux.  I tested it with WinZip and found that
> the files it creates are read OK with WinZip, and WinZip
> files are read OK with zipfile.  So I am withdrawing my
> Python archive file format, and re-writing all my stuff
> using zipfile.  It should all be done in a week.

Can you post zipfile.py so that people can start reviewing that?

>...
> Python seems to lack a CRC-32 function, so I wrote one
> in Python.  It is slow.  We need to add a CRC-32 function
> to some Python built-in module that is always present, like
> md5 or binascii.  The zlib module is not necessarily present.

See zlib.crc32()

This is interesting, of course, because we have previously stated that
zlib (and its compression) is optional. But if we need the CRC-32
function...

hehe...

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From tim_one@email.msn.com  Mon Dec 13 22:11:33 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Mon, 13 Dec 1999 17:11:33 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <385507A3.9F6AAF0F@interet.com>
Message-ID: <000401bf45b7$04edfaa0$96a2143f@tim>

[James C. Ahlstrom]
> ...
> Python seems to lack a CRC-32 function, so I wrote one
> in Python.  It is slow.  We need to add a CRC-32 function
> to some Python built-in module that is always present, like
> md5 or binascii.  The zlib module is not necessarily present.

Unfortunately, there are many different CRC functions in common use.  None
belong in md5; if the intent is to support just zip's version, adding a
(say) zipcrc32 function to binascii would be ok; if we expect to support
others as well, a new parameterized crc module would be in order.
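
To illustrate what "parameterized" could mean here, a rough bit-at-a-time
sketch follows. The function name crc and the parameters poly, width, init
and xor_out are hypothetical, not an API proposed in this thread, and the
sketch only covers CRCs that process the least significant bit first
(polynomial given in reflected form).

```python
def crc(data, poly, width=32, init=0, xor_out=0):
    """Bitwise CRC over a bytes object, LSB-first (reflected) variant.

    poly    -- polynomial in reflected form, e.g. 0xEDB88320 for zip
    width   -- register width in bits
    init    -- initial register value
    xor_out -- value XORed into the final register
    """
    mask = (1 << width) - 1
    reg = init & mask
    for byte in data:
        reg ^= byte
        for _ in range(8):
            # Shift right; apply the polynomial when a 1 bit falls off.
            reg = (reg >> 1) ^ poly if reg & 1 else reg >> 1
    return (reg ^ xor_out) & mask


# The zip CRC: all-ones initial value and final inversion.
zip_crc = crc(b"123456789", 0xEDB88320,
              init=0xFFFFFFFF, xor_out=0xFFFFFFFF)

# A 16-bit CRC with a different polynomial and zero initial value.
crc16 = crc(b"123456789", 0xA001, width=16)
```

The same loop serves both checksums; only the parameters differ, which is
the case for a separate module rather than a single zip-specific function.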

> I can't seem to get WinZip to record a partial path.  That is,
> I want the ./Lib/test package to have these ZIP paths:
>   test/__init__.pyc
>   test/testall.pyc
>   ...
> but WinZip creates files with either no path at all or the
> fully specified path.  Am I missing something?  Do all
> other ZIP tools do this too?

No, it's a clumsiness unique to WinZip (damn GUIs <0.9 wink>).  In the Add
dialog box, you need to cd to the *Lib* directory, check the "Save extra
folder info" box, and then, e.g.,

1. Put
      test\*.pyc
   in the Add Files line, and click Add With Wildcards.
   Then all test\*.pyc files will be added, with paths test/__init__.pyc
   etc.

or

2. Put
      "test\__init__.pyc" "test\testall.pyc"
   (including the quotes!) in the Add Files line, and click Add.

Since #2 can be unbearable, other useful strategies include:

3. Use #1 (e.g. with dir\*.*) then delete the files you didn't really
   want.

4. Use #1 repeatedly, cleverly using a number of wildcard patterns that
   cover the files of interest.

5. Mixtures of #3 and #4.

6. Use a command-line zip tool instead (e.g., pkzip; I think WinZip has
   an "experimental" cmdline add-on too, but haven't tried it).
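
For comparison, a script can record the partial paths directly. This is
only a sketch using the zipfile module in today's standard library, whose
interface may differ from the module Jim describes; zip_package is a
hypothetical helper, and the demo files are empty placeholders.

```python
import os
import tempfile
import zipfile


def zip_package(lib_dir, package, archive_path):
    """Store a package's .pyc files with paths relative to lib_dir."""
    pkg_dir = os.path.join(lib_dir, package)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_STORED) as zf:
        for name in sorted(os.listdir(pkg_dir)):
            if name.endswith(".pyc"):
                full = os.path.join(pkg_dir, name)
                # arcname sets the path recorded inside the archive.
                zf.write(full, arcname=package + "/" + name)
    return archive_path


# Demo: build a throwaway Lib/test tree and archive it.
with tempfile.TemporaryDirectory() as tmp:
    pkg = os.path.join(tmp, "Lib", "test")
    os.makedirs(pkg)
    for name in ("__init__.pyc", "testall.pyc"):
        with open(os.path.join(pkg, name), "wb") as f:
            f.write(b"")
    archive = zip_package(os.path.join(tmp, "Lib"), "test",
                          os.path.join(tmp, "lib.zip"))
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
```

Here names holds test/__init__.pyc and test/testall.pyc, the layout asked
for in the original question.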


From jim@interet.com  Tue Dec 14 13:13:03 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Tue, 14 Dec 1999 08:13:03 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <Pine.LNX.4.10.9912131229590.16305-100000@nebula.lyra.org>
Message-ID: <3856425F.8C5E7A42@interet.com>

Greg Stein wrote:
> 

> Can you post zipfile.py so that people can start reviewing that?

Yes, it will be available by next Monday.  I just want to
get it really working and pretty, and with documentation.

JimA


From jim@interet.com  Tue Dec 14 13:26:50 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Tue, 14 Dec 1999 08:26:50 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000401bf45b7$04edfaa0$96a2143f@tim>
Message-ID: <3856459A.BF5A798A@interet.com>

Tim Peters wrote:
> 
> [James C. Ahlstrom]
> > ...
> > Python seems to lack a CRC-32 function, so I wrote one
>
> Unfortunately, there are many different CRC functions in common use.  None
> belong in md5; if the intent is to support just zip's version, adding a
> (say) zipcrc32 function to binascii would be ok; if we expect to support
> others as well, a new parameterized crc module would be in order.

OK, a CRC-32 in binascii it is.  The CRC-32 I
have comes with these comments which seem to indicate it is a
more "official standard" CRC-32 than average:

# *  Crc - 32 BIT ANSI X3.66 CRC checksum files
#*********************************************************************\
#*                                                                    *|
#* Demonstration program to compute the 32-bit CRC used as the frame  *|
#* check sequence in ADCCP (ANSI X3.66, also known as FIPS PUB 71     *|
#* and FED-STD-1003, the U.S. versions of CCITT's X.25 link-level     *|
#* protocol).  The 32-bit FCS was added via the Federal Register,     *|
#* 1 June 1982, p.23798.  I presume but don't know for certain that   *|
#* this polynomial is or will be included in CCITT V.41, which        *|
#* defines the 16-bit CRC (often called CRC-CCITT) polynomial.  FIPS  *|
#* PUB 78 says that the 32-bit FCS reduces otherwise undetected       *|
#* errors by a factor of 10^-5 over 16-bit FCS.                       *|
#*                                                                    *|
#*********************************************************************
#* Copyright (C) 1986 Gary S. Brown.  You may use this program, or
#* code or tables extracted from it, as desired without restriction.
 
I can submit this as a patch to binascii, or if the Copyright bothers
anyone, maybe it is better for Guido to use his CRC-32 from his ZIP
code.  Preference?

> > I can't seem to get WinZip to record a partial path.  That is,
>
> dialog box, you need to cd to the *Lib* directory, check the "Save extra
> folder info" box, and then, e.g.,

Thanks.  I knew there had to be some magic incantation to do it.
 
> 6. Use a command-line zip tool instead (e.g., pkzip; I think WinZip has
>    an "experimental" cmdline add-on too, but haven't tried it).

Actually pkzip 2.04g doesn't work because it writes names in upper case
and is limited to 8.3 names (I think).  My zipfile.py can be used as
a basis for a command line tool.  Actually I use makefiles with embedded
Python programs and find this easier than command line tools.

JimA


From guido@CNRI.Reston.VA.US  Tue Dec 14 14:53:04 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 14 Dec 1999 09:53:04 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Tue, 14 Dec 1999 08:26:50 EST."
 <3856459A.BF5A798A@interet.com>
References: <000401bf45b7$04edfaa0$96a2143f@tim>
 <3856459A.BF5A798A@interet.com>
Message-ID: <199912141453.JAA23429@eric.cnri.reston.va.us>

> OK, a CRC-32 in binascii it is.  The CRC-32 I
> have comes with these comments which seem to indicate it is a
> more "official standard" CRC-32 than average:
> 
> # *  Crc - 32 BIT ANSI X3.66 CRC checksum files
> #*********************************************************************\
> #*                                                                    *|
> #* Demonstration program to compute the 32-bit CRC used as the frame  *|
> #* check sequence in ADCCP (ANSI X3.66, also known as FIPS PUB 71     *|
> #* and FED-STD-1003, the U.S. versions of CCITT's X.25 link-level     *|
> #* protocol).  The 32-bit FCS was added via the Federal Register,     *|
> #* 1 June 1982, p.23798.  I presume but don't know for certain that   *|
> #* this polynomial is or will be included in CCITT V.41, which        *|
> #* defines the 16-bit CRC (often called CRC-CCITT) polynomial.  FIPS  *|
> #* PUB 78 says that the 32-bit FCS reduces otherwise undetected       *|
> #* errors by a factor of 10^-5 over 16-bit FCS.                       *|
> #*                                                                    *|
> #*********************************************************************
> #* Copyright (C) 1986 Gary S. Brown.  You may use this program, or
> #* code or tables extracted from it, as desired without restriction.
>  
> I can submit this as a patch to binascii, or if the Copyright bothers
> anyone, maybe it is better for Guido to use his CRC-32 from his ZIP
> code.  Preference?

I looked, but "my" crc32 in the zlib module (which was actually
contributed by Andrew Kuchling) is just a wrapper around the crc32
function in zlib, which is copyrighted by Mark Adler and follows the
zlib rules.

I propose to use Gary Brown's code.  I'll defend this to CNRI's
lawyers if need be.

Jim, have you checked that this is the right CRC to use for zip's CRC?
(This in the light of Tim's assertion that there are many CRCs around.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim@interet.com  Tue Dec 14 15:22:56 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Tue, 14 Dec 1999 10:22:56 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000401bf45b7$04edfaa0$96a2143f@tim>
 <3856459A.BF5A798A@interet.com> <199912141453.JAA23429@eric.cnri.reston.va.us>
Message-ID: <385660D0.C6C0C7B9@interet.com>

Guido van Rossum wrote:

> I propose to use Gary Brown's code.  I'll defend this to CNRI's
> lawyers if need be.
> 
> Jim, have you checked that this is the right CRC to use for zip's CRC?
> (This in the light of Tim's assertion that there are many CRCs around.)

The CRC it calculates agrees with the CRC of WinZip for all
files I have tried.  The original Gary Brown code was much
longer and included file reading.  Here is the shortened version:

JimA


# *  Crc - 32 BIT ANSI X3.66 CRC checksum files
#*********************************************************************\
#*                                                                    *|
#* Demonstration program to compute the 32-bit CRC used as the frame  *|
#* check sequence in ADCCP (ANSI X3.66, also known as FIPS PUB 71     *|
#* and FED-STD-1003, the U.S. versions of CCITT's X.25 link-level     *|
#* protocol).  The 32-bit FCS was added via the Federal Register,     *|
#* 1 June 1982, p.23798.  I presume but don't know for certain that   *|
#* this polynomial is or will be included in CCITT V.41, which        *|
#* defines the 16-bit CRC (often called CRC-CCITT) polynomial.  FIPS  *|
#* PUB 78 says that the 32-bit FCS reduces otherwise undetected       *|
#* errors by a factor of 10^-5 over 16-bit FCS.                       *|
#*                                                                    *|
#*********************************************************************

#
#* Copyright (C) 1986 Gary S. Brown.  You may use this program, or
#* code or tables extracted from it, as desired without restriction.
 
# First, the polynomial itself and its table of feedback terms.  The  
# polynomial is                                                       
# X^32+X^26+X^23+X^22+X^16+X^12+X^11+X^10+X^8+X^7+X^5+X^4+X^2+X^1+X^0 
# Note that we take it "backwards" and put the highest-order term in  
# the lowest-order bit.  The X^32 term is "implied"; the LSB is the   
# X^31 term, etc.  The X^0 term (usually shown as "+1") results in    
# the MSB being 1.                                                    

# Note that the usual hardware shift register implementation, which   
# is what we're using (we're merely optimizing it by doing eight-bit  
# chunks at a time) shifts bits into the lowest-order term.  In our   
# implementation, that means shifting towards the right.  Why do we   
# do it this way?  Because the calculated CRC must be transmitted in  
# order from highest-order term to lowest-order term.  UARTs transmit 
# characters in order from LSB to MSB.  By storing the CRC this way,  
# we hand it to the UART in the order low-byte to high-byte; the UART 
# sends each low-bit to high-bit; and the result is transmission bit 
# by bit from highest- to lowest-order term without requiring any bit 
# shuffling on our part.  Reception works similarly.                  

# The feedback terms table consists of 256, 32-bit entries.  Notes:   
#                                                                     
#  1. The table can be generated at runtime if desired; code to do so 
#     is shown later.  It might not be obvious, but the feedback      
#     terms simply represent the results of eight shift/xor opera-    
#     tions for all combinations of data and CRC register values.     
#                                                                     
#  2. The CRC accumulation logic is the same for all CRC polynomials, 
#     be they sixteen or thirty-two bits wide.  You simply choose the 
#     appropriate table.  Alternatively, because the table can be     
#     generated at runtime, you can start by generating the table for 
#     the polynomial in question and use exactly the same "updcrc",   
#     if your application needn't simultaneously handle two CRC       
#     polynomials.  (Note, however, that XMODEM is strange.)          
#                                                                     
#  3. For 16-bit CRCs, the table entries need be only 16 bits wide;   
#     of course, 32-bit entries work OK if the high 16 bits are zero. 
#                                                                     
#  4. The values must be right-shifted by eight bits by the "updcrc"  
#     logic; the shift must be unsigned (bring in zeroes).  On some   
#     hardware you could probably optimize the shift in assembler by  
#     using byte-swap instructions.                                   

# Converted to Python by James C. Ahlstrom

crc_32_tab = [	# CRC polynomial 0xedb88320
0x00000000, 0x77073096, 0xee0e612c, 0x990951ba, 0x076dc419, 0x706af48f,
0xe963a535, 0x9e6495a3,
0x0edb8832, 0x79dcb8a4, 0xe0d5e91e, 0x97d2d988, 0x09b64c2b, 0x7eb17cbd,
0xe7b82d07, 0x90bf1d91,
0x1db71064, 0x6ab020f2, 0xf3b97148, 0x84be41de, 0x1adad47d, 0x6ddde4eb,
0xf4d4b551, 0x83d385c7,
0x136c9856, 0x646ba8c0, 0xfd62f97a, 0x8a65c9ec, 0x14015c4f, 0x63066cd9,
0xfa0f3d63, 0x8d080df5,
0x3b6e20c8, 0x4c69105e, 0xd56041e4, 0xa2677172, 0x3c03e4d1, 0x4b04d447,
0xd20d85fd, 0xa50ab56b,
0x35b5a8fa, 0x42b2986c, 0xdbbbc9d6, 0xacbcf940, 0x32d86ce3, 0x45df5c75,
0xdcd60dcf, 0xabd13d59,
0x26d930ac, 0x51de003a, 0xc8d75180, 0xbfd06116, 0x21b4f4b5, 0x56b3c423,
0xcfba9599, 0xb8bda50f,
0x2802b89e, 0x5f058808, 0xc60cd9b2, 0xb10be924, 0x2f6f7c87, 0x58684c11,
0xc1611dab, 0xb6662d3d,
0x76dc4190, 0x01db7106, 0x98d220bc, 0xefd5102a, 0x71b18589, 0x06b6b51f,
0x9fbfe4a5, 0xe8b8d433,
0x7807c9a2, 0x0f00f934, 0x9609a88e, 0xe10e9818, 0x7f6a0dbb, 0x086d3d2d,
0x91646c97, 0xe6635c01,
0x6b6b51f4, 0x1c6c6162, 0x856530d8, 0xf262004e, 0x6c0695ed, 0x1b01a57b,
0x8208f4c1, 0xf50fc457,
0x65b0d9c6, 0x12b7e950, 0x8bbeb8ea, 0xfcb9887c, 0x62dd1ddf, 0x15da2d49,
0x8cd37cf3, 0xfbd44c65,
0x4db26158, 0x3ab551ce, 0xa3bc0074, 0xd4bb30e2, 0x4adfa541, 0x3dd895d7,
0xa4d1c46d, 0xd3d6f4fb,
0x4369e96a, 0x346ed9fc, 0xad678846, 0xda60b8d0, 0x44042d73, 0x33031de5,
0xaa0a4c5f, 0xdd0d7cc9,
0x5005713c, 0x270241aa, 0xbe0b1010, 0xc90c2086, 0x5768b525, 0x206f85b3,
0xb966d409, 0xce61e49f,
0x5edef90e, 0x29d9c998, 0xb0d09822, 0xc7d7a8b4, 0x59b33d17, 0x2eb40d81,
0xb7bd5c3b, 0xc0ba6cad,
0xedb88320, 0x9abfb3b6, 0x03b6e20c, 0x74b1d29a, 0xead54739, 0x9dd277af,
0x04db2615, 0x73dc1683,
0xe3630b12, 0x94643b84, 0x0d6d6a3e, 0x7a6a5aa8, 0xe40ecf0b, 0x9309ff9d,
0x0a00ae27, 0x7d079eb1,
0xf00f9344, 0x8708a3d2, 0x1e01f268, 0x6906c2fe, 0xf762575d, 0x806567cb,
0x196c3671, 0x6e6b06e7,
0xfed41b76, 0x89d32be0, 0x10da7a5a, 0x67dd4acc, 0xf9b9df6f, 0x8ebeeff9,
0x17b7be43, 0x60b08ed5,
0xd6d6a3e8, 0xa1d1937e, 0x38d8c2c4, 0x4fdff252, 0xd1bb67f1, 0xa6bc5767,
0x3fb506dd, 0x48b2364b,
0xd80d2bda, 0xaf0a1b4c, 0x36034af6, 0x41047a60, 0xdf60efc3, 0xa867df55,
0x316e8eef, 0x4669be79,
0xcb61b38c, 0xbc66831a, 0x256fd2a0, 0x5268e236, 0xcc0c7795, 0xbb0b4703,
0x220216b9, 0x5505262f,
0xc5ba3bbe, 0xb2bd0b28, 0x2bb45a92, 0x5cb36a04, 0xc2d7ffa7, 0xb5d0cf31,
0x2cd99e8b, 0x5bdeae1d,
0x9b64c2b0, 0xec63f226, 0x756aa39c, 0x026d930a, 0x9c0906a9, 0xeb0e363f,
0x72076785, 0x05005713,
0x95bf4a82, 0xe2b87a14, 0x7bb12bae, 0x0cb61b38, 0x92d28e9b, 0xe5d5be0d,
0x7cdcefb7, 0x0bdbdf21,
0x86d3d2d4, 0xf1d4e242, 0x68ddb3f8, 0x1fda836e, 0x81be16cd, 0xf6b9265b,
0x6fb077e1, 0x18b74777,
0x88085ae6, 0xff0f6a70, 0x66063bca, 0x11010b5c, 0x8f659eff, 0xf862ae69,
0x616bffd3, 0x166ccf45,
0xa00ae278, 0xd70dd2ee, 0x4e048354, 0x3903b3c2, 0xa7672661, 0xd06016f7,
0x4969474d, 0x3e6e77db,
0xaed16a4a, 0xd9d65adc, 0x40df0b66, 0x37d83bf0, 0xa9bcae53, 0xdebb9ec5,
0x47b2cf7f, 0x30b5ffe9,
0xbdbdf21c, 0xcabac28a, 0x53b39330, 0x24b4a3a6, 0xbad03605, 0xcdd70693,
0x54de5729, 0x23d967bf,
0xb3667a2e, 0xc4614ab8, 0x5d681b02, 0x2a6f2b94, 0xb40bbe37, 0xc30c8ea1,
0x5a05df1b, 0x2d02ef8d
]


def crc32(string):
  crc = 0xFFFFFFFF
  for ch in string:
    crc = crc_32_tab[((crc) ^ ord(ch)) & 0xff] ^ (((crc) >> 8) & 0xFFFFFF)
  return ~crc
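
The function above relies on the 32-bit integers of Python at the time. A
hedged sketch of the same algorithm for a current Python 3 follows; it
generates the table at runtime (as note 1 in the comments says is possible)
rather than copying it, works on bytes objects, and returns an unsigned
value. The names POLY and table_entry are introduced here for illustration,
and the comparison with binascii.crc32 assumes the modern standard library.

```python
import binascii

POLY = 0xEDB88320  # reflected form of the polynomial in the comments


def table_entry(index):
    """Eight shift/xor steps for one byte value (note 1 above)."""
    reg = index
    for _ in range(8):
        reg = (reg >> 1) ^ POLY if reg & 1 else reg >> 1
    return reg


crc_32_tab = [table_entry(i) for i in range(256)]


def crc32(data):
    """CRC-32 of a bytes object as an unsigned 32-bit integer."""
    crc = 0xFFFFFFFF
    for byte in data:  # iterating over bytes yields ints in Python 3
        # Integers are unbounded and non-negative here, so the shift
        # needs no extra mask.
        crc = crc_32_tab[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF  # unsigned equivalent of ~crc on 32 bits


sample = b"The quick brown fox"
matches_stdlib = crc32(sample) == binascii.crc32(sample)
```

The generated entries can be compared against the posted table, for
example crc_32_tab[1] is 0x77073096 and crc_32_tab[128] is 0xedb88320.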


From tim_one@email.msn.com  Tue Dec 14 17:06:36 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 14 Dec 1999 12:06:36 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <199912141453.JAA23429@eric.cnri.reston.va.us>
Message-ID: <000101bf4655$94e40840$3a2d153f@tim>

[Guido]
> I propose to use Gary Brown's code.  I'll defend this to CNRI's
> lawyers if need be.

If there's a hassle, I can do a clean-room implementation easily enough --
although I'd rather not.

> Jim, have you checked that this is the right CRC to use for zip's CRC?

If WinZip unzips Jim's files without griping, the odds that he's got the
wrong CRC are about 1 in 2**36 <wink>.

> (This in the light of Tim's assertion that there are many CRCs
> around.)

There are, and several others are hiding in assorted communications stds
(e.g., Ethernet uses a different 32-bit CRC); but the zip CRC is the one
you'll find most commonly described on the Web.

All the same, once Jim releases his code, I'll do an anal verification that
it's the right one.


From jim@interet.com  Tue Dec 14 17:54:35 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Tue, 14 Dec 1999 12:54:35 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000101bf4655$94e40840$3a2d153f@tim>
Message-ID: <3856845B.6C3C7330@interet.com>

Tim Peters wrote:

> If WinZip unzips Jim's files without griping, the odds that he's got the
> wrong CRC are about 1 in 2**36 <wink>.

You mean 2**32, right?  Oh, sorry, you must be
using a DEC-10  <wink again>.

JimA


From gstein@lyra.org  Tue Dec 14 19:23:36 1999
From: gstein@lyra.org (Greg Stein)
Date: Tue, 14 Dec 1999 11:23:36 -0800 (PST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <3856425F.8C5E7A42@interet.com>
Message-ID: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org>

On Tue, 14 Dec 1999, James C. Ahlstrom wrote:

> Greg Stein wrote:
> > 
> 
> > Can you post zipfile.py so that people can start reviewing that?
> 
> Yes, it will be available by next Monday.  I just want to
> get it really working and pretty, and with documentation.

My point was that people could possibly use it *before* then. Not
everybody needs it to be pretty, needs doc, or needs it fully working.
Maybe people would like to provide feedback on the API. Maybe they'd like
to start their own modules that use your library.

This goes back to my years-old statement: release it now rather than later
-- people can always use it now, and there might not be a later.

Release early. Release often. :-)

People are too hesitant to release code. Why? Just send it out there. When
you update it, send out another. It doesn't hurt anybody to have more than
one release.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From tim_one@email.msn.com  Wed Dec 15 04:20:25 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 14 Dec 1999 23:20:25 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <3856845B.6C3C7330@interet.com>
Message-ID: <000501bf46b3$b6184f40$05a0143f@tim>

[Tim]
> If WinZip unzips Jim's files without griping, the odds that he's
> got the wrong CRC are about 1 in 2**36 <wink>.

[JimA]
> You mean 2**32, right?

Nope!  For each of the 2**32 polynomials you may have pulled out of thin
air, there are about a dozen common variations in the details of CRC
algorithms.  For example, a CRC used for hashing usually initializes "the
register" to 0, but a CRC used to protect against transmission errors
usually initializes to a block of 1 bits (since leading zeroes don't affect
the result, and a common transmission error is dropping a prefix of the
msg).  Similarly, algorithms vary in the order they scan the data; in
whether they use the raw data or its complement; and in whether they return
the actual remainder, the complement of the remainder, or a checksum
cleverly computed so that "the other end" always sees a fixed remainder
other than 0 (or ~0).
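
A rough sketch of two of those variations (the initial register value and
the final complement), assuming a modern Python 3 and bytes input.  The
helper name crc_reflected and its init/xor_out parameters are invented for
illustration; only the bit-reversed polynomial constant 0xEDB88320 and
zlib.crc32() are taken as given.

```python
import zlib

# Bit-reversed form of the 32-bit polynomial used by zip.
POLY = 0xEDB88320


def crc_reflected(data, init, xor_out):
    """Bit-at-a-time CRC, scanning each byte least-significant bit first.

    init and xor_out are the knobs discussed above: the starting value of
    "the register" and the value XORed into the final remainder.
    """
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ POLY
            else:
                crc >>= 1
    return crc ^ xor_out


msg = b"123456789"

# zip flavour: register starts as all 1 bits, result is complemented.
zip_style = crc_reflected(msg, 0xFFFFFFFF, 0xFFFFFFFF)
print(hex(zip_style), hex(zlib.crc32(msg)))

# "hashing" flavour: register starts at 0, raw remainder is returned.
hash_style = crc_reflected(msg, 0, 0)
print(hex(hash_style))

# With a zero initial register, leading zero bytes go unnoticed.
print(crc_reflected(b"\x00\x00" + msg, 0, 0) == hash_style)
```

Same polynomial, same data, different answers: that is why the polynomial
alone does not pin down "the" CRC.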

> Oh, sorry, you must be using a DEC-10  <wink again>.

I used a Univac 1108 in college, back when ASCII was in its infancy.  They
couldn't decide on the natural size for a character, so the 36-bit 1108
could be configured to treat each word as either 6 6-bit bytes or 4 9-bit
ones.  If they had been thinking ahead, they would have defined it as two
Unicode characters plus a 4-bit tag field for the Python implementation to
play with <wink>.

now-they-make-their-living-suing-.gif-bandits-ly y'rs  - tim


From tim_one@email.msn.com  Wed Dec 15 07:40:11 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Wed, 15 Dec 1999 02:40:11 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <385660D0.C6C0C7B9@interet.com>
Message-ID: <000b01bf46cf$9ebe27e0$05a0143f@tim>

[JimA posts his Python rendering of Gary Brown's code]

Yup!  That's the zip algorithm, right down to the absurdly bit-reversed
polynomial.

> def crc32(string):
>   crc = 0xFFFFFFFF
>   for ch in string:
>     crc = crc_32_tab[((crc) ^ ord(ch)) & 0xff] ^ (((crc) >> 8) & 0xFFFFFF)
>   return ~crc

Note that the last line is better (whether in Python or C!) as

    return crc ^ 0xffffffff

Else you'll get a surprising result in a 64-bit Python, and in some 64-bit C
implementations.
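
A small illustration of the difference, assuming a modern Python 3 where
ints are unbounded (so the "surprise" shows up on every platform, not only
64-bit ones).  The two function names and the sample register value are
made up for the example.

```python
def finish_with_not(crc):
    # Complements *all* bits of the integer, however wide it is.
    return ~crc


def finish_with_xor(crc):
    # Complements exactly the low 32 bits and nothing else.
    return crc ^ 0xFFFFFFFF


register = 0x340BC6D9  # arbitrary 32-bit value, for illustration only

print(finish_with_not(register))       # a negative number
print(hex(finish_with_xor(register)))  # a 32-bit unsigned value

# The two agree only after masking the first one back down to 32 bits.
print(finish_with_not(register) & 0xFFFFFFFF == finish_with_xor(register))
```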

it's-a-32-bit-algorithm-not-an-"int"-or-"long"-one-ly y'rs  - tim


From fredrik@pythonware.com  Wed Dec 15 09:31:29 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 15 Dec 1999 10:31:29 +0100
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000101bf4655$94e40840$3a2d153f@tim>
Message-ID: <002601bf46e0$06e25ca0$f29b12c2@secret.pythonware.com>

> [Guido]
> > I propose to use Gary Brown's code.  I'll defend this to CNRI's
> > lawyers if need be.
> 
> If there's a hassle, I can do a clean-room implementation easily enough --
> although I'd rather not.

or you can grab the code from PIL, which already
comes with a Python compatible license...

(it's based on ISO 3307, but judging from the table
James posted, it's the same thing...)

</F>


From fredrik@pythonware.com  Wed Dec 15 09:39:19 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 15 Dec 1999 10:39:19 +0100
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000b01bf46cf$9ebe27e0$05a0143f@tim>
Message-ID: <003001bf46e0$43860b20$f29b12c2@secret.pythonware.com>

Tim Peters <tim_one@email.msn.com> wrote:
> Yup!  That's the zip algorithm, right down to the absurdly bit-reversed
> polynomial.

also known as ISO 3307, according to some
strange comments in PIL's sources...

</F>


From jim@interet.com  Wed Dec 15 15:53:34 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Wed, 15 Dec 1999 10:53:34 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org>
Message-ID: <3857B97E.3684224F@interet.com>

Greg Stein wrote:

> Release early. Release often. :-)

You are right of course.  OK, the zipfile.py code and docs are at:

  ftp://ftp.interet.com/pub/pylib.html

Despite the ftp URL, clicking on it should display the html.

Please don't panic if it seems to be slow.  It uses a Python CRC-32
which is slow.  You may want to hack it to use zlib.crc32() if you
have it.

I am testing with WinZip.  If you have another zip tool, it
would be interesting to see how compatible it is.

JimA


From guido@CNRI.Reston.VA.US  Wed Dec 15 16:38:47 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 15 Dec 1999 11:38:47 -0500
Subject: [Python-Dev] Writers wanted for Linux Journal Python special issue
Message-ID: <199912151638.LAA02522@eric.cnri.reston.va.us>

Linux Journal is preparing a special issue devoted to Python (actually
more like a pullout section or whatever I think).  They are looking
for writers, e.g. to write a piece about Python's history and/or an
introduction.  And probably anything else Python related.

If you're interested, please write to Marjorie Richardson
<mlr@ssc.com>, who is coordinating.  Also direct any questions to her.

This is for the June issue which will be on newsstands mid-May and
mailed to subscribers even earlier, I believe.  The deadline is
February 1st (magazine production takes forever!).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin@mems-exchange.org  Wed Dec 15 18:17:53 1999
From: akuchlin@mems-exchange.org (Andrew M. Kuchling)
Date: Wed, 15 Dec 1999 13:17:53 -0500 (EST)
Subject: [Python-Dev] fwd. from Paul Prescod
Message-ID: <14423.56145.877163.395736@amarok.cnri.reston.va.us>

This is a forwarded e-mail from the XML-SIG mailing list, in which
Paul makes some good points.  Some context: I've been arguing against
adding more XML stuff to the base Python distribution, because 1) it's
bloat for those people who don't care about XML, and 2) the Distutils is
supposed to fix this by making installing things easier.  Paul's
response, below, has shaken my conviction a bit (*only* a bit,
though).  If it's deemed valuable, perhaps the XML-SIG could
concentrate on the minimal set of parser + SAX + DOM that could be
included in 1.6.

Please join the XML-SIG to follow the specifics of this thread
further, as it relates only to XML.  As a more general philosophical
question for python-dev: do we want to add things to 1.6 following the
"batteries included" philosophy?  Or should we wave in the direction
of the distutils and say they'll fix the problem?  (In which case they
should be given high priority, as in "1.6 doesn't ship until they're
done".)

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
And after all, why should I go to bed every night? Sleep is only a habit.
    -- Cornelius Van Horne


Paul Prescod writes:
>"Andrew M. Kuchling" wrote:
>> 
>> Huh?  There's obviously a good deal of stuff in there, some of it
>> perhaps too esoteric, but I don't see where there's overlap.  
>
>Well, there are several parsers and parser wrappers. How is a user
>supposed to choose? And there is PyDOM, Minidom and qp_dom.
>
>> Or are
>> you talking about Python tools in general, where there are 3 DOM
>> implementations?  (PyDOM, 4DOM, and ZDOM hiding inside Zope.)
>
>That too.
>
>> I lean against shoveling more stuff into 1.6; better to get the
>> Distutils widely used, which makes it easier to install *all* Python
>> extensions.
>
>I don't think that XML is any more of an "add-on" to a modern scripting
>language than URL support or regular expression support. I'm in the
>"batteries included" camp for this and several other reasons: 
>
>	* standard Python libraries may soon need XML support. If WebDAV takes
>off then there should be a libWebDAV right alongside libftp and libhttp.
>And libWebDAV will require XML
>
>	* there is a difference between theory and practice. In theory,
>distutils will be done soon and everything will be easy. In practice, it
>is the end of 1999 and at every conference I have to install the XML sig
>package on the machines of several people who haven't been able to get
>it going themselves. In practice, we can't wait for distutils because
>people are choosing their XML tools now.
>
>> >Ideally we would have one (or at most two!) implementation of each of
>> >the major specs:
>> >XML    >SAX   >Unicode    >XPath    >XPointer   >XSLT    >DOM
>> 
>> Do you mean "one implementation of each in a single package", or "one
>> implementation existing for Python, distributed separately"?
>
>With the possible exception of XSLT, one implementation of each *in
>Python 1.6*.
>
>> We need to come up with a position paper for developer's day, stating
>> what needs to be discussed.  Suggestions?  I'd propose focusing on
>> getting the XML-SIG package to 1.0, but that's just an idea.
>
>I don't see how the XML-SIG package can ever get to 1.0. Anybody can
>contribute code at anytime and thus far we've been totally flexible
>about putting it in. I think that's great. It just won't ever lead to a
>stable, carefully maintained, tightly interoperable package. Some of the
>maintainers of the individual pieces have probably lost interest and
>there is probably nobody that understands it all enough to integrate it
>nicely.
>
>-- 
> Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
>


From fdrake@acm.org  Wed Dec 15 19:47:01 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 15 Dec 1999 14:47:01 -0500 (EST)
Subject: [Python-Dev] posix module
Message-ID: <14423.61493.90107.433664@weyr.cnri.reston.va.us>

  Ok, I think I'm done with the posix module updates, modulo bugs and
additional symbols for the *conf*() tables.  That leaves us with the
following status for interfaces that Andrew brought up in the message
that started this spate of additions:

Worth adding?
=============
opendir(), readdir(), closedir() -- not added
           The only thing these give us that os.listdir() doesn't is
           the inode numbers.  Unless someone actually wants those,
           it's not worth having.

Worth adding:
=============

abort() -- added

ctermid(), ctermid_r() -- added
            
fpathconf(fd, name) -- added

getlogin() -- added

getgroups(gidsetsize, grouplist) -- added

pathconf(path, name) -- added

sysconf(int name) -- added; also added confstr(int name)

Not worth adding:
=================
clearerr() -- not added

cuserid() -- not added

difftime -- not added

tmpfile(), tmpnam() -- added, also tempnam()

mblen(), mbstowcs(), mbtowc(), wcstombs(),  wctomb() -- not added
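
For a feel of how the added interfaces look from Python, here is a hedged
sketch against the os module of a current Python 3 on a Unix-like system.
The function describe_limits and the dictionary keys are invented for the
example, and which configuration names exist varies by platform, so every
lookup is guarded.  Note that today's os.getgroups() takes no arguments,
unlike the C signature listed above.

```python
import os


def describe_limits(path="/"):
    """Collect a few system limits, skipping anything this platform lacks."""
    info = {}
    try:
        if hasattr(os, "sysconf") and "SC_PAGE_SIZE" in os.sysconf_names:
            info["page_size"] = os.sysconf("SC_PAGE_SIZE")
    except (OSError, ValueError):
        pass
    try:
        if hasattr(os, "pathconf") and "PC_NAME_MAX" in os.pathconf_names:
            info["name_max"] = os.pathconf(path, "PC_NAME_MAX")
    except (OSError, ValueError):
        pass
    try:
        if hasattr(os, "confstr") and "CS_PATH" in os.confstr_names:
            info["default_path"] = os.confstr("CS_PATH")
    except (OSError, ValueError):
        pass
    if hasattr(os, "getgroups"):
        info["groups"] = os.getgroups()
    return info


for key, value in sorted(describe_limits().items()):
    print(key, "=", value)
```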


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From jeremy@cnri.reston.va.us  Wed Dec 15 19:58:16 1999
From: jeremy@cnri.reston.va.us (Jeremy Hylton)
Date: Wed, 15 Dec 1999 14:58:16 -0500 (EST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org>
References: <3856425F.8C5E7A42@interet.com>
 <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org>
Message-ID: <14423.62168.576273.719577@goon.cnri.reston.va.us>

>>>>> "GS" == Greg Stein <gstein@lyra.org> writes:

  GS> On Tue, 14 Dec 1999, James C. Ahlstrom wrote:
  >> Greg Stein wrote: >
  >> 
  >> > Can you post zipfile.py so that people can start reviewing
  >> that?
  >> 
  >> Yes, it will be available by next Monday.  I just want to get it
  >> really working and pretty, and with documentation.

  GS> My point was that people could possibly use it *before*
  GS> then. Not everybody needs it to be pretty, needs doc, or needs
  GS> it fully working.  Maybe people would like to provide feedback
  GS> on the API. Maybe they'd like to start their own modules that
  GS> use your library.

  GS> This goes back to my years-old statement: release it now rather
  GS> than later -- people can always use it now, and there might not
  GS> be a later.

Ok.  I think we need some kind of zip file support in the core so that
it can be used as a standard distribution format.  I'd be happy if
Jim's zipfile module ended up being it.  We've got some zip code that
we developed at CNRI; it's a bit of a mess, but it might be helpful to
see what we did.  Our code is at ftp://www.python.org/pub/tmp/zip.zip

Jeremy


From jim@interet.com  Thu Dec 16 15:41:56 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Thu, 16 Dec 1999 10:41:56 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com>
Message-ID: <38590844.769C3025@interet.com>

Did anyone look at this yet?

   ftp://ftp.interet.com/pub/pylib.html

   ftp://ftp.interet.com/pub/zipfile.py

JimA


From skip@mojam.com (Skip Montanaro)  Thu Dec 16 15:46:28 1999
From: skip@mojam.com (Skip Montanaro)
Date: Thu, 16 Dec 1999 09:46:28 -0600 (CST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <38590844.769C3025@interet.com>
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org>
 <3857B97E.3684224F@interet.com>
 <38590844.769C3025@interet.com>
Message-ID: <14425.2388.529932.61119@dolphin.mojam.com>

    JA> Did anyone look at this yet?
    JA>    ftp://ftp.interet.com/pub/pylib.html
    JA>    ftp://ftp.interet.com/pub/zipfile.py

I thought it wasn't supposed to be out until Monday?  You're looking for,
perhaps, a time machine? ;-)

(More seriously, it won't have any effect on my "gotta have this done
yesterday" list, so I will let others comment...)

Skip


From jim@interet.com  Thu Dec 16 17:16:21 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Thu, 16 Dec 1999 12:16:21 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com>
Message-ID: <38591E65.4885A39D@interet.com>

"James C. Ahlstrom" wrote:
 
>    ftp://ftp.interet.com/pub/pylib.html

I just changed zipfile.py so that regular zip compression
works.  And if zlib is available,
its crc32() is used instead of the Python version.

I should mention that the current code rejects zip files which have
an archive comment added to the end.  Accepting them would require
a search, and I am not sure it is worth it.
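
The search in question might be sketched as follows, assuming the usual
zip layout: a 22-byte end-of-central-directory record starting with the
signature PK\x05\x06, followed by a comment of at most 65535 bytes.  The
function find_eocd is hypothetical and is not part of zipfile.py; the
demo uses the zipfile module of a modern Python 3 just to produce a test
archive.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"
EOCD_FMT = "<4s4H2LH"                  # signature ... comment length
EOCD_SIZE = struct.calcsize(EOCD_FMT)  # fixed part of the record
MAX_COMMENT = 0xFFFF                   # comment length is a 16-bit field


def find_eocd(data):
    """Scan backwards for the end-of-central-directory record.

    Returns (offset, comment) or None.  A candidate is accepted only if
    its recorded comment length reaches exactly to the end of the data,
    which guards against the signature appearing inside the comment.
    """
    start = max(0, len(data) - (EOCD_SIZE + MAX_COMMENT))
    pos = data.rfind(EOCD_SIG, start)
    while pos >= 0:
        if pos + EOCD_SIZE <= len(data):
            fields = struct.unpack(EOCD_FMT, data[pos:pos + EOCD_SIZE])
            comment_len = fields[-1]
            if pos + EOCD_SIZE + comment_len == len(data):
                return pos, data[pos + EOCD_SIZE:]
        # Look for an earlier occurrence of the signature.
        pos = data.rfind(EOCD_SIG, start, pos + len(EOCD_SIG) - 1)
    return None


# Build a small archive with a trailing comment and locate its record.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "hello")
    zf.comment = b"archive comment"
print(find_eocd(buf.getvalue()))
```

The scan touches at most about 64K at the tail of the file, so the cost
is bounded regardless of archive size.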

JimA


From fdrake@acm.org  Thu Dec 16 17:19:23 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 16 Dec 1999 12:19:23 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121
In-Reply-To: <Pine.LNX.4.10.9912151910500.16305-100000@nebula.lyra.org>
References: <199912151831.NAA02685@weyr.cnri.reston.va.us>
 <Pine.LNX.4.10.9912151910500.16305-100000@nebula.lyra.org>
Message-ID: <14425.7963.347400.763562@weyr.cnri.reston.va.us>

[Note that Greg's message went to python-checkins since he responded
to a checkin message, but I suspect he meant to change the header to
point to python-dev.  ;)  If not, too bad!]

Greg Stein writes:
 > But this means that your tables no longer reside in "const" space. Yet More
 > Per-Process Memory...
 > 
 > It would be nice to have those tables marked as "const".

  Perhaps; as Guido points out, there haven't been a lot of complaints 
about this issue.
  I will note that only the tables aren't constant; the strings that
are pointed to are still constant.  I'm inclined to let the compiler/
linker care about this, and not change the code without a really clear 
need to do so.
  Here are the sizes of those tables and the strings they point to
(including terminating null bytes for the strings):

pathconf_names:  14 entries, 112 bytes,  176 string bytes
confstr_names:   25 entries, 200 bytes,  576 string bytes
sysconf_names:  108 entries, 864 bytes, 1774 string bytes

  Figures are for Solaris7.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From gstein@lyra.org  Thu Dec 16 18:10:14 1999
From: gstein@lyra.org (Greg Stein)
Date: Thu, 16 Dec 1999 10:10:14 -0800 (PST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121
In-Reply-To: <14425.7963.347400.763562@weyr.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912161006011.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, Fred L. Drake, Jr. wrote:
> [Note that Greg's message went to python-checkins since he responded
> to a checkin message, but I suspect he meant to change the header to
> point to python-dev.  ;)  If not, too bad!]

I didn't really care too much where it went. I would actually suggest that
the Reply-To: on the checkin list is set to python-dev if that is where
replies are Supposed To Go.
[ I do this with mod_dav checkins; replies to dav-checkins mail goes to
  dav-dev. ]

> Greg Stein writes:
>  > But this means that your tables no longer reside in "const" space. Yet More
>  > Per-Process Memory...
>  > 
>  > It would be nice to have those tables marked as "const".
> 
>   Perhaps; as Guido points out, there haven't been a lot of complaints 
> about this issue.
>   I will note that only the tables aren't constant; the strings that
> are pointed to are still constant.  I'm inclined to let the compiler/
> linker care about this, and not change the code without a really clear 
> need to do so.
>   Here are the sizes of those tables and the strings they point to
> (including terminating null bytes for the strings):
> 
> pathconf_names:  14 entries, 112 bytes,  176 string bytes
> confstr_names:   25 entries, 200 bytes,  576 string bytes
> sysconf_names:  108 entries, 864 bytes, 1774 string bytes
> 
>   Figures are for Solaris7.

Ah. I just replied to that. Guess that one went to python-checkins :-)

True, this is a small amount of memory. But they start to add up.
non-const globals also pain me when I start to work on free-threading
stuff (each must be examined to see if synchronization is needed), so
reducing the number there is important. Regarding the memory itself: as I
mentioned in the other note, I just want to ensure that Python's working
set remains low (reasons given in that email).

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From skip@mojam.com (Skip Montanaro)  Thu Dec 16 18:09:11 1999
From: skip@mojam.com (Skip Montanaro)
Date: Thu, 16 Dec 1999 12:09:11 -0600 (CST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121
In-Reply-To: <Pine.LNX.4.10.9912161003010.16305-100000@nebula.lyra.org>
References: <199912161553.KAA08428@eric.cnri.reston.va.us>
 <Pine.LNX.4.10.9912161003010.16305-100000@nebula.lyra.org>
Message-ID: <14425.10951.169751.843764@dolphin.mojam.com>

>>>>> "Greg" == Greg Stein <gstein@lyra.org> writes:

    Greg> On Thu, 16 Dec 1999, Guido van Rossum wrote:
    >> I don't think there's much of a need to worry about this.  Why are
    >> you always bringing up this subject?  No-one else that I know has
    >> ever had this concern...

    Greg> Somebody has to :-)

    Greg> Keeping the working set low is more efficient from a system
    Greg> standpoint. 

Not to mention the not-all-that-occasional-anymore requests to have Python
on various itty-bitty things like Palm Pilots and WinCE devices.  It's one
thing to add size to modules people can live without for many applications,
but I think the posix module and its other platform-specific relations are
fairly heavily used.  (I realize this specific example isn't likely to apply
to PP/WinCE.)

Skip Montanaro | http://www.mojam.com/
skip@mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...


From gstein@lyra.org  Thu Dec 16 18:21:54 1999
From: gstein@lyra.org (Greg Stein)
Date: Thu, 16 Dec 1999 10:21:54 -0800 (PST)
Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released
In-Reply-To: <199912161527.KAA08308@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912161011050.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, Guido van Rossum wrote:
>...
> I realize it's just a rant.  In this case (distutils) your advice is
> correct.  (I usually paraphrase it as "release early, release often".)

True. I prefer that phrase, too, but I used it on JimA earlier in the day
or the previous day. I didn't want to sound like a broken record :-). But
that is why I moved into <rant> mode... it seems like the mindset was
spreading :-) I've railed at AMK for it, too :-), when he was talking
about 0.5.1pre1 or whatever, rather than just releasing 0.5.1 and doing an
0.5.2 if there was a problem.

> However there are other situations, like core Python itself, where
> it's really useful to have stable releases -- if only for those users
> who won't touch anything with "beta" in its name.  I still hear from
> people who haven't upgraded to 1.5.2.

But this doesn't explain why there isn't a 1.5.3b1, 1.5.3b2, etc. Or
1.6.0a1 or whatever (maybe "d" or "r" for dev release, as opposed to
alpha).

There are some people who would like the releases rather than using CVS. Some
people can't even use CVS because of firewall issues. Of course, an
alternative is snapshot-tarballs of the CVS repository. But a snapshot
could *really* be broken; something like 1.6.0d1 says "well, it's a
development release, but I've hit a good point between some changes."

> I wonder if perhaps for those cases (where there's a demand for stable
> releases) some other strategy could be used?  Such as labeling
> releases "stable" after the fact?  Or what Linus seems to do with the
> Linux kernel (even = stable, odd = development; or was it the other
> way around?).

Yes: even are stable (e.g. 1.0, 1.2, 2.0, 2.2). The odd numbers are for
development. Linus is currently working 2.3.x, but declared in the past
couple days that things will be wrapping up to move towards 2.4. Once he
thinks it is ready, he'll start off with 2.4.0pre1, pre2, pre3... At some
point the "pre" suffix will drop and 2.4.0 will be released.

You might have a bit of a problem using that mechanism since the current
stable release is 1.5 :-). Once 1.6 hits the street, then you could start
doing 1.9 releases (dev) and shift to 2.0 once it is "stable".
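
As a toy illustration of the even/odd convention, assuming version
strings of the form "major.minor.patch" (the helper name and the sample
version strings are made up):

```python
def is_stable_series(version):
    """Return True if the minor (second) component of a version is even."""
    parts = version.split(".")
    minor = int(parts[1])
    return minor % 2 == 0


for v in ("2.2.14", "2.3.30", "2.4.0pre1"):
    kind = "stable" if is_stable_series(v) else "development"
    print(v, "->", kind)
```

The check only looks at the series number; a "pre" suffix on the patch
level would still have to be handled separately.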

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From paul@prescod.net  Thu Dec 16 18:02:55 1999
From: paul@prescod.net (Paul Prescod)
Date: Thu, 16 Dec 1999 10:02:55 -0800
Subject: [Python-Dev] Re: [XML-SIG] Developer's Day
References: <199912132354.SAA10101@amarok.cnri.reston.va.us>
 <3856A77C.3A4D9F00@prescod.net>
 <14423.49044.143333.790752@amarok.cnri.reston.va.us>
 <3857CEB0.C29C5F24@prescod.net> <14423.57778.131798.776845@amarok.cnri.reston.va.us>
Message-ID: <3859294F.138FF398@prescod.net>

"Andrew M. Kuchling" wrote:
> 
>     * Python revisions come out slowly, once every year or two.  XML
>     standards have been evolving faster, and we don't want to wait
>     until 1.7 for SAX2, or DOM Level2, or other new revisions.
>     Keeping the modules out of the core lets them be updated at their
>     own pace.  A counterargument is that the XML specs are slowing
>     down -- add namespace support to SAX, and finalize DOM
>     Level 2, and I don't think any other standards are very important
>     to basic XML programming.

I agree with your counterargument. :) Anyhow, isn't there a logical
fallacy in your original argument? Why can't we offer a DOM 3 module or
extension after Python ships with DOM 2? 

>     * We really want a C-based parser to be commonly available.
>     sgmlop is the only reasonable choice for this, because I'd be
>     against including Expat.  To replay some arguments I made against
>     including the zlib library in 1.6, what if a C extension requires
>     a newer version of the library?  Symbol conflicts if you're lucky,
>     hard-to-debug problems if you're not.

I don't understand this issue. Why would a C extension build on sgmlop
which is designed to make XML information available to *Python*
programmers?

>     * We can drop various marginal bits of the CVS tree; the xmlarch
>     support is probably not of very wide interest, for example.

How about "expat", "mac", "pyexpat", "utils", "windows". There is just
too much stuff there! And I daresay that a lot of it has not been
"quality controlled" to the level that we would expect if it were a part
of the real Python library. In other words, there is no single place to
go to get only XML-processing software that works well and works
together.

> I think I'm on the record as saying that Python's major problems now
> aren't language-related, but are with the development environment.
> Language changes (from minor, like 'for i in 1..9', to major, like
> fixing the type/class dichotomy or adding static types) aren't going
> to bring in piles of new users, useful though they might be to
> experienced Pythoneers, large projects, or some other specific
> application.

(irrelevant aside: I agree 100% that making things easier to install
will actually improve newbies' experience more than (e.g.) static type
checking but I do not agree that it is a better "sales tool". Most
people are sold based on the language and its libraries before they
start trying to install extensions.)

> If installing things is a problem, then we need to
> buckle down and finish the distutils.  So, overall, I'd still vote
> against inclusion in 1.6.

So are you saying that Python 2 might have only five packages and
everything else must be downloaded? No httplib, no pickle, no random or
math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec?

When people download Python and go to the library documentation that
impressive array of BUILT-IN-FEATURES is part of what sells them on
Python. Hell, I can download all of that stuff for Scheme but what makes
Python beautiful is that I don't have to download it for Python. It's
just there. But if an XML person comes to Python after hearing us rant
about how great it is for processing XML and all they find is
xmllib...they will be underwhelmed.

> No, it's *got* to reach 1.0.  The point of the package is that it's
> exactly *one* thing to install that gives basic XML tools; you don't
> need to chase down the SAX modules from Lars' page, PyExpat from
> ftp.cwi.nl, sgmlop from pythonware.com, and so forth.  If the
> Distutils made it as easy as:
> 
> python fetchpackage.py SAX PyExpat DOM sgmlop
>    <find PySAX's home site>
>    <download it>
>    <compile & install>
>    etc...
> 
> then much of the need for a single package goes away, but, as you
> point out, that isn't currently the case.

I'm a little lost here. We need xmllib to continue because distutils
doesn't do what we need yet but we don't need to put the stuff in the
Python library because distutils will work well enough soon.

But there is an important issue that distutils will not solve. One of the
beautiful things about the Python library is that everything is at the
same version level. When you install it you know that everything works
together or else it WILL in the next patch level if you report the
incompatibility. When the xml package gets versioned incompatibly with
the Python library you don't have that safe feeling. 

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
Three things never trust in: That's the vendor's final bill
The promises your boss makes, and the customer's good will 
http://www.geezjan.org/humor/computers/threes.html


From akuchlin@mems-exchange.org  Thu Dec 16 18:50:48 1999
From: akuchlin@mems-exchange.org (Andrew M. Kuchling)
Date: Thu, 16 Dec 1999 13:50:48 -0500 (EST)
Subject: [Python-Dev] Re: [XML-SIG] Developer's Day
In-Reply-To: <3859294F.138FF398@prescod.net>
References: <199912132354.SAA10101@amarok.cnri.reston.va.us>
 <3856A77C.3A4D9F00@prescod.net>
 <14423.49044.143333.790752@amarok.cnri.reston.va.us>
 <3857CEB0.C29C5F24@prescod.net>
 <14423.57778.131798.776845@amarok.cnri.reston.va.us>
 <3859294F.138FF398@prescod.net>
Message-ID: <14425.13448.737831.460241@amarok.cnri.reston.va.us>

(Responding to the python-dev related portion of this...)

Paul Prescod writes:
>I don't understand this issue. Why would a C extension build on sgmlop
>which is designed to make XML information available to *Python*
>programmers?

No, no; I'm arguing against shipping with Expat; sgmlop good!
Consider this scenario:

	* Python includes Expat 1.0
	* Some C library (for DAV or whatever) uses Expat 1.1
	* Someone writes a Python interface to this C library and
	  attempts to compile it statically.
	* Two versions of Expat in the same binary; symbol conflicts
	  and core dumps, oh my!

>So are you saying that Python 2 might have only five packages and
>everything else must be downloaded? No httplib, no pickle, no random or
>math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec?

I'm not arguing for dropping existing packages; I'm against adding
many more of them.  Existing library modules can stay where they are.
But I wouldn't mind a minimalist Python too much, if it came with a
script fetch-basic-packages:

python fetch-packages.py httplib
python fetch-packages.py imaplib
 ...  200 more lines ...

>I'm a little lost here. We need xmllib to continue because distutils
>doesn't do what we need yet but we don't need to put the stuff in the
>Python library because distutils will work well enough soon.

Basically, yes.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
And now let us hasten to the station. I have commanded the rain to fall at
exactly one-fifteen and I would hate to get my shoes wet.
    -- Lord Lavender, in SEBASTIAN O #2


From bwarsaw@cnri.reston.va.us (Barry A. Warsaw)  Thu Dec 16 18:50:49 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw)
Date: Thu, 16 Dec 1999 13:50:49 -0500 (EST)
Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released
References: <199912161527.KAA08308@eric.cnri.reston.va.us>
 <Pine.LNX.4.10.9912161011050.16305-100000@nebula.lyra.org>
Message-ID: <14425.13449.954026.960703@anthem.cnri.reston.va.us>

    >> I wonder if perhaps for those cases (where there's a demand for
    >> stable releases) some other strategy could be used?  Such as
    >> labeling releases "stable" after the fact?  Or what Linus seems
    >> to do with the Linux kernel (even = stable, odd = development;
    >> or was it the other way around?).

I really dislike the odd/even distinction for exactly this reason.

-Barry


From guido@CNRI.Reston.VA.US  Thu Dec 16 19:02:16 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 16 Dec 1999 14:02:16 -0500
Subject: [Python-Dev] Batteries Included?
Message-ID: <199912161902.OAA11345@eric.cnri.reston.va.us>

I like the batteries included approach, but I also feel resistance
against including stuff I cannot maintain.  The XML code base is a
case in point; I don't understand enough about XML.  (I just read that
xmllib.py is "illegal".  Jeez!  What happened?  Did Congress pass a
law against it?)

I think it may be time for separate Python distributions, like Linux
-- I can concentrate on the core, and keep it really small; others can
make all-encompassing distributions.

There are currently some drawbacks to this approach: non-core modules
have less status; and the documentation process is fundamentally
different for core and non-core modules.  There's also the version
dependency stuff, but I think resolving that is the responsibility of
the distribution makers.

I think the status problem will be gone once there is a respected
distribution -- then you derive status from being in that
distribution, rather than from being in the core distribution.  (Well,
you would still derive status from being in the core, but it would be
much harder to obtain, since I can set a much higher standard.)

The documentation problem is the one that's left.  I think the doc-sig
may be on its way as we speak to solve this, though.  Fred?

This isn't rocket science.  Red Hat Python?  I'm all for it! :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@mojam.com (Skip Montanaro)  Thu Dec 16 19:05:05 1999
From: skip@mojam.com (Skip Montanaro)
Date: Thu, 16 Dec 1999 13:05:05 -0600 (CST)
Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released
In-Reply-To: <14425.13449.954026.960703@anthem.cnri.reston.va.us>
References: <199912161527.KAA08308@eric.cnri.reston.va.us>
 <Pine.LNX.4.10.9912161011050.16305-100000@nebula.lyra.org>
 <14425.13449.954026.960703@anthem.cnri.reston.va.us>
Message-ID: <14425.14305.907618.978628@dolphin.mojam.com>

    >>> Or what Linus seems to do with the Linux kernel (even = stable, odd
    >>> = development; or was it the other way around?).

    BAW> I really dislike the odd/even distinction for exactly this reason.

Its one saving grace is that it is a uniform format.  There are no
"optional" tokens like "pre", "alpha", "beta", etc for the most part.

To remember which way it is, I find it useful to execute "uname -r", check
the second digit, then look down at my shirt for a pocket protector.  The
two pieces of information together work for me.  I currently get
"2.2.13-4mdk" from uname.  I don't even have a pocket, let alone a pocket
protector, so even numbers must be stable releases...

;-)

Skip Montanaro | http://www.mojam.com/
skip@mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...


From fdrake@acm.org  Thu Dec 16 19:05:22 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 16 Dec 1999 14:05:22 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121
In-Reply-To: <14425.10951.169751.843764@dolphin.mojam.com>
References: <199912161553.KAA08428@eric.cnri.reston.va.us>
 <Pine.LNX.4.10.9912161003010.16305-100000@nebula.lyra.org>
 <14425.10951.169751.843764@dolphin.mojam.com>
Message-ID: <14425.14322.355507.500813@weyr.cnri.reston.va.us>

Skip Montanaro writes:
 > fairly heavily used.  (I realize this specific example isn't likely to apply
 > to PP/WinCE.)

  Or any version of Windows, I suspect; perhaps Mark Hammond can
elaborate.  Apparently none of the pathconf() constants are defined
on that platform, at least not as #define constants.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From jcw@equi4.com  Thu Dec 16 19:09:42 1999
From: jcw@equi4.com (Jean-Claude Wippler)
Date: Thu, 16 Dec 1999 20:09:42 +0100
Subject: [Python-Dev] Re: [XML-SIG] Developer's Day
References: <199912132354.SAA10101@amarok.cnri.reston.va.us>
 <3856A77C.3A4D9F00@prescod.net>
 <14423.49044.143333.790752@amarok.cnri.reston.va.us>
 <3857CEB0.C29C5F24@prescod.net> <14423.57778.131798.776845@amarok.cnri.reston.va.us> <3859294F.138FF398@prescod.net>
Message-ID: <385938F6.C4164756@equi4.com>

Paul Prescod wrote:
[...]
> (irrelevant aside: [...] Most people are sold based on the language
> and its libraries before they start trying to install extensions.)
> 
> [AMK]
> > If installing things is a problem, then we need to
> > buckle down and finish the distutils.  So, overall, I'd still vote
> > against inclusion in 1.6.
> 
> So are you saying that Python 2 might have only five packages and
> everything else must be downloaded? No httplib, no pickle, no random
> or math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec?
> 
> When people download Python and go to the library documentation that
> impressive array of BUILT-IN-FEATURES is part of what sells them on
> Python. Hell, I can download all of that stuff for Scheme but what
> makes Python beautiful is that I don't have to download it for Python.
> It's just there. But if an XML person comes to Python after hearing us
> rant about how great it is for processing XML and all they find is
> xmllib...they will be underwhelmed.

(Nodding in agreement)

Could this perhaps be solved with a large batteries-included standard
distribution, plus a real easy/effective way to strip Python down and
wrap things up for deployment?  

In other words, aim for two very distinct goals: everything within easy
reach for development + fully signed-sealed-delivered products.

The first goal can evolve to do fancy net-borne distribution, even if
it is a brittle process, because this is for Python developers.  They
want it all, so open the floodgate to give it all to them.

The second becomes a matter of pruning down and wrapping up.  All the
way down to a single installation-less executable, if possible.

I may well be wrong (and I'm not tracking distutils), but might it not
be simpler to focus on 1) power users + 2) production-grade deployment,
instead of trying to streamline a tangled-web-of-module-dependencies
into a distribution system which tries to meet a wide range of needs?

> [...] One of the beautiful things about the Python library is that
> everything is at the same version level. When you install it you know
> that everything works together or else it WILL in the next patch level
> if you report the incompatibility.  [...]

More nods.  So why not allow the Python distribution to become very
large - with every release moving to a better-tuned combination of all
the different parts (occasional mishaps can quickly be fixed)?

Plus some tools to dist(ut)il(l) a turnkey solution from this big soup.

Sort-of-from-violin-to-quartet-all-the-way-to-symphony-orchestra...

-- Jean-Claude


From gstein@lyra.org  Thu Dec 16 20:02:46 1999
From: gstein@lyra.org (Greg Stein)
Date: Thu, 16 Dec 1999 12:02:46 -0800 (PST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <38590844.769C3025@interet.com>
Message-ID: <Pine.LNX.4.10.9912161147360.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, James C. Ahlstrom wrote:
> Did anyone look at this yet?
> 
>    ftp://ftp.interet.com/pub/pylib.html
> 
>    ftp://ftp.interet.com/pub/zipfile.py

I went to look for it, but I think that was before you put zipfile up.

Looking at it now...  The writepy() as a method is questionable, I think.
I think it should open the file at instantiation time. I don't see a
reason to allow that to be deferred. Especially given that some of the
methods fail if open() hasn't been called. It would be good to have
symbolic names for the 0 and 8 compression constants, and to fail if 8 is
passed and zlib is not available (otherwise, it doesn't fail until
read/write time, and with a NameError). There should probably be a
__del__ that calls close(). Oh, and a "closed" attribute that can be
checked and an error raised if an operation is done after the file has
been closed. I think dir() should return the contents, rather than print
them. read() and write() ought to fail if the mode is incorrect. Oh, some
symbolic constants for things like "PK\005\006" would be nice.
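
A minimal sketch of how these suggestions could fit together, not the
actual zipfile.py code. The class name ZipSketch, the constant names and
the exception types are all assumptions made for illustration; only the
values 0, 8 and "PK\005\006" come from the discussion above.

```python
try:
    import zlib
except ImportError:          # zlib is an optional part of the build
    zlib = None

# Symbolic names for the compression-method codes 0 and 8.
ZIP_STORED = 0
ZIP_DEFLATED = 8

# Symbolic name for the end-of-central-directory signature.
END_OF_CENTRAL_DIR = b"PK\005\006"


class ZipSketch:
    """Hypothetical skeleton; it only models open/close and mode checks."""

    def __init__(self, file, mode="r", compression=ZIP_STORED):
        self.closed = True   # set first so __del__ is safe if we raise below
        if mode not in ("r", "w"):
            raise ValueError("mode must be 'r' or 'w'")
        if compression not in (ZIP_STORED, ZIP_DEFLATED):
            raise ValueError("unknown compression method: %r" % (compression,))
        if compression == ZIP_DEFLATED and zlib is None:
            # Fail at construction time instead of with a NameError later.
            raise RuntimeError("deflate requested but zlib is not available")
        self.mode = mode
        self.compression = compression
        # The file is opened at instantiation; an open file object also works.
        self.fp = open(file, mode + "b") if isinstance(file, str) else file
        self.closed = False

    def _check(self, needed_mode):
        if self.closed:
            raise ValueError("operation on a closed archive")
        if self.mode != needed_mode:
            raise RuntimeError("archive was opened with mode %r" % self.mode)

    def read(self):
        self._check("r")
        return self.fp.read()

    def write(self, data):
        self._check("w")
        self.fp.write(data)

    def close(self):
        if not self.closed:
            self.closed = True
            self.fp.close()

    def __del__(self):
        self.close()
```

With this shape, passing a bad compression code or calling read() on an
archive opened for writing fails immediately with a descriptive error.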

Do you have a ZipImporter written?

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein@lyra.org  Thu Dec 16 20:12:30 1999
From: gstein@lyra.org (Greg Stein)
Date: Thu, 16 Dec 1999 12:12:30 -0800 (PST)
Subject: [Python-Dev] Re: [XML-SIG] Developer's Day
In-Reply-To: <14425.13448.737831.460241@amarok.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912161210350.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, Andrew M. Kuchling wrote:
> Paul Prescod writes:
> >I don't understand this issue. Why would a C extension build on sgmlop
> >which is designed to make XML information available to *Python*
> >programmers?
> 
> No, no; I'm arguing against shipping with Expat; sgmlop good!
> Consider this scenario:
> 
> 	* Python includes Expat 1.0
> 	* Some C library (for DAV or whatever) uses Expat 1.1
> 	* Someone writes a Python interface to this C library and
> 	  attempts to compile it statically.
> 	* Two versions of Expat in the same binary; symbol conflicts
> 	  and core dumps, oh my!

We should ship pyexpat, not Expat.  (IMO)

> >So are you saying that Python 2 might have only five packages and
> >everything else must be downloaded? No httplib, no pickle, no random or
> >math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec?
> 
> I'm not arguing for dropping existing packages; I'm against adding
> many more of them.  Existing library modules can stay where they are.
> But I wouldn't mind a minimalist Python too much, if it came with a
> script fetch-basic-packages:
> 
> python fetch-packages.py httplib
> python fetch-packages.py imaplib
>  ...  200 more lines ...

Considering that it would probably use HTTP to fetch the packages, I think
you wouldn't be fetching httplib :-)

But yes: I agree with the basic sentiment.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From petrilli@amber.org  Thu Dec 16 20:55:16 1999
From: petrilli@amber.org (Christopher Petrilli)
Date: Thu, 16 Dec 1999 15:55:16 -0500
Subject: [Python-Dev] Batteries Included?
In-Reply-To: <199912161902.OAA11345@eric.cnri.reston.va.us>; from guido@CNRI.Reston.VA.US on Thu, Dec 16, 1999 at 02:02:16PM -0500
References: <199912161902.OAA11345@eric.cnri.reston.va.us>
Message-ID: <19991216155516.A28037@trump.amber.org>

Guido van Rossum [guido@CNRI.Reston.VA.US] wrote:
> I think it may be time for separate Python distributions, like Linux
> -- I can concentrate on the core, and keep it really small; others can
> make all-encompassing distributions.

My fear is what we face in the Zope world---different distributions break
in totally different ways, and sometimes we have to ask 30 questions to figure
out what might be going wrong :/  The nice thing is that if someone installs
Python from the source, we know what's going to happen.  I don't know if
this is solvable, honestly.

> This isn't rocket science.  Red Hat Python?  I'm all for it! :-)

I think Guido just wants to IPO and retire :-)

Chris
-- 
| Christopher Petrilli
| petrilli@amber.org


From gward@cnri.reston.va.us  Thu Dec 16 21:03:26 1999
From: gward@cnri.reston.va.us (Greg Ward)
Date: Thu, 16 Dec 1999 16:03:26 -0500
Subject: [Python-Dev] distutils-sig/python-dev crosstalk
Message-ID: <19991216160325.H4289@cnri.reston.va.us>

Most recent threads on distutils-sig seem to have migrated to python-dev
pretty quickly.  This means that a) there are python-dev people on
distutils-sig (duh), b) they think what goes on there is important
enough to interest the other core developers (good!), and c) they assume
there are people on python-dev who are not also on distutils-sig.

Is this last assumption true?  If you read python-dev, are interested in
distutils issues, but do *not* read distutils-sig, please drop me a
note.  If no one says anything, I will (politely, tentatively) propose
that we keep the distutils threads on distutils-sig and leave python-dev
for, well, core Python development.

If you think that the two are inextricably linked and I might as well
just cross-post everything on distutils-sig to python-dev, let me know
about that too.  ;-)

        Greg
-- 
Greg Ward - software developer                    gward@cnri.reston.va.us
Corporation for National Research Initiatives    
1895 Preston White Drive                           voice: +1-703-620-8990
Reston, Virginia, USA  20191-5434                    fax: +1-703-620-0913


From gstein@lyra.org  Thu Dec 16 21:18:50 1999
From: gstein@lyra.org (Greg Stein)
Date: Thu, 16 Dec 1999 13:18:50 -0800 (PST)
Subject: [Python-Dev] distutils-sig/python-dev crosstalk
In-Reply-To: <19991216160325.H4289@cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912161316580.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, Greg Ward wrote:
>...
> If you think that the two are inextricably linked and I might as well
> just cross-post everything on distutils-sig to python-dev, let me know
> about that too.  ;-)

:-)  I think distutils is about the mechanics. And it is a large and
sophisticated problem (which is why it has a SIG :-). You could almost view
it as a spinoff of the python-dev grand problem set.

When we get into the question of "what does Python ship with?", then I
think it belongs in python-dev, as that is a discussion of what
constitutes Python itself.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein@lyra.org  Thu Dec 16 21:21:12 1999
From: gstein@lyra.org (Greg Stein)
Date: Thu, 16 Dec 1999 13:21:12 -0800 (PST)
Subject: [Python-Dev] distutils-sig/python-dev crosstalk
In-Reply-To: <19991216160325.H4289@cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912161318550.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, Greg Ward wrote:
> Most recent threads on distutils-sig seem to have migrated to python-dev
> pretty quickly.  This means that a) there are python-dev people on
> distutils-sig (duh), b) they think what goes on there is important
> enough to interest the other core developers (good!), and c) they assume
> there are people on python-dev who are not also on distutils-sig.

Oh. One more thing.

Actually, what I am somewhat worried about is whether there was relevant
discussion on python-dev that should have been visible to the distutils
people. Not sure if there was, but that is always a potential problem.
Same with the recent xml-sig / python-dev crosstalk. Specifically, Paul
Prescod is not on python-dev, so he may have missed a response or two.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From mal@lemburg.com  Thu Dec 16 21:23:30 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 16 Dec 1999 22:23:30 +0100
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com>
Message-ID: <38595852.E8054741@lemburg.com>

"James C. Ahlstrom" wrote:
> 
> "James C. Ahlstrom" wrote:
> 
> >    ftp://ftp.interet.com/pub/pylib.html
> 
> I just changed zipfile.py so that regular zip compression
> works.  And if zlib is available,
> its crc32() is used instead of the Python version.
> 
> I should mention that the current code rejects zip files which have
> an archive comment added to the end.  Accepting them would require
> a search, and I am not sure it is worth it.

I don't think it is needed for our purposes, but maybe a
subclass could provide it ?
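
For reference, the search in question could look roughly like the
sketch below. It assumes the standard ZIP layout, in which the end
record is 22 bytes long and is followed by a comment whose length is
stored in the record's last two bytes. The function name
find_end_record is hypothetical and is not part of zipfile.py.

```python
import struct

EOCD_SIGNATURE = b"PK\005\006"   # marks the end-of-central-directory record
EOCD_FIXED_SIZE = 22             # record size without the trailing comment
MAX_COMMENT = 0xFFFF             # the comment length is a 16-bit field


def find_end_record(data):
    """Return (offset, comment) for the end record in data, or None."""
    # The record cannot start earlier than 22 + 65535 bytes from the end.
    start = max(0, len(data) - EOCD_FIXED_SIZE - MAX_COMMENT)
    pos = data.rfind(EOCD_SIGNATURE, start)
    while pos >= start:
        if pos + EOCD_FIXED_SIZE <= len(data):
            (comment_len,) = struct.unpack("<H", data[pos + 20:pos + 22])
            # Accept only a candidate whose comment runs exactly to the end.
            if pos + EOCD_FIXED_SIZE + comment_len == len(data):
                return pos, data[pos + EOCD_FIXED_SIZE:]
        # Otherwise keep scanning backwards for an earlier signature.
        pos = data.rfind(EOCD_SIGNATURE, start, pos)
    return None
```

An archive without a comment is just the special case where the record
sits exactly 22 bytes before the end of the file.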

FYI, I've tested the module against mxStack-0.3.0.zip which 
you can find on my Python Pages. It was created using Info-ZIP's
zip 2.2 on Linux.

Unfortunately, I always get the following traceback when trying
to print the directory:

>>> z.open('../projects/distribution/mxStack-0.3.0.zip','rb')
>>> z.dir()
File Name                             Modified             Size
Stack/mxStack/mxStack.h        1999-04-16 10:50:06         4368
Stack/mxStack/mxstdlib.h       1999-04-13 15:37:52         5433
Traceback (innermost last):
  File "<stdin>", line 1, in ?
  File "/home/lemburg/lib/zipfile.py", line 120, in dir
    bytes = self.read(name)     # Just to check CRC-32
  File "/home/lemburg/lib/zipfile.py", line 133, in read
    bytes = zlib.decompress(bytes, -15)
zlib.error: Error -5 while decompressing data

Some notes on the API:
----------------------
* I would find it more convenient if the filename and mode
would be constructor parameters, e.g.

	zfile = zipfile('myfile.zip','rb')

with compression defaulting to 8 rather than 0 (most zip files
will be deflated since this is the ZIP default).

* Also, I would like a method much like the os.listdir()
which returns a list of filenames rather than print it
to stdout.

* .is_zipfile() should probably be a separate function: it
doesn't use any of the class' features.

More wishes to come ;-)

So far: Great Work !

Aside: I found that you are using undocumented arguments to
zlib.compressobj() ... are these extra arguments left out of
the documentation on purpose or by simple oversight ? I couldn't
find them in the HTML docs or in the docstrings.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    15 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From gstein@lyra.org  Thu Dec 16 21:32:09 1999
From: gstein@lyra.org (Greg Stein)
Date: Thu, 16 Dec 1999 13:32:09 -0800 (PST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <38595852.E8054741@lemburg.com>
Message-ID: <Pine.LNX.4.10.9912161330570.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, M.-A. Lemburg wrote:
>...
> Some notes on the API:
> ----------------------
> * I would find it more convenient if the filename and mode
> would be constructor parameters, e.g.
> 
> 	zfile = zipfile('myfile.zip','rb')
> 
> with compression defaulting to 8 rather than 0 (most zip files
> will be deflated since this is the ZIP default).
> 
> * Also, I would like a method much like the os.listdir()
> which returns a list of filenames rather than print it
> to stdout.

The above two items were in my ramble, just not as clear as MAL :-)

> * .is_zipfile() should probably be a separate function: it
> doesn't use any of the class' features.

Ah! Good call. It is even more important to shift it out if the
constructor now opens a file.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From fdrake@acm.org  Thu Dec 16 21:33:36 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 16 Dec 1999 16:33:36 -0500 (EST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <38595852.E8054741@lemburg.com>
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org>
 <3857B97E.3684224F@interet.com>
 <38590844.769C3025@interet.com>
 <38591E65.4885A39D@interet.com>
 <38595852.E8054741@lemburg.com>
Message-ID: <14425.23216.636687.704436@weyr.cnri.reston.va.us>

M.-A. Lemburg writes:
 > Aside: I found that you are using undocumented arguments to
 > zlib.compressobj() ... are these extra arguments left out of
 > the documentation on purpose or by simple oversight ? I couldn't
 > find them in the HTML docs or in the docstrings.

  The documentation is way out of date and Jeremy Hylton and Andrew
Kuchling haven't updated it.  I'm not sure which of them changed the
signatures for that module, but I've pestered Jeremy about it a few
times.
  If anyone would like to update the documentation, I'd certainly
appreciate it.  I don't know the details of those interfaces, and this 
is somewhere where the details are pretty critical.
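
  As a rough illustration only, and assuming the arguments in question
are the method and window-size parameters, the sketch below shows the
one detail that matters for zip files: a negative window size (-15)
selects a raw deflate stream with no zlib header or trailing checksum.
The payload is made up for the example.

```python
import zlib

payload = b"spam and eggs " * 20

# compressobj(level, method, wbits): wbits=-15 produces raw deflate data.
compressor = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -15)
raw = compressor.compress(payload) + compressor.flush()

# Decompression has to be told about the missing header in the same way.
restored = zlib.decompress(raw, -15)

# ZIP stores a CRC-32 of the uncompressed data alongside each member.
checksum = zlib.crc32(payload) & 0xFFFFFFFF
```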


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From bwarsaw@python.org  Thu Dec 16 23:10:11 1999
From: bwarsaw@python.org (Barry A. Warsaw)
Date: Thu, 16 Dec 1999 18:10:11 -0500 (EST)
Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released
References: <199912161527.KAA08308@eric.cnri.reston.va.us>
 <Pine.LNX.4.10.9912161011050.16305-100000@nebula.lyra.org>
 <14425.13449.954026.960703@anthem.cnri.reston.va.us>
 <14425.14305.907618.978628@dolphin.mojam.com>
Message-ID: <14425.29011.429867.485070@anthem.cnri.reston.va.us>

>>>>> "SM" == Skip Montanaro <skip@mojam.com> writes:

    SM> To remember which way it is, I find it useful to execute
    SM> "uname -r", check the second digit, then look down at my shirt
    SM> for a pocket protector.  The two pieces of information
    SM> together work for me.  I currently get "2.2.13-4mdk" from
    SM> uname.  I don't even have a pocket, let alone a pocket
    SM> protector, so even numbers must be stable releases...

What do you do if it's the second Thursday after the full moon, and
the local hockey team has just skated to a 3-3 tie?

-Barry


From mal@lemburg.com  Thu Dec 16 21:53:36 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 16 Dec 1999 22:53:36 +0100
Subject: [Python-Dev] Batteries Included?
References: <199912161902.OAA11345@eric.cnri.reston.va.us>
Message-ID: <38595F60.7C1B34FF@lemburg.com>

Guido van Rossum wrote:
> 
> I like the batteries included approach, but I also feel resistance
> against including stuff I cannot maintain. 
> ...
> This isn't rocket science.  Red Hat Python?  I'm all for it! :-)

I think we should wait for distutils to get up and running
perfectly for everyone before taking such a step.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    15 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From gstein@lyra.org  Fri Dec 17 08:31:38 1999
From: gstein@lyra.org (Greg Stein)
Date: Fri, 17 Dec 1999 00:31:38 -0800 (PST)
Subject: [Python-Dev] Batteries Included?
In-Reply-To: <38595F60.7C1B34FF@lemburg.com>
Message-ID: <Pine.LNX.4.10.9912170027530.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, M.-A. Lemburg wrote:
> Guido van Rossum wrote:
> > I like the batteries included approach, but I also feel resistance
> > against including stuff I cannot maintain. 

This is an interesting comment, and is similar to the Apache sentiment.
Nothing gets added to the standard distribution unless somebody in the
Group is willing to maintain it. It provides a good mechanism for keeping
the module set to a reasonable size and a set that can/will actually be
maintained.

> > ...
> > This isn't rocket science.  Red Hat Python?  I'm all for it! :-)
> 
> I think we should wait for distutils to get up and running
> perfectly for everyone before taking such a step.

You can also operate on the assumption that it will be done by the time
1.6 is ready to be released. In other words: do the work (distutils and
minimizing the release) in parallel, rather than in sequence.

I would also think that a large distro isn't going to be assembled with
distutils. Somebody will sit down, pull all the components together, and
make a big release.

However, I do see the distutils as being needed for the people who grab
the minimal distro. They need it to grab add'l packages.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From fredrik@pythonware.com  Fri Dec 17 09:06:20 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 17 Dec 1999 10:06:20 +0100
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912161147360.16305-100000@nebula.lyra.org>
Message-ID: <022e01bf486d$fcc901d0$f29b12c2@secret.pythonware.com>

James C. Ahlstrom wrote:
> > Did anyone look at this yet?
> > 
> >    ftp://ftp.interet.com/pub/pylib.html
> > 
> >    ftp://ftp.interet.com/pub/zipfile.py
> 
> I went to look for it, but I think that was before you put zipfile up.

just a few comments (from reading the docs):

-- it would be great if "open" could take an open file
object as well as a file name.

(in this case, you also need to document what you
expect from the underlying file object: read, write,
seek, tell should be enough, right?  haven't looked
at the code -- assuming it works, I'm only interested
in the interface)

-- or you could nuke "open" and pass those arguments
to the constructor instead.

-- I assume "open" adds "b" to the given mode argument.

-- "dir" looks a bit strange.  and hey, there's no "listdir"
in there.  I'd prefer a recursive "listdir" method, which
takes an optional "depth" argument (e.g. 0=this dir,
1=this dir and first subdir, None=infinity, i.e. the full
tree).
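
a rough sketch of the depth idea, written as a plain function over a
list of member names rather than as a method.  the name "listdir" and
its signature are only an assumption about how this could look:

```python
def listdir(names, depth=None):
    """Return the member names that lie at most depth directories deep.

    depth=0 keeps top-level entries only, depth=1 also keeps entries in
    first-level subdirectories, and depth=None keeps the full tree.
    """
    if depth is None:
        return list(names)
    # A name's nesting level is the number of "/" separators it contains;
    # a trailing "/" on a directory entry does not count as a level.
    return [name for name in names if name.rstrip("/").count("/") <= depth]


members = ["README", "Stack/setup.py", "Stack/mxStack/mxStack.h"]
top_level = listdir(members, 0)
one_level = listdir(members, 1)
full_tree = listdir(members)
```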

that's all for now.

</F>


From fredrik@pythonware.com  Fri Dec 17 12:21:03 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 17 Dec 1999 13:21:03 +0100
Subject: [Python-Dev] posix module
References: <14423.61493.90107.433664@weyr.cnri.reston.va.us>
Message-ID: <036d01bf4889$30b63470$f29b12c2@secret.pythonware.com>

> Ok, I think I'm done with the posix module updates, modulo bugs and
> additional symbols for the *conf*() tables.

gcc  -g -O2 -I./../Include -I.. -DHAVE_CONFIG_H -c ./posixmodule.c
./posixmodule.c:3789: `_SC_AIO_LIST_MAX' undeclared here (not in a function)
./posixmodule.c:3789: initializer element for `posix_constants_sysconf[10].value' is not constant
make[1]: *** [posixmodule.o] Error 1
make[1]: Leaving directory `/data/repository/BleedingEdge/python/dist/src/Modules'

(current CVS stuff, on Red Hat 5.2)
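
the failure comes from a sysconf name that the headers on this box do
not define.  the same variation is visible from Python code, so callers
may want to probe for a name before using it.  a minimal sketch (the
helper name "sysconf_or_none" is made up for this example):

```python
import os


def sysconf_or_none(name):
    """Return os.sysconf(name), or None if the platform lacks the name."""
    # os.sysconf_names is missing entirely on non-POSIX builds.
    names = getattr(os, "sysconf_names", {})
    if name not in names:
        return None
    try:
        return os.sysconf(name)
    except (OSError, ValueError):
        # The name is known to Python but not supported by this system.
        return None
```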

</F>


From jim@interet.com  Fri Dec 17 14:33:31 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Fri, 17 Dec 1999 09:33:31 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912161147360.16305-100000@nebula.lyra.org>
Message-ID: <385A49BB.4D064240@interet.com>

Greg Stein wrote:
> 
> On Thu, 16 Dec 1999, James C. Ahlstrom wrote:
> > Did anyone look at this yet?
> >
> >    ftp://ftp.interet.com/pub/pylib.html
> >
> >    ftp://ftp.interet.com/pub/zipfile.py
> 
> Looking at it now...  The writepy() as a method is questionable, I think.
> I think it should open the file at instantiation time. I don't see a
> reason to allow that to be deferred. Especially given that some of the
> methods fail if open() hasn't been called.

I eliminated open and added its args to the constructor.

> It would be good to have
> symbolic names for the 0 and 8 compression constants, and to fail if 8 is
> passed and zlib is not available (otherwise, it doesn't fail until
> read/write time, and with a NameError). There should probably be a
> __del__ that calls close(). Oh, and a "closed" attribute that can be
> checked and an error raised if an operation is done after the file has
> been closed.

All done.

> I think dir() should return the contents, rather than print
> them.

I added listdir() and documented self.TOC.  I kept printdir()
as example code.

> read() and write() ought to fail if the mode is incorrect. Oh, some
> symbolic constants for things like "PK\005\006" would be nice.

All done.

JimA


From guido@CNRI.Reston.VA.US  Fri Dec 17 14:43:23 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 17 Dec 1999 09:43:23 -0500
Subject: [Python-Dev] Batteries Included?
In-Reply-To: Your message of "Thu, 16 Dec 1999 22:53:36 +0100."
 <38595F60.7C1B34FF@lemburg.com>
References: <199912161902.OAA11345@eric.cnri.reston.va.us>
 <38595F60.7C1B34FF@lemburg.com>
Message-ID: <199912171443.JAA12414@eric.cnri.reston.va.us>

> Guido van Rossum wrote:
> > 
> > I like the batteries included approach, but I also feel resistance
> > against including stuff I cannot maintain. 
> > ...
> > This isn't rocket science.  Red Hat Python?  I'm all for it! :-)

MAL:
> I think we should wait for distutils to get up and running
> perfectly for everyone before taking such a step.

Fair enough -- but in the mean time, no more pushing for new modules
in the core distribution (distutils excluded).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gward@cnri.reston.va.us  Fri Dec 17 14:59:09 1999
From: gward@cnri.reston.va.us (Greg Ward)
Date: Fri, 17 Dec 1999 09:59:09 -0500
Subject: [Python-Dev] Batteries Included?
In-Reply-To: <199912171443.JAA12414@eric.cnri.reston.va.us>; from guido@cnri.reston.va.us on Fri, Dec 17, 1999 at 09:43:23AM -0500
References: <199912161902.OAA11345@eric.cnri.reston.va.us> <38595F60.7C1B34FF@lemburg.com> <199912171443.JAA12414@eric.cnri.reston.va.us>
Message-ID: <19991217095908.B8799@cnri.reston.va.us>

On 17 December 1999, Guido van Rossum said:
> Fair enough -- but in the mean time, no more pushing for new modules
> in the core distribution (distutils excluded).

So anyone who wants a new module snuck into the core just has to
convince me to add it to the distutils package, right?  >snicker<

        Greg


From jeremy@cnri.reston.va.us  Fri Dec 17 18:30:37 1999
From: jeremy@cnri.reston.va.us (Jeremy Hylton)
Date: Fri, 17 Dec 1999 13:30:37 -0500 (EST)
Subject: [Python-Dev] Batteries Included?
In-Reply-To: <199912171443.JAA12414@eric.cnri.reston.va.us>
References: <199912161902.OAA11345@eric.cnri.reston.va.us>
 <38595F60.7C1B34FF@lemburg.com>
 <199912171443.JAA12414@eric.cnri.reston.va.us>
Message-ID: <14426.33101.757523.853781@goon.cnri.reston.va.us>

>>>>> "GvR" == Guido van Rossum <guido@CNRI.Reston.VA.US> writes:

  >> Guido van Rossum wrote:  I like the batteries included
  >> approach, but I also feel resistance  against including stuff I
  >> cannot maintain.   ...   This isn't rocket science.  Red Hat
  >> Python?  I'm all for it! :-)

  >> MAL wrote:
  >> I think we should wait for distutils to get up and running
  >> perfectly for everyone before taking such a step.

  GvR> Fair enough -- but in the mean time, no more pushing for new
  GvR> modules in the core distribution (distutils excluded).

Perhaps the right long-term solution (post-distutils) is to split
Python into a core architected by Guido and a bazaar-style standard
library maintained in a more Apache-like style.

Jeremy


From jim@interet.com  Fri Dec 17 15:25:10 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Fri, 17 Dec 1999 10:25:10 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com>
Message-ID: <385A55D6.A8A05EB9@interet.com>

"M.-A. Lemburg" wrote:

> Unfortunately, I always get the following traceback when trying
> to print the directory:

OK, I changed the decompress code (10:23 AM), please re-try.

> with compression defaulting to 8 rather than 0 (most zip files
> will be deflated since this is the ZIP default).

The compress mode only applies to writing.  On read, the
method recorded in the file controls.
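
The same rule can be seen in the zipfile module that ships with
present-day Python, which is used here only as an illustration and may
differ from the draft under discussion. The member names and contents
are made up for the example.

```python
import io
import zipfile

buf = io.BytesIO()

# On write, the archive default is "stored"; one member overrides it.
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("plain.txt", b"a" * 1000)
    zf.writestr("packed.txt", b"a" * 1000, compress_type=zipfile.ZIP_DEFLATED)

# On read, no compression argument is given; each member's header decides.
with zipfile.ZipFile(buf) as zf:
    methods = {info.filename: info.compress_type for info in zf.infolist()}
    contents = {name: zf.read(name) for name in zf.namelist()}
```

Here methods maps "plain.txt" to method 0 and "packed.txt" to method 8,
and both members read back identically.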

JimA


From jim@interet.com  Fri Dec 17 14:49:20 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Fri, 17 Dec 1999 09:49:20 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912161147360.16305-100000@nebula.lyra.org> <022e01bf486d$fcc901d0$f29b12c2@secret.pythonware.com>
Message-ID: <385A4D70.A162C584@interet.com>

Fredrik Lundh wrote:
> 
> James C. Ahlstrom wrote:
> > >
> > >    ftp://ftp.interet.com/pub/pylib.html

> -- it would be great if "open" could take an open file
> object as well as a file name.

I put these arguments into the constructor now.

> (in this case, you also need to document what you
> expect from the underlying file object: read, write,
> seek, tell should be enough, right?  haven't looked
> at the code -- assuming it works, I'm only interested
> in the interface)

OK, docs updated.

> -- I assume "open" adds "b" to the given mode argument.

Correct.  The mode can be either "w" or "wb" etc., and it works.

> -- "dir" looks a bit strange.  and hey, there's no "listdir"
> in there.  I'd prefer a recursive "listdir" method, which
> takes an optional "depth" argument (e.g. 0=this dir,
> 1=this dir and first subdir, None=infinity, i.e. the full
> tree).

I added a plain listdir() and changed dir() to printdir().  I also
documented self.TOC which gets you the values too.

JimA


From jim@interet.com  Fri Dec 17 14:39:51 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Fri, 17 Dec 1999 09:39:51 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com>
Message-ID: <385A4B37.333B9443@interet.com>

"M.-A. Lemburg" wrote:
> 
> "James C. Ahlstrom" wrote:
> > >    ftp://ftp.interet.com/pub/pylib.html
> >

> Unfortunately, I always get the following traceback when trying
> to print the directory:

Yes, compression isn't there yet.  I am looking into it.
 
> Some notes on the API:
> ----------------------
> * I would find it more convenient if the filename and mode
> would be constructor parameters, e.g.
> 
>         zfile = zipfile('myfile.zip','rb')

OK, done.
 
> with compression defaulting to 8 rather than 0 (most zip files
> will be deflated since this is the ZIP default).

Until compression works, and zlib ships with Python I
would rather default to no compression (method 0).  Otherwise
this is not useful as a Python import archive.
 
> * Also, I would like a method much like the os.listdir()
> which returns a list of filenames rather than print it
> to stdout.

OK, done.
 
> * .is_zipfile() should probably be a separate function: it
> doesn't use any of the class' features.

OK, done.
  
> Aside: I found that you are using undocumented arguments to
> zlib.compressobj() ... are these extra arguments left out of
> the documentation on purpose or by simple oversight ? I couldn't
> find them in the HTML docs and neither in the docstrings.

I am following the CNRI code blindly here.  I don't have
docs either.

JimA


From jack@oratrix.nl  Fri Dec 17 22:54:03 1999
From: jack@oratrix.nl (Jack Jansen)
Date: Fri, 17 Dec 1999 23:54:03 +0100
Subject: [Python-Dev] Batteries Included?
In-Reply-To: Message by Jeremy Hylton <jeremy@cnri.reston.va.us> ,
 Fri, 17 Dec 1999 13:30:37 -0500 (EST) , <14426.33101.757523.853781@goon.cnri.reston.va.us>
Message-ID: <19991217225408.D9AF3EDD20@oratrix.oratrix.nl>

Recently, Jeremy Hylton <jeremy@cnri.reston.va.us> said:
> Perhaps the right long-term solution (post-distutils) is to split
> Python into a core architected by Guido and a bazaar-style standard
> library maintained in a more apache-style.

I can't help feeling uncomfortable with this. I've had quite some work 
to get an Apache with SSL up and running, even though someone gave me
quite precise instructions. With Perl I fared even worse, despite
their distutils-like package, when I wanted to try a PalmPilot package 
for Unix that needed Perl. I finally had to give up after quite some
effort because the addon installers kept finding the older version of
Perl that the system mgr had installed instead of my newer version.

I think distutils will be wonderful for us, the Python community, but
something more RedHattish is needed for the general world who just want 
Python plus a certain set of extensions because some application needs 
it, so they can just download a fresh copy of ParrotPython 3.4.4 and
know the application will work, without interfering with another
application that happens to use Inquisition 1a5 and lives elsewhere on 
the disk.

And maybe the answer is a much simpler freezing process, like
MacPython BuildApplication where any Python user can drop a script on
it and end up with a fully self-contained app guaranteed (well.... No
reports to the contrary have been heard so far, at least:-) to contain
everything needed and not interfere with an existing MacPython
installation (or be interfered with by it). Then a popular app will
have prebuilt binaries available for all platforms quickly, made by
the Python community, and the enduser interested in the app but not in 
Python can simply download that.

--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From mal@lemburg.com  Sat Dec 18 13:17:52 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 18 Dec 1999 14:17:52 +0100
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A4B37.333B9443@interet.com>
Message-ID: <385B8980.11CDE9AC@lemburg.com>

"James C. Ahlstrom" wrote:
> 
> "M.-A. Lemburg" wrote:
> >
> > "James C. Ahlstrom" wrote:
> > > >    ftp://ftp.interet.com/pub/pylib.html
> > >
> 
> > Unfortunately, I always get the following traceback when trying
> > to print the directory:
> 
> Yes, compression isn't there yet.  I am looking into it.

Great :-)
 
> > Some notes on the API:
> > ----------------------
> > * I would find it more convenient if the filename and mode
> > would be constructor parameters, e.g.
> >
> >         zfile = zipfile('myfile.zip','rb')
> 
> OK, done.
> 
> > with compression defaulting to 8 rather than 0 (most zip files
> > will be deflated since this is the ZIP default).
> 
> Until compression works, and zlib ships with Python I
> would rather default to no compression (method 0).  Otherwise
> this is not useful as a Python import archive.

Point taken.

Perhaps it would be even better to not have a
default at all: that way people will have to think about the
issue *before* implementing it, rather than debug code
that produces tracebacks.

> > * Also, I would like a method much like the os.listdir()
> > which returns a list of filenames rather than print it
> > to stdout.
> 
> OK, done.
> 
> > * .is_zipfile() should probably be a separate function: it
> > doesn't use any of the class' features.
> 
> OK, done.

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    13 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Sat Dec 18 15:16:44 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 18 Dec 1999 16:16:44 +0100
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com>
Message-ID: <385BA55C.9DFCA88D@lemburg.com>

"James C. Ahlstrom" wrote:
> 
> "M.-A. Lemburg" wrote:
> 
> > Unfortunately, I always get the following traceback when trying
> > to print the directory:
> 
> OK, I changed the decompress code (10:23 AM), please re-try.

Everything is fine now... it's really impressive how easily
you can manipulate ZIP files with it.

One thing I'd suggest is to include some way to delete and
update contents, e.g. the write() method should overwrite
any existing entry in the archive (if it does not already --
I haven't tested it, just read the code and it seems to raise
an exception), plus maybe a .remove() method which deletes
an entry.
 
> > with compression defaulting to 8 rather than 0 (most zip files
> > will be deflated since this is the ZIP default).
> 
> The compress mode only applies to writing.  On read, the
> method recorded in the file controls.

True. How about making the compression argument mandatory
for files opened in 'wb' mode only ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    13 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From da@ski.org  Sat Dec 18 17:35:00 1999
From: da@ski.org (David Ascher)
Date: Sat, 18 Dec 1999 09:35:00 -0800
Subject: [Python-Dev] Year 2000 O'Reilly Python Conference
Message-ID: <003501bf497e$368f6f60$e655cfc0@ski.org>

I just got off the phone with someone at O'Reilly, who is starting to plan
the next O'Reilly Open Source Convention.  I've agreed to be the chair of
the Python conference, just so that there are no delays in getting the
conference organized.  If someone feels that I should not be chair, speak
now and we can figure out who takes the 'job'.

There are short-term and long-term issues to discuss:

Short term:

- We need a program committee -- If you're interested in being on said
committee or know someone who should be, let me know. I'd like to get
representatives from various subconstituencies on there (web types, zope
types, business types, scientist types, linux types, hackers, etc.)

- The call for papers is going on the O'Reilly website soon.  I will try and
get them to pass things by me first, but if we want to emphasize specific
kinds of paper submissions, we need to decide that soon.

- Greg or Barry, is it possible for one of you to setup a mailman mailing
list which will be used by the program committee?  eGroups is easy for me to
setup, but lots of people hated it last year.  I don't want to pollute
python-dev with conference discussions.

Longer term:

- The schedule for the conference is (supposedly) going to be the same as
last year: conference-wide keynotes at the beginning of both days, and
4 x 90-minute segments.

- We have two parallel tracks

- We have 4 half-day tutorial slots

- All of the paper materials have to be 'in' by March 1.  We need to decide
how much time we need to go through the review/revision process ourselves.
In other words, the deadline for submissions is up to us, but we don't have
that much time.

--david ascher


From jeremy@cnri.reston.va.us  Sat Dec 18 22:39:58 1999
From: jeremy@cnri.reston.va.us (Jeremy Hylton)
Date: Sat, 18 Dec 1999 17:39:58 -0500 (EST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <385A4B37.333B9443@interet.com>
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org>
 <3857B97E.3684224F@interet.com>
 <38590844.769C3025@interet.com>
 <38591E65.4885A39D@interet.com>
 <38595852.E8054741@lemburg.com>
 <385A4B37.333B9443@interet.com>
Message-ID: <14428.3390.671438.663889@bitdiddle.cnri.reston.va.us>

>>>>> "JCA" == James C Ahlstrom <jim@interet.com> writes:

  >> Aside: I found that you are using undocumented arguments to
  >> zlib.compressobj() ... are these extra arguments left out of the
  >> documentation on purpose or by simple oversight ? I couldn't find
  >> them in the HTML docs and neither in the docstrings.

  JCA> I am following the CNRI code blindly here.  I don't have docs
  JCA> either.

The docs for the zlib module are quite out of date, although I think
the docstrings may be better (not necessarily completely up-to-date
though :-).  The specific parameters to pass to zlib don't seem to be
documented anywhere either; IIRC I dug them out of some example C code
somewhere that used zlib to read Zip files.

Jeremy


From gstein@lyra.org  Sat Dec 18 23:14:02 1999
From: gstein@lyra.org (Greg Stein)
Date: Sat, 18 Dec 1999 15:14:02 -0800 (PST)
Subject: [Python-Dev] Year 2000 O'Reilly Python Conference
In-Reply-To: <003501bf497e$368f6f60$e655cfc0@ski.org>
Message-ID: <Pine.LNX.4.10.9912181513020.16305-100000@nebula.lyra.org>

On Sat, 18 Dec 1999, David Ascher wrote:
>...
> - Greg or Barry, is it possible for one of you to setup a mailman mailing
> list which will be used by the program committee?  eGroups is easy for me to
> setup, but lots of people hated it last year.  I don't want to pollute
> python-dev with conference discussions.

Done. ora-pc@pythonpros.com.
http://mailman.pythonpros.com/mailman/listinfo/ora-pc

I also removed the old monterey-speakers mailing list :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From da@ski.org  Sun Dec 19 07:24:51 1999
From: da@ski.org (David Ascher)
Date: Sat, 18 Dec 1999 23:24:51 -0800
Subject: [Python-Dev] Year 2000 O'Reilly Python Conference
References: <Pine.LNX.4.10.9912181513020.16305-100000@nebula.lyra.org>
Message-ID: <013301bf49f2$243946f0$df55cfc0@ski.org>

From: Greg Stein <gstein@lyra.org>
> On Sat, 18 Dec 1999, David Ascher wrote:
> >...
> > - Greg or Barry, is it possible for one of you to setup a mailman mailing
> > list which will be used by the program committee?

> Done. ora-pc@pythonpros.com.
> http://mailman.pythonpros.com/mailman/listinfo/ora-pc

Thanks, Greg.

Now, folks, please consider joining the program committee.  We need a few
volunteers - not too many, but somewhere between 5 and 10 would be good.
You don't even have to commit to making it to the conference, if that's a
concern.

-- david


From jim@interet.com  Mon Dec 20 14:18:17 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Mon, 20 Dec 1999 09:18:17 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912161147360.16305-100000@nebula.lyra.org>
Message-ID: <385E3AA9.162BE568@interet.com>

Greg Stein wrote:

> Do you have a ZipImporter written?

Yes, it is ftp://ftp.interet.com/pub/importer.py

JimA


From jim@interet.com  Mon Dec 20 14:35:58 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Mon, 20 Dec 1999 09:35:58 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com>
Message-ID: <385E3ECE.F8DCDE28@interet.com>

"M.-A. Lemburg" wrote:

> One thing I'd suggest is to include some way to delete and
> update contents, e.g. the write() method should overwrite
> any existing entry in the archive (if it does not already --
> I haven't tested it, just read the code and it seems to raise
> an exception), plus maybe a .remove() method which deletes
> an entry.

Currently, adding a file requires the "a" append mode, while
the "w" mode re-writes the file.  Adding a duplicate file name
produces an error message.  I can change this,
but removing a file would either waste space, or else the file
contents must be copied over the old file and all the offsets
updated.  I don't like this because it is complicated, and I think
it is fast enough to just re-write the archive.  But it
could be added if people want.
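
The "just re-write the archive" approach to removing an entry can be sketched as follows, using the modern standard-library zipfile module. The helper name rewrite_without and the member names are hypothetical, chosen only for this illustration.

```python
import io
import zipfile


def rewrite_without(source, name_to_drop):
    """Return a new in-memory archive holding every member except one."""
    target = io.BytesIO()
    with zipfile.ZipFile(source, "r") as old, \
            zipfile.ZipFile(target, "w") as new:
        for info in old.infolist():
            if info.filename != name_to_drop:
                # Copy the data across, keeping the member's own method.
                new.writestr(info.filename, old.read(info.filename),
                             compress_type=info.compress_type)
    target.seek(0)
    return target


original = io.BytesIO()
with zipfile.ZipFile(original, "w") as archive:
    archive.writestr("keep.txt", "keep me")
    archive.writestr("drop.txt", "drop me")

cleaned = rewrite_without(original, "drop.txt")
with zipfile.ZipFile(cleaned) as archive:
    remaining = archive.namelist()
```

Copying avoids any offset bookkeeping, at the cost of reading and writing every remaining member once.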

> True. How about making the compression argument mandatory
> for files opened in 'wb' mode only ?

The default of zero provides a little guidance that you should
use zero.  I added a warning message if 8 is used which should
discourage people from using 8.  Or I could disallow 8.
Is that OK?

JimA


From jim@interet.com  Mon Dec 20 15:34:02 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Mon, 20 Dec 1999 10:34:02 -0500
Subject: [Python-Dev] Batteries Included?
References: <19991217225408.D9AF3EDD20@oratrix.oratrix.nl>
Message-ID: <385E4C6A.BEC0F728@interet.com>

Jack Jansen wrote:

> And maybe the answer is a much simpler freezing process, like
> MacPython BuildApplication where any Python user can drop a script on
> it and end up with a fully self-contained app guaranteed (well.... No
> reports to the contrary have been heard so far, at least:-) to contain
> everything needed and not interfere with an existing MacPython
> installation (or be interfered with by it). Then a popular app will
> have prebuilt binaries available for all platforms quickly, made by
> the Python community, and the enduser interested in the app but not in
> Python can simply download that.

IMHO the "much simpler freezing process" is archive files.  A simple
script can build them, imputil can import them, and the only
remaining problem is to find them.  Please see:

ftp://ftp.interet.com/pub/bootmodule.html
ftp://ftp.interet.com/pub/pylib.html
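
As a rough illustration of importing from an archive: modern Python can do this out of the box through its built-in zip import support, which is not the imputil-based importer referred to above. The archive name bundle.zip and the module zipped_hello are made up for the example.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# A "simple script" that builds the archive: one uncompressed module.
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "bundle.zip")
with zipfile.ZipFile(archive_path, "w",
                     compression=zipfile.ZIP_STORED) as archive:
    archive.writestr(
        "zipped_hello.py",
        "def greet():\n    return 'hello from the archive'\n",
    )

# "Finding" the archive: here it is simply placed on sys.path.
sys.path.insert(0, archive_path)
zipped_hello = importlib.import_module("zipped_hello")
message = zipped_hello.greet()
```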

JimA


From jack@oratrix.nl  Mon Dec 20 16:50:32 1999
From: jack@oratrix.nl (Jack Jansen)
Date: Mon, 20 Dec 1999 17:50:32 +0100
Subject: [Python-Dev] Batteries Included?
In-Reply-To: Message by "James C. Ahlstrom" <jim@interet.com> ,
 Mon, 20 Dec 1999 10:34:02 -0500 , <385E4C6A.BEC0F728@interet.com>
Message-ID: <19991220165032.B9E8B370CF2@snelboot.oratrix.nl>

> IMHO the "much simpler freezing process" is archive files.  A simple
> script can build them, imputil can import them, and the only
> remaining problem is to find them.  Please see:

Archive files solve the problem for Python modules. But that leaves the 
problem of dynamically loaded modules. And resources for dialogs and such, if 
you use native GUI stuff on Mac or Windows.

And most serious applications that I've seen (GRiNS and Zope, to name two, 
Mailman is the only exception I can think of) depend on non-standard plugin 
modules.
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From mal@lemburg.com  Mon Dec 20 14:44:42 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 20 Dec 1999 15:44:42 +0100
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com>
Message-ID: <385E40DA.37AD704F@lemburg.com>

"James C. Ahlstrom" wrote:
> 
> "M.-A. Lemburg" wrote:
> 
> > One thing I'd suggest is to include some way to delete and
> > update contents, e.g. the write() method should overwrite
> > any existing entry in the archive (if it does not already --
> > I haven't tested it, just read the code and it seems to raise
> > an exception), plus maybe a .remove() method which deletes
> > an entry.
> 
> Currently, adding a file requires the "a" append mode, while
> the "w" mode re-writes the file.  Adding a duplicate file name
> produces an error message.  I can change this,
> but removing a file would either waste space, or else the file
> contents must be copied over the old file and all the offsets
> updated.  I don't like this because it is complicated, and I think
> it is fast enough to just re-write the archive.  But it
> could be added if people want.

I guess it would be ok to waste space. You could provide
a .cleanup() or .rewrite() method that takes care of
reorganizing the file to fill up the gaps.
 
> > True. How about making the compression argument mandatory
> > for files opened in 'wb' mode only ?
> 
> The default of zero provides a little guidance that you should
> use zero.  I added a warning message if 8 is used which should
> discourage people from using 8.  Or I could disallow 8.
> Is that OK?

Well the module seems to work just fine with compression
on, so disallowing it or issuing a warning would reduce its value,
IMHO. How about making compression a boolean value and then
converting any true value to 8 ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    11 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From fdrake@acm.org  Mon Dec 20 18:52:41 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 20 Dec 1999 13:52:41 -0500 (EST)
Subject: [Python-Dev] posix module
In-Reply-To: <036d01bf4889$30b63470$f29b12c2@secret.pythonware.com>
References: <14423.61493.90107.433664@weyr.cnri.reston.va.us>
 <036d01bf4889$30b63470$f29b12c2@secret.pythonware.com>
Message-ID: <14430.31481.402469.896400@weyr.cnri.reston.va.us>

Fredrik Lundh writes:
 > (current CVS stuff, on Red Hat 5.2)

  Ok, Guido figured it out; this is a typo in the header
/usr/include/confname.h; the enum and the #define don't have the same
name.
  Do you know a way to detect the Linux kernel version using
preprocessor macros?  (Seems very fragile.)  Would it be
reasonable to only add that table entry for kernel versions >= 2.2?


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From jim@interet.com  Mon Dec 20 19:25:27 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Mon, 20 Dec 1999 14:25:27 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com> <385E40DA.37AD704F@lemburg.com>
Message-ID: <385E82A7.72345807@interet.com>

"M.-A. Lemburg" wrote:

> I guess it would be ok to waste space. You could provide
> a .cleanup() or .rewrite() method that takes care of
> reorganizing the file to fill up the gaps.

OK, adding a duplicate name replaces the old file.

> Well the module seems to work just fine with compression
> on, so disallowing it or issuing a warning would reduce its value,
> IMHO.

Yes compression works, but 90% of Python installations don't have
zlib, so it is an ERROR to create archives with compression when
these archives are distributed to other sites.

> How about making compression a boolean value and then
> converting any true value to 8 ?

It would close the door to future or other compression methods.
Currently the method must be 0 or 8 or a traceback will result.
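
The numeric method codes can be illustrated with the modern standard-library zipfile module, which exposes them as named constants. This is a sketch of present-day behaviour, not of the draft: the helper try_method and the bogus code 99 are assumptions made for the example, and the exact exception raised for an unsupported method has varied between Python versions.

```python
import io
import zipfile

# The ZIP format's method codes: 0 = stored, 8 = deflated.
method_codes = {
    "stored": zipfile.ZIP_STORED,
    "deflated": zipfile.ZIP_DEFLATED,
}


def try_method(code):
    """Return True if an archive can be opened for writing with this method."""
    try:
        with zipfile.ZipFile(io.BytesIO(), "w", compression=code):
            pass
        return True
    except (NotImplementedError, RuntimeError):
        return False


accepted = {name: try_method(code) for name, code in method_codes.items()}
bogus_accepted = try_method(99)   # 99 is not a method the module supports
```

Keeping the argument an integer rather than a boolean is what leaves room for further method codes later.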

JimA


From jim@interet.com  Mon Dec 20 19:33:11 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Mon, 20 Dec 1999 14:33:11 -0500
Subject: [Python-Dev] Batteries Included?
References: <19991220165032.B9E8B370CF2@snelboot.oratrix.nl>
Message-ID: <385E8477.F727E0F8@interet.com>

Jack Jansen wrote:

> Archive files solve the problem for Python modules. But that leaves the
> problem of dynamically loaded modules. And resources for dialogs and such, if
> you use native GUI stuff on Mac or Windows.

Point taken.

For dynamically loaded modules, I believe in following the
native system's DLL path, and not adding eccentric Python
logic.  But many disagreed a couple of weeks ago when I raised this.

For resources, I think the archive file can accommodate this,
although it seems highly system dependent.

Anyway, any file at all can live in the archive and the import
mechanism for *.pyc will not be damaged nor unduly slowed down
by its presence.

JimA


From gstein@lyra.org  Mon Dec 20 20:11:50 1999
From: gstein@lyra.org (Greg Stein)
Date: Mon, 20 Dec 1999 12:11:50 -0800 (PST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <385E82A7.72345807@interet.com>
Message-ID: <Pine.LNX.4.10.9912201208290.16305-100000@nebula.lyra.org>

On Mon, 20 Dec 1999, James C. Ahlstrom wrote:
> "M.-A. Lemburg" wrote:
> > I guess it would be ok to waste space. You could provide
> > a .cleanup() or .rewrite() method that takes care of
> > reorganizing the file to fill up the gaps.
> 
> OK, adding a duplicate name replaces the old file.

But it shouldn't print a warning(!). If an application wants to replace a
file, then stuff shouldn't appear on stdout as a result.

> > Well the module seems to work just fine with compression
> > on, so disallowing it or issuing a warning would reduce its value,
> > IMHO.
> 
> Yes compression works, but 90% of Python installations don't have
> zlib, so it is an ERROR to create archives with compression when
> these archives are distributed to other sites.

While it may be a problem to distribute them to other sites, that is not up
to the library. If I want compression, then I should get compression. A
library module should not determine application-level policy.

The warning that __init__ prints shouldn't be there.

Really: there should not be a single "print" in the library (well,
printdir() is fine... that's what it is supposed to do; printing in the
test code would be fine). In normal, or even exceptional(!), operation
there should never be a print.

> > How about making compression a boolean value and then
> > converting any true value to 8 ?
> 
> It would close the door to future or other compression methods.
> Currently the method must be 0 or 8 or a traceback will result.

I definitely agree with JimA here. For example, maybe we want bzip
compression in there. Sure, non-portable, but that's my problem :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From jim@interet.com  Mon Dec 20 20:50:46 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Mon, 20 Dec 1999 15:50:46 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912201208290.16305-100000@nebula.lyra.org>
Message-ID: <385E96A6.40CCF285@interet.com>

Greg Stein wrote:
> 
> On Mon, 20 Dec 1999, James C. Ahlstrom wrote:
> > "M.-A. Lemburg" wrote:
> But it shouldn't print a warning(!). If an application wants to replace a
> file, then stuff shouldn't appear on stdout as a result.

OK, no warning.
 
> The warning that __init__ prints shouldn't be there.

OK, it is gone.
 
> Really: there should not be a single "print" in the library (well,

No print unless _debug > 0

JimA


From mal@lemburg.com  Mon Dec 20 21:16:39 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 20 Dec 1999 22:16:39 +0100
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com> <385E40DA.37AD704F@lemburg.com> <385E82A7.72345807@interet.com>
Message-ID: <385E9CB7.5DE4848A@lemburg.com>

"James C. Ahlstrom" wrote:
> 
> "M.-A. Lemburg" wrote:
> 
> > I guess it would be ok to waste space. You could provide
> > a .cleanup() or .rewrite() method that takes care of
> > reorganizing the file to fill up the gaps.
> 
> OK, adding a duplicate name replaces the old file.

Cool.
 
> > Well the module seems to work just fine with compression
> > on, so disallowing it or issuing a warning would reduce its value,
> > IMHO.
> 
> Yes compression works, but 90% of Python installations don't have
> zlib, so it is an ERROR to create archives with compression when
> these archives are distributed to other sites.

Sure, for the sake of creating Python code archives, but
your module is much more versatile: e.g. I could automatically
create ZIP archives of log files or sets of other files and
then have Python email them to someone who uses these archives
through standard tools such as WinZip -- the target doesn't always
have to be a Python process :-)

> > How about making compression a boolean value and then
> > converting any true value to 8 ?
> 
> It would close the door to future or other compression methods.
> Currently the method must be 0 or 8 or a traceback will result.

Ok.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    11 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From jim@interet.com  Mon Dec 20 21:37:20 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Mon, 20 Dec 1999 16:37:20 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com> <385E40DA.37AD704F@lemburg.com> <385E82A7.72345807@interet.com> <385E9CB7.5DE4848A@lemburg.com>
Message-ID: <385EA190.6AF511BD@interet.com>

"M.-A. Lemburg" wrote:
>
> Sure, for the sake of creating Python code archives, but
> your module is much more versatile: e.g. I could automatically
> create ZIP archives of log files or sets of other files and

OK, zipfile.py no longer complains about compression != 0

JimA


From fdrake@acm.org  Tue Dec 21 22:42:26 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 21 Dec 1999 17:42:26 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: <199912212238.RAA13660@eric.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
Message-ID: <14432.594.33416.600794@weyr.cnri.reston.va.us>

Guido van Rossum writes:
 > + 
 > + class GetoptError(Exception):
 > +     opt = ''
 > +     msg = ''
 > +     def __init__(self, *args):
 > +         self.args = args
 > +         if len(args) == 1:
 > +             self.msg = args[0]
 > +         elif len(args) == 2:
 > +             self.msg = args[0]
 > +             self.opt = args[1]
 > + 
 > +     def __str__(self):
 > +         return self.msg
 >   
 > ! error = GetoptError # backward compatibility

  This breaks as soon as the standard exceptions are strings; does
this mean -X will be removed in the next release?  (Please????)
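
For reference, a small sketch of how the class-based exception quoted above is used from calling code, written against the modern getopt module. The option string "ab" and the argument list are made up for the example.

```python
import getopt

caught = None
try:
    # "-x" is not among the accepted short options "a" and "b".
    getopt.getopt(["-x"], "ab")
except getopt.GetoptError as exc:
    caught = exc

# The exception carries the message and the offending option separately.
message = caught.msg
option = caught.opt

# The old name is kept as an alias for backward compatibility.
same_class = getopt.error is getopt.GetoptError
```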


  -Fred

--
Fred L. Drake, Jr.	  <fdrake at acm.org>
Corporation for National Research Initiatives


From bwarsaw@cnri.reston.va.us  Tue Dec 21 22:44:46 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw)
Date: Tue, 21 Dec 1999 17:44:46 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
 <14432.594.33416.600794@weyr.cnri.reston.va.us>
Message-ID: <14432.734.155183.508785@anthem.cnri.reston.va.us>

>>>>> "Fred" == Fred L Drake, Jr <fdrake@acm.org> writes:

    Fred>   This breaks as soon as the standard exceptions are
    Fred> strings; does this mean -X will be removed in the next
    Fred> release?  (Please????)

Pretty please? :)


From guido@CNRI.Reston.VA.US  Tue Dec 21 23:05:28 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 21 Dec 1999 18:05:28 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Tue, 21 Dec 1999 17:42:26 EST."
 <14432.594.33416.600794@weyr.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
 <14432.594.33416.600794@weyr.cnri.reston.va.us>
Message-ID: <199912212305.SAA13722@eric.cnri.reston.va.us>

> Guido van Rossum writes:
>  > + 
>  > + class GetoptError(Exception):
>  > +     opt = ''
>  > +     msg = ''
>  > +     def __init__(self, *args):
>  > +         self.args = args
>  > +         if len(args) == 1:
>  > +             self.msg = args[0]
>  > +         elif len(args) == 2:
>  > +             self.msg = args[0]
>  > +             self.opt = args[1]
>  > + 
>  > +     def __str__(self):
>  > +         return self.msg
>  >   
>  > ! error = GetoptError # backward compatibility

[Fred Drake]

>   This breaks as soon as the standard exceptions are strings; does
> this mean -X will be removed in the next release?  (Please????)

Not a bad idea.

Anybody got a reason why -X should stay?

(The next step would be to outlaw raise with a string argument; I
think I can't make that for 1.6.  But it would be a good idea to scan
the standard library for string exceptions and convert all of them.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bwarsaw@cnri.reston.va.us  Tue Dec 21 23:21:38 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw)
Date: Tue, 21 Dec 1999 18:21:38 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
 <14432.594.33416.600794@weyr.cnri.reston.va.us>
 <199912212305.SAA13722@eric.cnri.reston.va.us>
Message-ID: <14432.2946.857539.898577@anthem.cnri.reston.va.us>

>>>>> "Guido" == Guido van Rossum <guido@cnri.reston.va.us> writes:

    Guido> Anybody got a reason why -X should stay?

Kill it.

    Guido> (The next step would be to outlaw raise with a string
    Guido> argument; I think I can't make that for 1.6.  But it would
    Guido> be a good idea to scan the standard library for string
    Guido> exceptions and convert all of them.)

Or require that exception classes be derived from exceptions.Exception
:)

-Barry


From guido@CNRI.Reston.VA.US  Tue Dec 21 23:23:29 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 21 Dec 1999 18:23:29 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Tue, 21 Dec 1999 18:21:38 EST."
 <14432.2946.857539.898577@anthem.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us>
 <14432.2946.857539.898577@anthem.cnri.reston.va.us>
Message-ID: <199912212323.SAA13803@eric.cnri.reston.va.us>

[Barry]
>     Guido> Anybody got a reason why -X should stay?
> 
> Kill it.

You already said that.

Anybody else?

>     Guido> (The next step would be to outlaw raise with a string
>     Guido> argument; I think I can't make that for 1.6.  But it would
>     Guido> be a good idea to scan the standard library for string
>     Guido> exceptions and convert all of them.)
> 
> Or require that exception classes be derived from exceptions.Exception
> :)

That's hard to require.  But it could easily be a requirement checked
by one of the hypothetical typecheckers that are being discussed in
the types-sig.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bwarsaw@cnri.reston.va.us (Barry A. Warsaw)  Tue Dec 21 23:27:31 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw)
Date: Tue, 21 Dec 1999 18:27:31 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
 <14432.594.33416.600794@weyr.cnri.reston.va.us>
 <199912212305.SAA13722@eric.cnri.reston.va.us>
 <14432.2946.857539.898577@anthem.cnri.reston.va.us>
 <199912212323.SAA13803@eric.cnri.reston.va.us>
Message-ID: <14432.3299.404561.698836@anthem.cnri.reston.va.us>

>>>>> "Guido" == Guido van Rossum <guido@cnri.reston.va.us> writes:

    BAW> Or require that exception classes be derived from
    BAW> exceptions.Exception :)

    Guido> That's hard to require.  But it could easily be a
    Guido> requirement checked by one of the hypothetical typecheckers
    Guido> that are being discussed in the types-sig.

Hmm, the raise could probably enforce this, but it might not be that
useful.

-Barry


From guido@CNRI.Reston.VA.US  Tue Dec 21 23:40:22 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 21 Dec 1999 18:40:22 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Tue, 21 Dec 1999 18:27:31 EST."
 <14432.3299.404561.698836@anthem.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us>
 <14432.3299.404561.698836@anthem.cnri.reston.va.us>
Message-ID: <199912212340.SAA13851@eric.cnri.reston.va.us>

> >>>>> "Guido" == Guido van Rossum <guido@cnri.reston.va.us> writes:
> 
>     BAW> Or require that exception classes be derived from
>     BAW> exceptions.Exception :)
> 
>     Guido> That's hard to require.  But it could easily be a
>     Guido> requirement checked by one of the hypothetical typecheckers
>     Guido> that are being discussed in the types-sig.
> 
> Hmm, the raise could probably enforce this, but it might not be that
> useful.
> 
> -Barry

The raise could easily enforce this, but it would break lots of
existing code.

I wish I had done it right from the start -- then exceptions would
have been classes from the start and would have required inheritance
from the Exception base class.  Like in Java.  (And in C++?)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bwarsaw@python.org  Tue Dec 21 23:43:59 1999
From: bwarsaw@python.org (Barry A. Warsaw)
Date: Tue, 21 Dec 1999 18:43:59 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
 <14432.594.33416.600794@weyr.cnri.reston.va.us>
 <199912212305.SAA13722@eric.cnri.reston.va.us>
 <14432.2946.857539.898577@anthem.cnri.reston.va.us>
 <199912212323.SAA13803@eric.cnri.reston.va.us>
 <14432.3299.404561.698836@anthem.cnri.reston.va.us>
 <199912212340.SAA13851@eric.cnri.reston.va.us>
Message-ID: <14432.4287.543786.308468@anthem.cnri.reston.va.us>

>>>>> "Guido" == Guido van Rossum <guido@CNRI.Reston.VA.US> writes:

    Guido> The raise could easily enforce this, but it would break
    Guido> lots of existing code.

Maybe not (I'm not sure).  All the standard exceptions inherit from
Exception, and of course there'd be nothing to enforce for existing
user-defined string based exceptions.  How pervasive are user-defined
class based exceptions that don't inherit from Exception?  (I don't
know, and I haven't grepped, but I think we've been making that
recommendation from day 1 of class-based standard exceptions, and I
try to follow this recommendation in my own code).

    Guido> I wish I had done it right from the start -- then
    Guido> exceptions would have been classes from the start and would
    Guido> have required inheritance from the Exception base class.
    Guido> Like in Java.  (And in C++?)

All Hail, Python 2.0, our Savior and Redeemer! :)

-Barry


From guido@CNRI.Reston.VA.US  Tue Dec 21 23:49:09 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 21 Dec 1999 18:49:09 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Tue, 21 Dec 1999 18:43:59 EST."
 <14432.4287.543786.308468@anthem.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> <199912212340.SAA13851@eric.cnri.reston.va.us>
 <14432.4287.543786.308468@anthem.cnri.reston.va.us>
Message-ID: <199912212349.SAA13892@eric.cnri.reston.va.us>

> From: "Barry A. Warsaw" <bwarsaw@cnri.reston.va.us>

> >>>>> "Guido" == Guido van Rossum <guido@CNRI.Reston.VA.US> writes:
> 
>     Guido> The raise could easily enforce this, but it would break
>     Guido> lots of existing code.
> 
> Maybe not (I'm not sure).  All the standard exceptions inherit from
> Exception, and of course there'd be nothing to enforce for existing
> user-defined string based exceptions.  How pervasive are user-defined
> class based exceptions that don't inherit from Exception?  (I don't
> know, and I haven't grepped, but I think we've been making that
> recommendation from day 1 of class-based standard exceptions, and I
> try to follow this recommendation in my own code).

Yes, but class-based user exceptions existed many Python versions
before class-based standard exceptions!

Two examples in the standard library: ConfigParser.py and xdrlib.py.

> All Hail, Python 2.0, our Savior and Redeemer! :)

Or, the perfect excuse for procrastination :)

(But yes, 2.0 will enforce this.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gstein@lyra.org  Tue Dec 21 23:53:50 1999
From: gstein@lyra.org (Greg Stein)
Date: Tue, 21 Dec 1999 15:53:50 -0800 (PST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib
 getopt.py,1.7,1.8
In-Reply-To: <199912212305.SAA13722@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912211552380.16305-100000@nebula.lyra.org>

On Tue, 21 Dec 1999, Guido van Rossum wrote:
>...
> [Fred Drake]
> >   This breaks as soon as the standard exceptions are strings; does
> > this mean -X will be removed in the next release?  (Please????)
> 
> Not a bad idea.
> 
> Anybody got a reason why -X should stay?

Kill it.

> (The next step would be to outlaw raise with a string argument; I
> think I can't make that for 1.6.  But it would be a good idea to scan
> the standard library for string exceptions and convert all of them.)

Keep string exceptions. I think there is probably a lot of code that still
uses them. I know I do :-)

We can issue warnings about string exceptions via the type-checking tool.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From bwarsaw@python.org  Tue Dec 21 23:54:04 1999
From: bwarsaw@python.org (Barry A. Warsaw)
Date: Tue, 21 Dec 1999 18:54:04 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
 <14432.594.33416.600794@weyr.cnri.reston.va.us>
 <199912212305.SAA13722@eric.cnri.reston.va.us>
 <14432.2946.857539.898577@anthem.cnri.reston.va.us>
 <199912212323.SAA13803@eric.cnri.reston.va.us>
 <14432.3299.404561.698836@anthem.cnri.reston.va.us>
 <199912212340.SAA13851@eric.cnri.reston.va.us>
 <14432.4287.543786.308468@anthem.cnri.reston.va.us>
 <199912212349.SAA13892@eric.cnri.reston.va.us>
Message-ID: <14432.4892.908107.421149@anthem.cnri.reston.va.us>

>>>>> "Guido" == Guido van Rossum <guido@CNRI.Reston.VA.US> writes:

    Guido> Yes, but class-based user exceptions existed many Python
    Guido> versions before class-based standard exceptions!

True, but I suspect that legacy class-based user exceptions are rare.
I might be wrong, but you're absolutely right that these would all be
broken.

    Guido> Two examples in the standard library: ConfigParser.py and
    Guido> xdrlib.py.

Fortunately these are fixed with two 11 character patches :)

I'm not necessarily arguing for or against tightening this.

-Barry


From gmcm@hypernet.com  Tue Dec 21 23:55:07 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Tue, 21 Dec 1999 18:55:07 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: <199912212340.SAA13851@eric.cnri.reston.va.us>
References: Your message of "Tue, 21 Dec 1999 18:27:31 EST."             <14432.3299.404561.698836@anthem.cnri.reston.va.us>
Message-ID: <1266302877-22249299@hypernet.com>

[Guido]

> I wish I had done it right from the start -- then exceptions
> would have been classes from the start and would have required
> inheritance from the Exception base class.  Like in Java.  (And
> in C++?)

In C++ you can throw anything at all. Strings, ints, that 
Warsaw blockhead...

off-topic-ly y'rs

- Gordon


From tismer@appliedbiometrics.com  Wed Dec 22 00:57:27 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Wed, 22 Dec 1999 01:57:27 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib
 getopt.py,1.7,1.8
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us>
 <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us>
Message-ID: <386021F7.4F94C458@appliedbiometrics.com>


Guido van Rossum wrote:
> 
> [Barry]
> >     Guido> Anybody got a reason why -X should stay?
> >
> > Kill it.
> 
> You already said that.
> 
> Anybody else?

I'd say kill -X, but keep allowing string exceptions if
it doesn't cost too much. I think of C++, like Gordon said.

Also I'd take the chance and move the exceptions Python
module back into the core, as a frozen module or whatever.

Reason: At the moment, the CVS version of the Python library
is incompatible with 1.5.2, which makes testing against the
standard dist quite inconvenient. A compiled CVS Python
does not run under PythonWin when I put it into my standard
installation. Or is there an easy way to switch all settings
to a completely different path?

Anyway, I'm most probably off until Y2K.

See ya all then, provided we survive - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From guido@CNRI.Reston.VA.US  Wed Dec 22 01:01:16 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 21 Dec 1999 20:01:16 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Wed, 22 Dec 1999 01:57:27 +0100."
 <386021F7.4F94C458@appliedbiometrics.com>
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us>
 <386021F7.4F94C458@appliedbiometrics.com>
Message-ID: <199912220101.UAA14109@eric.cnri.reston.va.us>

> I'd say kill -X, but keep allowing string exceptions if
> it doesn't cost too much. I think of C++, like Gordon said.

Agreed.

> Also I'd take the chance and move the exceptions Python
> module back into the core, as a frozen module or whatever.
> 
> Reason: At the moment, the CVS version of the Python library
> is incompatible with 1.5.2, which makes testing against the
> standard dist quite inconvenient. A compiled CVS Python
> does not run under PythonWin when I put it into my standard
> installation. Or is there an easy way to switch all settings
> to a completely different path?

Point the PYTHONHOME variable to the top of your install directory.
(On Windows you may have to kill the registry settings -- this is a
bug.)

> Anyway, I'm most probably off until Y2K.

Ditto.

> See ya all then, provided we survive - chris

Best wishes to all,

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim@digicool.com  Wed Dec 22 13:54:41 1999
From: jim@digicool.com (Jim Fulton)
Date: Wed, 22 Dec 1999 08:54:41 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib
 getopt.py,1.7,1.8
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
 <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us>
Message-ID: <3860D821.576B3146@digicool.com>

Guido van Rossum wrote:
> 
> (The next step would be to outlaw raise with a string argument; I
> think I can't make that for 1.6.  But it would be a good idea to scan
> the standard library for string exceptions and convert all of them.)

This would be waaaaay too big a change for Python 1.x. There are a lot
of Python modules outside the standard distribution that use string
exceptions. This would be a huge backward incompatibility.

Jim

--
Jim Fulton           mailto:jim@digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list without my
permission.  Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.


From fdrake@acm.org  Wed Dec 22 14:23:29 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 22 Dec 1999 09:23:29 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: <199912212305.SAA13722@eric.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
 <14432.594.33416.600794@weyr.cnri.reston.va.us>
 <199912212305.SAA13722@eric.cnri.reston.va.us>
Message-ID: <14432.57057.535205.558@weyr.cnri.reston.va.us>

Guido van Rossum writes:
 > (The next step would be to outlaw raise with a string argument; I
 > think I can't make that for 1.6.  But it would be a good idea to scan
 > the standard library for string exceptions and convert all of them.)

  I don't know if requiring class-based exceptions will make the
runtime any simpler, but that seems the only reason to do it.
  The only reason to remove -X, and possibly the string exception
fallback code, is to ensure that we *can* subclass Exception and
friends without having to catch TypeError and do something different.


  -Fred

--
Fred L. Drake, Jr.	  <fdrake at acm.org>
Corporation for National Research Initiatives


From fdrake@acm.org  Wed Dec 22 14:25:33 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 22 Dec 1999 09:25:33 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: <14432.2946.857539.898577@anthem.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
 <14432.594.33416.600794@weyr.cnri.reston.va.us>
 <199912212305.SAA13722@eric.cnri.reston.va.us>
 <14432.2946.857539.898577@anthem.cnri.reston.va.us>
Message-ID: <14432.57181.944364.427093@weyr.cnri.reston.va.us>

Barry A. Warsaw writes:
 > Or require that exception classes be derived from exceptions.Exception
 > :)

  Ok, it's early, and maybe I haven't had enough coffee(!).  But is
this serious?  Does JPython gain some benefit from this, is it your
preference, or are you just yanking on my leg?  ("Pulling my arm" as
my 5-year-old says!)


  -Fred

--
Fred L. Drake, Jr.	  <fdrake at acm.org>
Corporation for National Research Initiatives


From guido@CNRI.Reston.VA.US  Wed Dec 22 14:40:39 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 22 Dec 1999 09:40:39 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Wed, 22 Dec 1999 09:23:29 EST."
 <14432.57057.535205.558@weyr.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us>
 <14432.57057.535205.558@weyr.cnri.reston.va.us>
Message-ID: <199912221440.JAA16198@eric.cnri.reston.va.us>

> From: "Fred L. Drake, Jr." <fdrake@acm.org>
> 
> Guido van Rossum writes:
>  > (The next step would be to outlaw raise with a string argument; I
>  > think I can't make that for 1.6.  But it would be a good idea to scan
>  > the standard library for string exceptions and convert all of them.)
> 
>   I don't know if requiring class-based exceptions will make the
> runtime any simpler, but that seems the only reason to do it.

Do what?  *Require* class exceptions?  You're probably right, and I
think the gain is minimal.

There's another reason to scan the std library though -- not to set a
bad example.  I want to eventually (in 2.0) move to a
class-derived-from-Exception-only scheme.

>   The only reason to remove -X, and possibly the string exception
> fallback code, is to ensure that we *can* subclass Exception and
> friends without having to catch TypeError and do something different.

And that's a very good reason indeed.

Let me repeat my plans for 1.6.

- Remove -X; the standard exceptions are always class-based.

- Change all standard library and other example code to use
class-based exceptions with a standard exception as base class, to set
an example.

- Still allow string exceptions in user code.

- Still allow class exceptions that don't use a standard exception
base class in user code.
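
A minimal sketch of the second point, written in present-day Python
syntax: a library-style exception that takes a standard exception as
its base class, so that handlers written for the standard class keep
working.  ConfigError, parse_port and safe_port are hypothetical names
chosen for illustration and are not taken from the standard library.

```python
class ConfigError(ValueError):
    """Hypothetical library exception with a standard exception as base."""


def parse_port(text):
    """Convert text to a port number, raising ConfigError on bad input."""
    if not text.isdigit():
        raise ConfigError("port must be numeric: %r" % (text,))
    return int(text)


def safe_port(text, default=0):
    """Return the parsed port, or default if the text is not a number."""
    try:
        return parse_port(text)
    except ValueError:
        # A handler for the standard base class also catches the subclass.
        return default


print(safe_port("8080"))
print(safe_port("http"))
```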

--Guido van Rossum (home page: http://www.python.org/~guido/)


From Vladimir.Marangozov@inrialpes.fr  Wed Dec 22 18:09:47 1999
From: Vladimir.Marangozov@inrialpes.fr (Vladimir Marangozov)
Date: Wed, 22 Dec 1999 19:09:47 +0100 (CET)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: <199912221440.JAA16198@eric.cnri.reston.va.us> from "Guido van Rossum" at Dec 22, 1999 09:40:39 AM
Message-ID: <199912221809.TAA25322@python.inrialpes.fr>

Guido van Rossum wrote:
> 
> [Fred Drake]
> >   I don't know if requiring class-based exceptions will make the
> > runtime any simpler, but that seems the only reason to do it.
> 
> Do what?  *Require* class exceptions?  You're probably right, and I
> think the gain is minimal.

Yes. Besides, I still think that string-based exceptions are just
convenient for quick & dirty, throw-away test scripts.

> 
> Let me repeat my plans for 1.6.
> 
> - Remove -X; the standard exceptions are always class-based.
> 
> - Change all standard library and other example code to use
> class-based exceptions with a standard exception as base class, to set
> an example.
> 
> - Still allow string exceptions in user code.
> 
> - Still allow class exceptions that don't use a standard exception
> base class in user code.

Sounds okay.

---

PS: I'm particularly happy today :-) because I've finally published
 the new version of our Web site http://www.inrialpes.fr. Two things
 I'd like to mention:
 (1) it wouldn't have been possible without quick Python scripts ;)
 (2) I'll find the time to reinvoke some of the topics discussed here
     instead of being mute as a fish.

That said, Merry Christmas and a Happy New Year to all of you!

-- 
       Vladimir MARANGOZOV          | Vladimir.Marangozov@inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252


From guido@CNRI.Reston.VA.US  Wed Dec 22 18:23:45 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 22 Dec 1999 13:23:45 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Wed, 22 Dec 1999 19:09:47 +0100."
 <199912221809.TAA25322@python.inrialpes.fr>
References: <199912221809.TAA25322@python.inrialpes.fr>
Message-ID: <199912221823.NAA16517@eric.cnri.reston.va.us>

Vladimir.Marangozov@inrialpes.fr:

> Yes. Besides, I still think that string-based exceptions are just
> convenient for quick & dirty, throw-away test scripts.

They have a hard-to-understand quirk though: the id() of the string is
used to check rather than its value, so that except "foo" doesn't
necessarily catch raise "foo"; but due to various optimizations, this
usually works, and people get bent out of shape when it doesn't.
Since you have to give your exception a name, how hard is it to say

class MyError(Exception): pass

rather than

MyError = "MyError"

?
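
The quirk can be approximated in present-day Python, which no longer
accepts string exceptions at all.  The sketch below only demonstrates
the underlying distinction: two strings can be equal in value without
being the same object, which is why matching by identity was fragile,
whereas a class has a single identity that every handler can name.
The identity result depends on the interpreter (it is typical of
CPython, not guaranteed), and all names here are illustrative.

```python
class MyError(Exception):
    pass


literal = "foo"
built = "".join(["f", "oo"])  # same value, assembled at run time

same_value = (literal == built)   # compares contents
same_object = (literal is built)  # compares identity; usually False in CPython

print(same_value, same_object)


def raise_my_error():
    raise MyError("something went wrong")


# With a class, the handler matches by class membership, not by the
# identity of a particular string object.
try:
    raise_my_error()
except MyError as exc:
    caught = exc

print(type(caught).__name__)
```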

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gstein@lyra.org  Wed Dec 22 18:33:19 1999
From: gstein@lyra.org (Greg Stein)
Date: Wed, 22 Dec 1999 10:33:19 -0800 (PST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib
 getopt.py,1.7,1.8
In-Reply-To: <199912221823.NAA16517@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912221031390.16305-100000@nebula.lyra.org>

On Wed, 22 Dec 1999, Guido van Rossum wrote:
> Vladimir.Marangozov@inrialpes.fr:
> > Yes. Besides, I still think that string-based exceptions are just
> > convenient for quick & dirty, throw-away test scripts.
> 
> They have a hard-to-understand quirk though: the id() of the string is
> used to check rather than its value, so that except "foo" doesn't
> necessarily catch raise "foo"; but due to various optimizations, this
> usually works, and people get bent out of shape when it doesn't.
> Since you have to give your exception a name, how hard is it to say
> 
> class MyError(Exception): pass
> 
> rather than
> 
> MyError = "MyError"
> 
> ?

It is very hard. My fingers do the typing for me, and they fill in
strings. I'm trying to teach them otherwise, but they insist.

You're also assuming that MyError gets defined. Sometimes, my little
fingers like typing:

  try:
    foo
  except:
    raise "foo broke for some reason"


Quick and dirty, indeed! :-)

Happy Holidays,
-g

-- 
Greg Stein, http://www.lyra.org/


From fdrake@acm.org  Wed Dec 22 19:59:55 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 22 Dec 1999 14:59:55 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: <199912212340.SAA13851@eric.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
 <14432.594.33416.600794@weyr.cnri.reston.va.us>
 <199912212305.SAA13722@eric.cnri.reston.va.us>
 <14432.2946.857539.898577@anthem.cnri.reston.va.us>
 <199912212323.SAA13803@eric.cnri.reston.va.us>
 <14432.3299.404561.698836@anthem.cnri.reston.va.us>
 <199912212340.SAA13851@eric.cnri.reston.va.us>
Message-ID: <14433.11707.607533.698901@weyr.cnri.reston.va.us>

Guido van Rossum writes:
 > I wish I had done it right from the start -- then exceptions would
 > have been classes from the start and would have required inheritance
 > from the Exception base class.  Like in Java.  (And in C++?)

  I've seen this said or hinted at in a couple of places (the specific 
requirement that exception derive from Exception), but I've seen
nothing that indicates any reason or derived value for this.  Could
someone please clarify?


  -Fred

--
Fred L. Drake, Jr.	  <fdrake at acm.org>
Corporation for National Research Initiatives


From guido@CNRI.Reston.VA.US  Wed Dec 22 20:05:52 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 22 Dec 1999 15:05:52 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Wed, 22 Dec 1999 14:59:55 EST."
 <14433.11707.607533.698901@weyr.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> <199912212340.SAA13851@eric.cnri.reston.va.us>
 <14433.11707.607533.698901@weyr.cnri.reston.va.us>
Message-ID: <199912222005.PAA17291@eric.cnri.reston.va.us>

> From: "Fred L. Drake, Jr." <fdrake@acm.org>

> Guido van Rossum writes:
>  > I wish I had done it right from the start -- then exceptions would
>  > have been classes from the start and would have required inheritance
>  > from the Exception base class.  Like in Java.  (And in C++?)
> 
>   I've seen this said or hinted at in a couple of places (the specific 
> requirement that exception derive from Exception), but I've seen
> nothing that indicates any reason or derived value for this.  Could
> someone please clarify?

It's simply an extra bit of checking that your program is reasonable
-- if you accidentally raise a non-exception class, there's probably
something wrong with your program, and it gives the reader a hint
about the intended use of the class.

Other languages (e.g. Modula-3) have a specific exception type that
can be used only for that one purpose.  However it's useful to allow
methods and subclassing of exceptions, so they might as well be
classes.  So, all exceptions are classes.  But not all classes are
exceptions.
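
For comparison, present-day Python (3.x) does enforce this rule in the
raise statement.  The sketch below, with hypothetical class names,
shows that raising an ordinary class is rejected with a TypeError,
while a class derived from Exception propagates as itself.

```python
class Plain:
    """An ordinary class that is not an exception (hypothetical)."""


class AppError(Exception):
    """A class declared as an exception by its base class (hypothetical)."""


def try_raise(cls):
    """Raise an instance of cls; return the name of what propagates."""
    try:
        raise cls()
    except Exception as exc:
        return type(exc).__name__


print(try_raise(AppError))
print(try_raise(Plain))
```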

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gstein@lyra.org  Wed Dec 22 20:11:43 1999
From: gstein@lyra.org (Greg Stein)
Date: Wed, 22 Dec 1999 12:11:43 -0800 (PST)
Subject: [Python-Dev] Please test new dynamic load behavior
Message-ID: <Pine.LNX.4.10.9912221200290.16305-100000@nebula.lyra.org>

Hi all,

I reorganized Python's dynamic load/import code over the past few days.
Guido provided some feedback, I did some more mods, and now it is checked
into CVS. The new loading behavior has been tested on Linux, IRIX, and
Solaris (and probably Windows by now).

For people with CVS access, I'd like to ask that you grab an updated copy
and shake out the new code. There have been updates to the "configure"
process, so you'll need to run configure again. Make sure that you alter
your Modules/Setup to build some shared modules, and then try it out.

Here are some of the platforms that I believe need specific testing:

- NetBSD, FreeBSD, OpenBSD, ...
- AIX
- HP/UX
- BeOS
- NeXT
- Mac
- OS/2
- Win16

I believe it should work for most people, but we may be looking for the
wrong "init<module>" symbol on some platforms. We might even be selecting
the wrong import mechanism (or missing it altogether!) on some platforms.

If you get a chance to test this, then please drop me a note with your
platform and whether it succeeded or failed (and how it failed).

Thanx!
-g

p.s. you can tell if dynamic loading is missing by watching for
DYNLOADFILE in the configure process and seeing if it used dynload_stub.
alternatively, you can import the "imp" module and see if "load_dynamic"
is missing.

-- 
Greg Stein, http://www.lyra.org/


From gvwilson@nevex.com  Thu Dec 23 03:43:40 1999
From: gvwilson@nevex.com (gvwilson@nevex.com)
Date: Wed, 22 Dec 1999 22:43:40 -0500 (EST)
Subject: [Python-Dev] re: Open Source design competition / Python / software tools
Message-ID: <Pine.LNX.4.10.9912222242070.4839-200000@akbar.nevex.com>

  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.
  Send mail to mime@docserver.cac.washington.edu for more info.

--168427786-691315853-945920620=:4839
Content-Type: TEXT/PLAIN; charset=US-ASCII

Hi, folks.  I hope you don't mind another mail out of the blue, but I got
notice on Saturday that the Department of Energy is giving me $860K over
two years to support development of easier-to-use software engineering
tools.  All of the work will be Open Source, and will be done in Python,
with a strong emphasis on design, testing, and documentation.  The
project's long-term objective is to encourage scientists and engineers to
treat programs in the same way as they do other experiments, i.e. to
calibrate, test, peer review, and so on.

To kick-start things, we're going to be holding a two-round design
competition.  Anyone (individual or team, professional or student) can
submit a short entry for the first round; the judges will pick four
candidates to go forward in each of four categories, and those
individuals or teams will be asked to submit full entries. The four
categories are:

* an issue tracking system to replace Gnats and Bugzilla;

* a build system to replace make;

* a platform inspection and configuration system to replace autoconf;
  and

* a testing framework to replace XUnit, Expect, and DejaGnu.

Would you be interested in participating in any way---judging, entering a
design, critiquing things from the point of view of end users, or
anything else? I realize that you're probably up past your eyeballs with
work, and that the money on offer is nothing special, but I think this
could be a lot of fun, and could help to shift the emphasis of the Open
Source community from hacking to design (both by drawing attention to, and
rewarding, design, and by creating a corpus of examples and commentary for
programmers to refer to).  It could also make life a lot easier for
computational scientists and engineers...

Please let me know if you'd like to be involved, or if you'd like more
information than is contained in the FAQ (attached).  Timescales are a
bit tight---I'd like to be able to make an announcement on January
14---but I'll be reading email at this address several times a day
during the holiday.

I look forward to hearing from you,

Greg Wilson

p.s. please note that the attached FAQ is a first draft; I'd be grateful
if you could show it to anyone you think might be interested, but I'd
also be grateful if you wouldn't broadcast it until it's gone through 
one more editing pass.

--168427786-691315853-945920620=:4839
Content-Type: TEXT/PLAIN; charset=X-UNKNOWN; name="faq.html"
Content-Transfer-Encoding: BASE64
Content-ID: <Pine.LNX.4.10.9912222243400.4839@akbar.nevex.com>
Content-Description: 
Content-Disposition: attachment; filename="faq.html"

PEhUTUw+DQo8SEVBRD4NCjxUSVRMRT5Tb2Z0d2FyZSBDYXJwZW50cnkgRkFR
PC9USVRMRT4NCjwvSEVBRD4NCjxCT0RZPg0KDQo8SDEgQUxJR049IkNFTlRF
UiI+U29mdHdhcmUgQ2FycGVudHJ5IEZBUTwvSDE+DQoNCg0KPEgyPkdlbmVy
YWwgaW5mb3JtYXRpb248L0gyPg0KDQo8T0w+DQoNCjxMST48RU0+V2hhdCBp
cyB0aGUgU29mdHdhcmUgQ2FycGVudHJ5IHByb2plY3Q/IDwvRU0+DQo8QlI+
DQpUaGUgYWltIG9mIHRoZSBTb2Z0d2FyZSBDYXJwZW50cnkgcHJvamVjdCBp
cyB0byBtYWtlIGl0IGVhc2llciBmb3INCnByb2dyYW1tZXJzIGluIGdlbmVy
YWwsIGFuZCBzY2llbnRpZmljIHByb2dyYW1tZXJzIGluIHBhcnRpY3VsYXIs
IHRvDQphZG9wdCBiZXR0ZXIgc29mdHdhcmUgZGV2ZWxvcG1lbnQgcHJhY3Rp
Y2VzLiBUaGUgcHJvamVjdCB3aWxsIGFjaGlldmUNCnRoaXMgYnkgY3JlYXRp
bmcgdG9vbHMgdGhhdCBhcmUgZWFzaWVyIHRvIGxlYXJuIGFuZCB1c2UsIGFu
ZCBieQ0KZG9jdW1lbnRpbmcgdGhvc2UgdG9vbHMgYW5kIHRoZSBwcmFjdGlj
ZXMgdGhleSBlbWJvZHkuDQo8L0xJPg0KDQo8TEk+PEVNPldoZXJlIGRvZXMg
dGhlIG5hbWUgY29tZSBmcm9tPzwvRU0+DQo8QlI+DQpUaGUgbmFtZSBpcyBh
IHBsYXkgb24gInNvZnR3YXJlIGVuZ2luZWVyaW5nIiwgYW5kIGlzIG1lYW50
IHRvIGluZGljYXRlDQp0aGF0IHRoaXMgcHJvamVjdCBpcyBpbml0aWFsbHkg
Y29uY2VybmVkIHdpdGggbWVkaXVtLXNpemVkIHRlYW1zICh1cA0KdG8gYSBk
b3plbiBvciB0d28gcHJvZ3JhbW1lcnMpIGFuZCBtZWRpdW0tdGVybSB0aW1l
c2NhbGVzIChhIHllYXIgb3INCnR3bykuDQo8L0xJPg0KDQo8TEk+PEVNPkhv
dyBkaWQgdGhlIHByb2plY3QgZ2V0IHN0YXJ0ZWQ/PC9FTT4NCjxCUj4NClRo
ZSBwcm9qZWN0IGhhcyBpdHMgb3JpZ2lucyBpbiBhIDxBDQpIUkVGPSJodHRw
Oi8vd3d3LmFjbC5sYW5sLmdvdi9zYy9yZXNvdXJjZXMvY3NlL2luZGV4Lmh0
bWwiPnNlcmllcyBvZg0KYXJ0aWNsZXM8L0E+IHRoYXQgR3JlZyBXaWxzb24g
b3JnYW5pemVkIGZvciB0aGUgRmFsbCAxOTk2IGFuZCBXaW50ZXINCjE5OTYg
aXNzdWVzIG9mIDxDSVRFPklFRUUgQ29tcHV0YXRpb25hbCBTY2llbmNlIGFu
ZA0KRW5naW5lZXJpbmc8L0NJVEU+LiBUaGVzZSBhcnRpY2xlcyBvdXRsaW5l
ZCB3aGF0IHRoZWlyIGF1dGhvcnMgdGhvdWdodA0KY29tcHV0ZXIgc2NpZW50
aXN0cyBzaG91bGQgdGVhY2ggdG8gcGh5c2ljYWwgc2NpZW50aXN0cyBhbmQN
CmVuZ2luZWVycy4gTW9zdCBhdXRob3JzIHJlY29tbWVuZGVkIG51bWVyaWNh
bCBtZXRob2RzIG9yIHRoZSBzdGFuZGFyZA0KVW5peCB0b29sc2V0LCBidXQg
U3RldmUgTWNDb25uZWxsIGFyZ3VlZCB0aGF0IGJldHRlciBwcm9ncmFtbWlu
Zw0KcHJhY3RpY2VzIHdvdWxkIGhhdmUgdGhlIGdyZWF0ZXN0IGltcGFjdCBv
biBwcm9kdWN0aXZpdHkuDQoNCjxCUj4gQXMgYSByZXN1bHQgb2YgdGhhdCBv
YnNlcnZhdGlvbiwgR3JlZyBXaWxzb24sIEJyZW50IEdvcmRhLCBhbmQNClN0
ZXZlIE1jQ29ubmVsbCBwdXQgdG9nZXRoZXIgYSAzLWRheSBjb3Vyc2Ugb24g
c29mdHdhcmUgZW5naW5lZXJpbmcNCmZvciBzY2llbnRpc3RzIGFuZCBlbmdp
bmVlcnMsIHdoaWNoIHRoZXkgdGF1Z2h0IHNldmVyYWwgdGltZXMgYXQgdGhl
DQpMb3MgQWxhbW9zIE5hdGlvbmFsIExhYm9yYXRvcnkuIEZlZWRiYWNrIG9u
IHRoZSBjb3Vyc2Ugd2FzIHZlcnkNCnBvc2l0aXZlLCBidXQgbWFueSBwYXJ0
aWNpcGFudHMgZmVsdCB0aGF0IHRoZSB0b29scyBiZWluZw0KdGF1Z2h0LS0t
UGVybCwgTWFrZSwgQ1ZTLCBhbmQgc28gb24tLS13ZXJlIHVubmVjZXNzYXJp
bHkgZGlmZmljdWx0IHRvDQppbnN0YWxsLCBsZWFybiwgYW5kIHVzZS4gVGhl
eSB3ZXJlIGFsc28gZnJ1c3RyYXRlZCBieSB0aGUgc2NhcmNpdHkgb2YNCmV4
YW1wbGVzIG9mIGRlc2lnbiBkb2N1bWVudHMsIHRlc3RpbmcgcGxhbnMsIGFu
ZCBhbGwgb2YgdGhlIG90aGVyDQp0aGluZ3MgdGhlIGNvdXJzZSB3YXMgdHJ5
aW5nIHRvIHRlYWNoIHRoZW0uDQo8L0xJPg0KDQo8TEk+PEVNPldoeSBPcGVu
IFNvdXJjZT88L0VNPg0KPEJSPg0KVGhlcmUgYXJlIHRocmVlIHJlYXNvbnMg
d2h5IHRoZSBTb2Z0d2FyZSBDYXJwZW50cnkgcHJvamVjdCBpcw0KZm9sbG93
aW5nIHRoZSBPcGVuIFNvdXJjZSBtb2RlbDoNCjwvTEk+DQoNCgk8T0w+DQoN
Cgk8TEk+PEVNPkxldmVyYWdpbmcgZXhpc3Rpbmcga25vd2xlZGdlLiA8L0VN
Pg0KCTxCUj4NCglBIGNsb3NlZCBwcm9qZWN0IGNhbiBvbmx5IHRha2UgYWR2
YW50YWdlIG9mIGEgZmV3IG1pbmRzLiBBcw0KCUxpbnV4IGFuZCBvdGhlciBw
cm9qZWN0cyBoYXZlIHNob3duLCBhIHdlbGwtcnVuIE9wZW4gU291cmNlDQoJ
cHJvamVjdCBjYW4gaGFybmVzcyB0aGUgZXhwZXJpZW5jZSBhbmQgaW5zaWdo
dCBvZiB0aG91c2FuZHMgb2YNCglwZW9wbGUuDQoJPC9MST4NCg0KCTxMST48
RU0+TG93ZXJpbmcgYmFycmllcnMgdG8gYWRvcHRpb24uIDwvRU0+DQoJPEJS
Pg0KCUZyZWVseS1hdmFpbGFibGUgdG9vbHMgYXJlIG1vcmUgbGlrZWx5IHRv
IGJlIHBpY2tlZCB1cCB0aGFuDQoJdGhlaXIgY29tbWVyY2lhbCBlcXVpdmFs
ZW50cy4gVGhpcyBpcyBwYXJ0aWN1bGFybHkgdHJ1ZSB3aGVuDQoJdGhlIHRv
b2wgaW4gcXVlc3Rpb24gZG9lcyBzb21ldGhpbmcgbm92ZWwgKGF0IGxlYXN0
IGZyb20gdGhlDQoJcG9pbnQgb2YgdGhlIHBlcnNvbiBhZG9wdGluZyBpdCks
IGFuZCBpbiBhY2FkZW1pYSAod2hlcmUNCglidWRnZXRzIGFyZSBsaW1pdGVk
KS4NCgk8L0xJPg0KDQoJPExJPjxFTT5FbmNvdXJhZ2luZyBwZWVyIHJldmll
dy48L0VNPg0KCTxCUj4NCglEYW4gR2V6ZWx0ZXKScyA8YQ0KCWhyZWY9Imh0
dHA6Ly93d3cub3BlbnNjaWVuY2Uub3JnL3RhbGsvYm5sL2luZGV4Lmh0bWwi
PnRhbGs8L2E+DQoJYXQgdGhlIGZpcnN0IE9wZW4gU291cmNlL09wZW4gU2Np
ZW5jZSBjb25mZXJlbmNlIGRpc2N1c3NlZCBob3cNCgl0aGUgc2NpZW50aWZp
YyB0cmFkaXRpb24gb2YgcGVlciByZXZpZXcgZml0cyB3aXRoIHRoZQ0KCXBo
aWxvc29waHkgb2YgdGhlIE9wZW4gU291cmNlIG1vdmVtZW50LiBCeSBkZXNp
Z25pbmcgYW5kDQoJYnVpbGRpbmcgdGhlc2UgdG9vbHMgaW4gdGhlIG9wZW4s
IHRoZSBTb2Z0d2FyZSBDYXJwZW50cnkNCglwcm9qZWN0IHdpbGwgYm90aCBl
bmNvdXJhZ2UgcGVlciByZXZpZXcgb2YgdGhlIHRvb2xzDQoJdGhlbXNlbHZl
cywgYW5kIGRlbW9uc3RyYXRlIGhvdyB0aGlzIG91Z2h0IHRvIGJlIGRvbmUg
Zm9yDQoJc2NpZW50aWZpYyBhbmQgY29tbWVyY2lhbCBzb2Z0d2FyZS4NCgk8
L0xJPg0KDQoJPC9PTD4NCg0KPExJPjxFTT5XaGVyZSBkb2VzIHRoZSBmdW5k
aW5nIGNvbWUgZnJvbT8gPC9FTT4NCjxCUj4NClRoZSBmdW5kaW5nIGNvbWVz
IGZyb20gdGhlIFUuUy4gRGVwYXJ0bWVudCBvZiBFbmVyZ3ksIHRocm91Z2gg
dGhlDQpBZHZhbmNlZCBDb21wdXRpbmcgTGFib3JhdG9yeSBhdCBMb3MgQWxh
bW9zIE5hdGlvbmFsIExhYm9yYXRvcnkuIFRoZQ0KcHJvamVjdCBpcyBiZWlu
ZyBhZG1pbmlzdGVyZWQgYnkgQ29kZSBTb3VyY2VyeS4gVVMkNDgwLDAwMCBo
YXMgYmVlbg0KcHJvdmlkZWQgZm9yIDIwMDAsIGFuZCBVUyQzODAsMDAwIGZv
ciAyMDAxLg0KPC9MST4NCg0KPExJPjxFTT5XaHkgd291bGQgdGhlIERlcGFy
dG1lbnQgb2YgRW5lcmd5IGZ1bmQgc29tZXRoaW5nIGxpa2UgdGhpcz88L0VN
Pg0KPEJSPg0KVGhlIGZ1bmRpbmcgaGFzIGJlZW4gcHJvdmlkZWQgcGFydGx5
IGJlY2F1c2UgdGhlIERvRSB3b3VsZCBsaWtlDQpzY2llbnRpc3RzIGFuZCBl
bmdpbmVlcnMgdG8gYmUgbW9yZSBwcm9kdWN0aXZlLCBhbmQgcGFydGx5IGJl
Y2F1c2UgaXQNCndvdWxkIGxpa2UgdG8gZmluZCBvdXQgd2hldGhlciB0aGUg
T3BlbiBTb3VyY2UgbW9kZWwgYW5kIGNvbW11bml0eSBjYW4NCm1lZXQgdGhl
IHNwZWNpYWwgbmVlZHMgb2YgaGlnaC1wZXJmb3JtYW5jZSBjb21wdXRhdGlv
bmFsIHNjaWVuY2UuIFRoZQ0KbGFzdCBmZXcgeWVhcnMgaGF2ZSBzZWVuIG1v
c3QgbWFudWZhY3R1cmVycyBvZiBzcGVjaWFsLXB1cnBvc2UNCnN1cGVyY29t
cHV0ZXJzIGRpc2FwcGVhciBvciBiZSBib3VnaHQgb3V0LCBhbmQgdGhlIHJp
c2Ugb2YgY2x1c3RlcnMNCmJhc2VkIG9uIGNvbW1lcmNpYWwgb2ZmLXRoZS1z
aGVsZiAoQ09UUykgaGFyZHdhcmUsIExpbnV4LCBNUEksIHRoZSBHTlUNCmNv
bXBpbGVyIHRvb2xzZXQsIGFuZCBzbyBvbi4gVGhlcmUgaXMgYSBncm93aW5n
IGZlZWxpbmcgdGhhdCB0aGVzZQ0KbWFjaGluZXMgY291bGQgYnJpbmcgc2Nh
bGFibGUgc3VwZXJjb21wdXRpbmcgaW50byB0aGUgbWFpbnN0cmVhbSwgYnV0
DQp0aGlzIHdpbGwgb25seSBoYXBwZW4gaWYgZ29vZCB0b29scyBhbmQgcHJh
Y3RpY2VzIGFyZSBhY2Nlc3NpYmxlDQplbm91Z2guDQo8L0xJPg0KDQo8TEk+
PEVNPkknbSBub3QgYSBzY2llbnRpc3Qgb3IgZW5naW5lZXItLS13aGF0J3Mg
aW4gaXQgZm9yIG1lPyA8L0VNPg0KPEJSPg0KVGhlIHRoaW5ncyB0aGF0IG1h
a2UgbWFueSBleGlzdGluZyBPcGVuIFNvdXJjZSBzb2Z0d2FyZSBkZXZlbG9w
bWVudA0KdG9vbHMgZGlmZmljdWx0IHRvIGxlYXJuIGFuZCB1c2UtLS1vYnNj
dXJlIHN5bnRheCwgYXJiaXRyYXJ5IG9yDQpoYXJkLXRvLWZvbGxvdyBiZWhh
dmlvciwgYW5kIHBvb3IgZG9jdW1lbnRhdGlvbi0tLWFmZmVjdCBwcm9mZXNz
aW9uYWwNCnByb2dyYW1tZXJzIGFuZCBjb21wdXRlciBzY2llbmNlIHN0dWRl
bnRzIGp1c3QgYXMgbXVjaCBhcyB0aGV5IGRvDQpjb21wdXRhdGlvbmFsIHNj
aWVudGlzdHMgYW5kIGVuZ2luZWVycy4gSWYgdGhlIE9wZW4gU291cmNlIG1v
dmVtZW50DQpjYW4gYnVpbGQgdG9vbHMgdGhhdCBhcmUgc2ltcGxlIGVub3Vn
aCB0byBiZSBsZWFybmVkIGJ5IHBlb3BsZSB3aG8NCmhhdmUgcHJvYmxlbXMg
b2YgdGhlaXIgb3duIHRvIHNvbHZlLCBhbmQgeWV0IHBvd2VyZnVsIGVub3Vn
aCB0bw0Kc3VwcG9ydCBkaXN0cmlidXRlZCBkZXZlbG9wbWVudCBvZiBodW5k
cmVkcyBvZiB0aG91c2FuZHMgb2YgbGluZXMgb2YNCmNvbXBsZXggbnVtZXJp
Y2FsIGFuZCB2aXN1YWxpemF0aW9uIGNvZGUsIHRoZW4gdGhvc2UgdG9vbHMg
d2lsbA0KcHJvYmFibHkgYWxzbyBoZWxwIHBlb3BsZSB3aG8gd2FudCB0byBi
dWlsZCBJbnRlcm5ldCBjaGF0IHJvb21zIGFuZA0Kb3JkZXItdHJhY2tpbmcg
c3lzdGVtcy4NCjxCUj4NClRoaXMgcHJvamVjdCBzaG91bGQgYWxzbyBiZSBp
bnRlcmVzdGluZyB0byB0aGUgZ2VuZXJhbCBwcm9ncmFtbWluZw0KY29tbXVu
aXR5IGJlY2F1c2UgaXQgaXMgZ29pbmcgdG8gcGxhY2UgbW9yZSBlbXBoYXNp
cyBvbiBkZXNpZ24gYW5kDQplYXJseSBmZWVkYmFjayB0aGFuIG1vc3QgT3Bl
biBTb3VyY2UgcHJvamVjdHMgaGF2ZSB0byBkYXRlLiBJbnN0ZWFkIG9mDQpn
cm93aW5nIHNvbWVvbmWScyBwZXQgcHJvamVjdCwgU29mdHdhcmUgQ2FycGVu
dHJ5IGlzIGdvaW5nIHRvDQpvcmdhbml6ZS0tLWFuZCBwYXkgZm9yLS0tYSBk
ZXNpZ24gY29tcGV0aXRpb24uIElmIHRoaXMgd29ya3MsIGl0IGNvdWxkDQpi
ZSBhbiBpbnRlcmVzdGluZyBtb2RlbCBmb3Igb3RoZXIgT3BlbiBTb3VyY2Ug
cHJvamVjdHMgdG8gYWRvcHQuDQo8L0xJPg0KDQo8TEk+PEVNPkkgdGhpbmsg
W3Rvb2xdIGlzIGdvb2QgZW5vdWdoIGFscmVhZHktLS13aHkgYXJlIHlvdSBy
ZS1pbnZlbnRpbmcgdGhlIHdoZWVsPyA8L0VNPg0KPEJSPg0KVGhlIHNob3J0
IGFuc3dlciB0byB0aGlzIGlzIEFsYW4gQ29vcGVyJ3M6DQoNCg0KCTxCTE9D
S1FVT1RFPg0KCVRoZSBwaHJhc2UgImNvbXB1dGVyIGxpdGVyYXRlIHVzZXIi
IHJlYWxseSBtZWFucyB0aGUgcGVyc29uDQoJaGFzIGJlZW4gaHVydCBzbyBt
YW55IHRpbWVzIHRoYXQgdGhlIHNjYXIgdGlzc3VlIGlzIHRoaWNrDQoJZW5v
dWdoIHNvIGhlIG5vIGxvbmdlciBmZWVscyB0aGUgcGFpbi4NCgk8QlI+DQoJ
LS0gQWxhbiBDb29wZXIsDQoJPENJVEU+VGhlIElubWF0ZXMgYXJlIFJ1bm5p
bmcgdGhlIEFzeWx1bTwvQ0lURT4NCgk8L0JMT0NLUVVPVEU+DQoNClRoZSBs
b25nZXIgYW5zd2VyIGlzIHRoYXQgdGhlICJhY2NpZGVudGFsIGNvbXBsZXhp
dHkiIG9mIHRoZSBzdGFuZGFyZA0KVW5peCBjb21tYW5kLWxpbmUgdG9vbHNl
dCBpcyBhIG1ham9yIGJhcnJpZXIgdG8gaXRzIGFkb3B0aW9uIGJ5IHBlb3Bs
ZQ0Kd2hvIGFyZSBub3QgZnVsbC10aW1lIHByb2dyYW1tZXJzLCBvciBmb3Ig
d2hvbSBwcm9ncmFtbWluZyBpcyBqdXN0DQpzb21ldGhpbmcgdGhhdCBoYXMg
dG8gYmUgZG9uZSBpbiBvcmRlciB0byBkbyBzb21ldGhpbmcgZWxzZS4gTWFu
eQ0KcHJvZmVzc2lvbmFsIHByb2dyYW1tZXJzLS0tcGFydGljdWxhcmx5IHRo
b3NlIHdobyBlbmpveSBwcm9ncmFtbWluZw0KZW5vdWdoIHRvIGJlIGludm9s
dmVkIGluIHRoZSBPcGVuIFNvdXJjZSBtb3ZlbWVudC0tLWhhdmUgYmVlbiB1
c2luZw0KdGhlc2UgdG9vbHMgZm9yIHNvIGxvbmcgdGhhdCB0aGV5IHNpbXBs
eSBkb24ndCByZW1lbWJlciBob3cgaGFyZCBpdCBpcw0KdG8gY29uZmlndXJl
IEduYXRzLCBvciBwYXNzIHZhcmlhYmxlIGJpbmRpbmdzIGJldHdlZW4gcmVj
dXJzaXZlIGNhbGxzDQp0byBNYWtlLg0KPEJSPg0KQW5kIGxldCdzIGZhY2Ug
aXQ6IGlmIE1ha2Ugb3IgQXV0b2NvbmYgd2VyZSBidWlsdCBmcm9tIHNjcmF0
Y2ggdG9kYXksDQp0aGV5IHdvdWxkIGJlIHdyaXR0ZW4gYXMgZXh0ZW5zaWJs
ZSwgZW1iZWRkYWJsZSBtb2R1bGVzIGluIGENCmhpZ2gtbGV2ZWwgc2NyaXB0
aW5nIGxhbmd1YWdlLiBUaGlzIHdvdWxkIG5vdCBvbmx5IG1ha2UgdGhlbSBl
YXNpZXIgdG8NCnVzZSwgaXQgd291bGQgYWxzbyBtYWtlIHRoZW0gZWFzaWVy
IHRvIGxlYXJuLCBzaW5jZSB0aGV5IHdvdWxkIGVtcGxveQ0Kb25lIHN5bnRh
eCBmb3IgYWxsIHB1cnBvc2VzLiBNaWNyb3NvZnQgVmlzdWFsIEJhc2ljIGhh
cyBzaG93biBqdXN0IGhvdw0KdXNlZnVsIGl0IGNhbiBiZSB0byBoYXZlIGEg
c2luZ2xlIGdlbmVyYWwtcHVycG9zZSAiZ2x1ZSIgbGFuZ3VhZ2UNCmNhcGFi
bGUgb2YgYmluZGluZyBkaXNwYXJhdGUgdG9vbHMgdG9nZXRoZXI7IHRoZSBh
aW0gb2YgdGhlIGZpcnN0IGhhbGYNCm9mIHRoaXMgcHJvamVjdCBpcyB0byBi
cmluZyB0aG9zZSBiZW5lZml0cyB0byB0aGUgT3BlbiBTb3VyY2UNCmNvbW11
bml0eS4NCg0KPC9PTD4NCg0KPEgyPkRldmVsb3BtZW50PC9IMj4NCg0KPE9M
Pg0KDQo8TEk+PEVNPldoYXQgcHJvamVjdHMgYXJlIGN1cnJlbnRseSB1bmRl
ciB3YXk/IDwvRU0+DQo8QlI+U29mdHdhcmUgQ2FycGVudHJ5IHdpbGwgc3Rh
cnQgYnkgcHJvZHVjaW5nOg0KPC9MST4NCg0KCTxPTD4NCg0KCTxMST5hIHBs
YXRmb3JtIGluc3BlY3Rpb24gdG9vbCBzaW1pbGFyIHRvIEF1dG9jb25mOzwv
TEk+DQoNCgk8TEk+YSBidWlsZCBtYW5hZ2VtZW50IHRvb2wgc2ltaWxhciB0
byBNYWtlOzwvTEk+DQoNCgk8TEk+YW4gaXNzdWUgdHJhY2tpbmcgc3lzdGVt
IHNpbWlsYXIgdG8gR25hdHMgb3IgQnVnemlsbGE7IGFuZDwvTEk+DQoNCgk8
TEk+YSB1bml0IGFuZCByZWdyZXNzaW9uIHRlc3RpbmcgaGFybmVzcyB3aXRo
IHRoZQ0KCWZ1bmN0aW9uYWxpdHkgb2YgWFVuaXQsIEV4cGVjdCwgYW5kIERl
amFHbnUuPC9MST4NCg0KCTwvT0w+DQoNCjxMST48RU0+V2h5IHdlcmUgdGhv
c2UgdG9vbHMgY2hvc2VuPyA8L0VNPg0KPEJSPg0KVGhlc2UgZm91ciB0b29s
cyB3ZXJlIGNob3NlbiBhcyBpbml0aWFsIHRhcmdldHMgZm9yIHNldmVyYWwN
CnJlYXNvbnMuIEZpcnN0LCB0aGUgd29ya2luZyBwcmFjdGljZXMgdGhleSBz
dXBwb3J0IGFyZSBlc3NlbnRpYWwgdG8NCm1lZGl1bS1zY2FsZSBzb2Z0d2Fy
ZSBlbmdpbmVlcmluZy4gU2Vjb25kLCB0aGUgdG9vbHMgdGhleSBhcmUgaW50
ZW5kZWQNCnRvIHJlcGxhY2UgYXJlIGdlbmVyYWxseSByZWNvZ25pemVkIGFz
IGJlaW5nIG91dGRhdGVkIG9yIGZsYXdlZC4gVGhpcw0KY3JlYXRlcyBkZW1h
bmQsIGFuZCBpbmNyZWFzZXMgdGhlIG9kZHMgdGhhdCByYXRpb25hbCByZWlt
cGxlbWVudGF0aW9ucw0Kd2lsbCBiZSBhZG9wdGVkLiBUaGlyZCwgZW5vdWdo
IHBlb3BsZSBoYXZlIGVub3VnaCBleHBlcmllbmNlIHdpdGggdGhlDQp0b29s
cyB0aGF0IGFyZSB0byBiZSByZXBsYWNlZCB0byBwYXJ0aWNpcGF0ZSBpbiB0
aGUgZGVzaWduIGNvbXBldGl0aW9uDQpkZXNjcmliZWQgbGF0ZXIuDQo8L0xJ
Pg0KDQo8TEk+PEVNPldoeSBpc26SdCBbdG9vbF0gb24gdGhpcyBsaXN0Pzwv
RU0+DQo8QlI+DQpUaGVyZSBhcmUgc2V2ZXJhbCBvdGhlciB0b29scyB0aGF0
IGNvdWxkIGhhdmUgYmVlbiBvbiB0aGlzIGxpc3QsIGFuZA0Kd2lsbCBiZSBh
ZGRlZCBpZiB0aGUgZmlyc3Qgcm91bmQgb2Ygd29yayBnb2VzIHdlbGwuIEEg
Y3Jvc3MtcGxhdGZvcm0NCnZlcnNpb24gY29udHJvbCBzeXN0ZW0gdGhhdCBj
b3JyZWN0cyB0aGUgbWFueSBkZWZpY2llbmNpZXMgaW4gQ1ZTLCBmb3INCmV4
YW1wbGUsIGlzIGFuIG9idmlvdXMgY2FuZGlkYXRlLCBidXQgaXMgcHJvYmFi
bHkgdG9vIGxhcmdlIHRvIGJlDQp0YWNrbGVkIGluaXRpYWxseSwgYW5kIGFu
eSB3b3JrIGRvbmUgYnkgU29mdHdhcmUgQ2FycGVudHJ5IGNvdWxkIHdlbGwN
CmJlIHN1cGVyc2VkZWQgYnkgQml0S2VlcGVyLiBTaW1pbGFybHksIHRoZSB3
b3JsZCBuZWVkcyBhIGdvb2QgT3Blbg0KU291cmNlIHByb2plY3QgbWFuYWdl
bWVudCB0b29sIHdpdGggdGhlIGZ1bmN0aW9uYWxpdHkgb2YgTWljcm9zb2Z0
DQpQcm9qZWN0LCBidXQgcHJvYmFibHkgbmVlZHMgdGhlIGZvdXIgdG9vbHMg
bGlzdGVkIGFib3ZlIG1vcmUgdXJnZW50bHkuDQo8L0xJPg0KDQo8TEk+PEVN
PldoYXQgbGFuZ3VhZ2VzIGFuZCB0b29scyB3aWxsIGJlIHVzZWQ/IDwvRU0+
DQo8QlI+DQpBbGwgZGV2ZWxvcG1lbnQgd29yayB3aWxsIGJlIGRvbmUgaW4g
UHl0aG9uLg0KPC9MST4NCg0KPExJPjxFTT5XaHkgUHl0aG9uPyA8L0VNPg0K
PEJSPg0KVGhpcyBpcyBhY3R1YWxseSB0aHJlZSBxdWVzdGlvbnM6DQoNCgk8
T0w+DQoNCgk8TEk+PEVNPldoeSBtYW5kYXRlIGEgbGFuZ3VhZ2U/IDwvRU0+
DQoJPEJSPg0KCUJ1aWxkaW5nIGV2ZXJ5dGhpbmcgaW4gYSBzaW5nbGUgbGFu
Z3VhZ2Ugd2lsbCBlbmNvdXJhZ2UNCglwcm9qZWN0cyB0byBzaGFyZSBjb2Rl
LCB3aGljaCB3aWxsIGJvdGgga2VlcCB0aGUgdG90YWwgdm9sdW1lDQoJb2Yg
Y29kZSBtYW5hZ2VhYmxlIGFuZCByYWlzZSB0aGUgcXVhbGl0eSBvZiB0aGUN
CglpbXBsZW1lbnRhdGlvbnMgKHNpbmNlIHRoZSBzaGFyZWQgY29kZSB3aWxs
IGJlIGV4ZXJjaXNlZCwgYW5kDQoJdGVzdGVkLCBpbiBtYW55IGRpZmZlcmVu
dCB3YXlzKS4gVXNpbmcgYSBzaW5nbGUgbGFuZ3VhZ2Ugd2lsbA0KCWFsc28g
aW1wcm92ZSB0aGUgY29tcHJlaGVuc2liaWxpdHksIGFuZCBoZW5jZSB0aGUN
CgltYWludGFpbmFiaWxpdHkgYW5kIGV4dGVuc2liaWxpdHksIG9mIHRoZSB0
b29scy4gVGhlIHZhcnlpbmcNCglzeW50YXggb2YgTWFrZSwgQXV0b2NvbmYs
IGFuZCBvdGhlciB0b29scyBpcyBhIGxhcmdlIHByYWN0aWNhbA0KCWJhcnJp
ZXIgdG8gdGhlaXIgYWRvcHRpb24gYnkgcGVvcGxlIHdobyBoYXZlIGJldHRl
ciAob3IgYXQNCglsZWFzdCBtb3JlIHByZXNzaW5nKSB0aGluZ3MgdG8gZG8g
dGhhbiBsZWFybiB5ZXQgYW5vdGhlcg0KCXN5bnRheC4gTWljcm9zb2Z0knMg
VmlzdWFsIEJhc2ljIGhhcyBzaG93biBob3cgcG93ZXJmdWwgaXQNCglpcyB0
byB1c2UgYSBzaW5nbGUsIGZsZXhpYmxlIGxhbmd1YWdlIGV2ZXJ5d2hlcmUu
DQoJPC9MST4NCg0KCTxMST48RU0+V2h5IHVzZSBhIHNjcmlwdGluZyBsYW5n
dWFnZT8gPC9FTT4NCgk8QlI+DQoJQSBsb3Qgb2YgYW5lY2RvdGFsIGV2aWRl
bmNlIHNob3dzIHRoYXQgInJlbGF4ZWQiIGhpZ2gtbGV2ZWwNCglsYW5ndWFn
ZXMgKGxpa2UgUHl0aG9uLCBQZXJsLCBhbmQgVmlzdWFsIEJhc2ljKSBhcmUg
bW9yZQ0KCXByb2R1Y3RpdmUgdmVoaWNsZXMgZm9yIHByb2Nlc3MgbWFuYWdl
bWVudCwgdGV4dCBwcm9jZXNzaW5nLA0KCWFuZCBzaW1pbGFyIHRhc2tzIHRo
YW4gdGhlaXIgInN0cmljdCIgZXF1aXZhbGVudHMgKGxpa2UgQysrDQoJYW5k
IEphdmEpLg0KCTwvTEk+DQoNCgk8TEk+PEVNPldoeSB1c2UgUHl0aG9uPyA8
L0VNPg0KCTxCUj4NCglUaGUgZm91ciBjYW5kaWRhdGVzIGNvbnNpZGVyZWQg
d2VyZSBWaXN1YWwgQmFzaWMsIFBlcmwsIFRjbCwNCglhbmQgUHl0aG9uLg0K
DQoJCTxPTD4NCg0KCQk8TEk+PEVNPlZpc3VhbCBCYXNpYyA8L0VNPg0KCQk8
QlI+DQoJCVZpc3VhbCBCYXNpYyBpcyBwcm9wcmlldGFyeSwgYW5kIHRoZXJl
IGlzIG5vDQoJCWluZGljYXRpb24gdGhhdCBhIGNyZWRpYmxlIE9wZW4gU291
cmNlIGltcGxlbWVudGF0aW9uDQoJCXdpbGwgYXBwZWFyIGFueSB0aW1lIHNv
b24uDQoJCTwvTEk+DQoNCgkJPExJPjxFTT5QZXJsPC9FTT4NCgkJPEJSPg0K
CQlQZXJsIHdhcyBhIHN0cm9uZyBjb250ZW5kZXIsIHByaW1hcmlseSBiZWNh
dXNlIG9mIHRoZQ0KCQltYW55IGxpYnJhcmllcyB0aGF0IGhhdmUgYmVlbiBk
ZXZlbG9wZWQgZm9yIGl0LCBhbmQNCgkJYmVjYXVzZSBvZiB0aGUgbnVtYmVy
IG9mIGJvb2tzIHRoYXQgZG9jdW1lbnQNCgkJaXQuIEhvd2V2ZXIsIG91ciBl
eHBlcmllbmNlIHRlYWNoaW5nIGF0IExvcyBBbGFtb3Mgd2FzDQoJCXRoYXQg
UGVybJJzIHN5bnRheCBpcyBoYXJkIHRvIGxlYXJuLCBpdHMgYmVoYXZpb3IN
CgkJb2Z0ZW4gYXJiaXRyYXJ5LCBhbmQgaXRzIHNpemUgaW50aW1pZGF0aW5n
LiBXaGlsZQ0KCQlmdWxsLXRpbWUgcHJvZmVzc2lvbmFsIHByb2dyYW1tZXJz
IHdpdGggc2V2ZXJhbCBvdGhlcg0KCQlsYW5ndWFnZXMgdW5kZXIgdGhlaXIg
YmVsdHMgbWlnaHQgKGFuZCBvZnRlbiBkbykgc2F5DQoJCXRoYXQgaXQgYWxs
IG1ha2VzIHNlbnNlIG9uY2UgeW91IGtub3cgaXQsIHdlIHdhbnQgdG8NCgkJ
bWFrZSB0aGUgbGVhcm5pbmcgY3VydmUgYXMgZ2VudGxlIGFzIHBvc3NpYmxl
Lg0KCQk8L0xJPg0KDQoNCgkJPExJPjxFTT5UY2w8L0VNPg0KCQk8QlI+DQoJ
CVRjbCBpcyBlYXNpZXIgdG8gbGVhcm4gYW5kIHJlYWQgdGhhbiBQZXJsLCBi
dXQgaXMgbm90DQoJCWFzIHdlbGwgZG9jdW1lbnRlZCwgYW5kIGRvZXNuknQg
Y29tZSB3aXRoIGFzIG1hbnkNCgkJbGlicmFyaWVzLiBIYWQgUHl0aG9uIG5v
dCBleGlzdGVkLCBUY2wgd291bGQgcHJvYmFibHkNCgkJaGF2ZSBiZWVuIGNo
b3NlbiBmb3IgdGhpcyBwcm9qZWN0Lg0KCQk8L0xJPg0KDQoJCTxMST48RU0+
UHl0aG9uPC9FTT4NCgkJPEJSPg0KCQlQeXRob24gcHJvdmlkZXMgdGhlIHNh
bWUgZnVuY3Rpb25hbGl0eSBhcyBQZXJsIG9yIFRjbCwNCgkJYnV0IGhhcyBw
cm92ZWQgdG8gYmUgZWFzaWVyIHRvIGxlYXJuLCByZWFkLCBhbmQNCgkJcmVt
ZW1iZXIuIChGb3IgZXhhbXBsZSwgd29yZHMgbGlrZSAiZXhjZXB0IiBhbmQN
CgkJInVubGVzcyIgYXBwZWFyIG11Y2ggbGVzcyBvZnRlbiBpbiBQeXRob24g
cmVmZXJlbmNlDQoJCW1hdGVyaWFsIHRoYW4gdGhleSBkbyBpbiBQZXJsIHJl
ZmVyZW5jZSBtYXRlcmlhbC4pDQoJCVB5dGhvbiBpcyBub3QgeWV0IGFzIGV4
dGVuc2l2ZWx5IGRvY3VtZW50ZWQgYXMgUGVybCwNCgkJYnV0IHRoZSBudW1i
ZXIgb2YgYm9va3MgaXMgZ3Jvd2luZywgYXMgaXMgdGhlIG51bWJlcg0KCQlv
ZiBtb2R1bGVzIGFuZCBsaWJyYXJpZXMuIEZpbmFsbHksIHRoZSBQeXRob24N
CgkJY29tbXVuaXR5IGlzIHN0aWxsIHNtYWxsIGVub3VnaCBmb3IgYSBwcm9q
ZWN0IGxpa2UNCgkJdGhpcyBvbmUgdG8gYXR0cmFjdCB0aGUgYXR0ZW50aW9u
IG9mIGEgc2lnbmlmaWNhbnQNCgkJcHJvcG9ydGlvbiBvZiBpdC4NCgkJPC9M
ST4NCg0KCQk8L09MPg0KCTwvTEk+DQoJPC9PTD4NCg0KPC9MST4NCg0KPExJ
PjxFTT5Ib3cgd2lsbCBkZXZlbG9wbWVudCBiZSBvcmdhbml6ZWQgYW5kIGNv
b3JkaW5hdGVkPyA8L0VNPg0KPEJSPg0KRXZlcnl0aGluZyB0aGUgcHJvamVj
dCBwcm9kdWNlcy0tLWRlc2lnbnMsIGNyaXRpcXVlcyBvZiB0aG9zZSBkZXNp
Z25zLA0KdGVzdCBzdWl0ZXMsIGFuZCBleGFtcGxlcywgYXMgd2VsbCBhcyBh
Y3R1YWwgc291cmNlIGNvZGUtLS13aWxsIGJlDQphdmFpbGFibGUgdGhyb3Vn
aCB0aGUgcHJvamVjdJJzIFdlYiBzaXRlIGF0DQpzb2Z0d2FyZS1jYXJwZW50
cnkuY29kZXNvdXJjZXJ5LmNvbS4gRWFjaCBwcm9qZWN0IHdpbGwgaGF2ZSBh
DQpjb29yZGluYXRvciwgd2hvc2Ugam9iIGl0IHdpbGwgYmUgdG8gbW9kZXJh
dGUgZGlzY3Vzc2lvbiwgc3luY2hyb25pemUNCnJlbGVhc2VzLCB0cmFjayB3
b3JrIGl0ZW1zLCBhbmQgcmVwb3J0IG9uIHByb2dyZXNzLiBUaGUgY29vcmRp
bmF0b3INCndpbGwgYWxzbyBiZSByZXNwb25zaWJsZSBmb3IgY29sbGF0aW5n
IGFuZCBlZGl0aW5nIGZlZWRiYWNrIGZyb20NCmp1ZGdlcyBkdXJpbmcgdGhl
IGRlc2lnbiBjb21wZXRpdGlvbi4NCjwvTEk+DQoNCjwvT0w+DQoNCjxIMj5E
ZXNpZ24gY29tcGV0aXRpb248L0gyPg0KDQo8T0w+DQoNCjxMST48RU0+V2h5
IGEgZGVzaWduIGNvbXBldGl0aW9uPzwvRU0+DQo8QlI+DQpNb3N0IE9wZW4g
U291cmNlIHBhY2thZ2VzIGhhdmUgdGhlaXIgcm9vdHMgaW4gc29tZW9uZZJz
IHBldCBob2JieQ0KcHJvamVjdCwgd2hpY2ggb3RoZXJzIGhhdmUgcGlja2Vk
IHVwLCBleHRlbmRlZCwgYW5kIG1vZGlmaWVkLiBUaGlzDQpraW5kIG9mIG9y
Z2FuaWMgZ3Jvd3RoIGhhcyBhIGxvdCBvZiBnb29kIGZlYXR1cmVzLCBidXQg
YQ0Kd2VsbC1kb2N1bWVudGVkIGRlc2lnbiBpcyBub3Qgb25lIG9mIHRoZW0u
IEFzIGEgcmVzdWx0LCBwcm9ncmFtbWVycw0Kb2Z0ZW4gaGF2ZSB0byByZWx5
IG9uIGZvbGtsb3JlIGFuZCByZXZlcnNlIGVuZ2luZWVyaW5nIGlmIHRoZXkg
d2FudCB0bw0KYWRkIHRvLCBvciBmaXgsIHRoZXNlIHRvb2xzLiBJbiBhZGRp
dGlvbiwgdGhlcmUgaXMgYSBkZWFydGggb2YNCmV4YW1wbGVzIG9mIGdvb2Qg
ZGVzaWduIGZvciBuZXcgcHJvZ3JhbW1lcnMgdG8gbGVhcm4gZnJvbS4gPEJS
PiBUaGUNClNvZnR3YXJlIENhcnBlbnRyeSBwcm9qZWN0IGhvcGVzIHRvIGFk
ZHJlc3MgYm90aCBwcm9ibGVtcyBieSBydW5uaW5nIGENCnR3by1zdGFnZSBk
ZXNpZ24gY29tcGV0aXRpb24uIFRoZSBiZXN0IGVudHJpZXMgaW4gYm90aCBy
b3VuZHMgd2lsbCBiZQ0KcHVibGlzaGVkLCBhbG9uZyB3aXRoIGNvbW1lbnRh
cnkgZnJvbSB0aGUgY29tcGV0aXRpb26Scw0KanVkZ2VzLiBUaGlzIG1hdGVy
aWFsIHdpbGwgc2VydmUgYm90aCB0byBpbmZvcm0gYW5kIGd1aWRlIGZ1cnRo
ZXINCmRldmVsb3BtZW50LCBhbmQgdG8gc2hvdyBub3ZpY2VzIHdoYXQgZXhw
ZXJpZW5jZWQgcHJvZ3JhbW1lcnMgdGhpbmsNCmFib3V0IGJlZm9yZSB0aGV5
IHN0YXJ0IGNvZGluZy4NCjwvTEk+DQoNCjxMST48RU0+V2hvIGNhbiBlbnRl
cj8gPC9FTT4NCjxCUj4NCkV2ZXJ5b25lOiBpbmRpdmlkdWFscyBhbmQgdGVh
bXMsIHN0dWRlbnRzIGFuZCBwcm9mZXNzaW9uYWxzLCBmcm9tDQphbnl3aGVy
ZSBpbiB0aGUgd29ybGQuDQo8L0xJPg0KDQo8TEk+PEVNPldoYXQgYXJlIHRo
ZSBydWxlcz8gPC9FTT4NCjxCUj5UaGUgZnVsbCBydWxlcyBhcmUgYXZhaWxh
YmxlIGF0Og0KPENFTlRFUj4NCnNvZnR3YXJlLWNhcnBlbnRyeS5jb2Rlc291
cmNlcnkuY29tL2Rlc2lnbi1jb21wZXRpdGlvbi9ydWxlcy5odG1sDQo8L0NF
TlRFUj4NCkJhc2ljYWxseSwgaW5pdGlhbCBzdWJtaXNzaW9ucyBtdXN0IGJl
IHdyaXR0ZW4gaW4gRW5nbGlzaCwgYW5kIGNhbiBiZQ0KdXAgdG8gMTAgcGFn
ZXMgbG9uZy4gRXhhbXBsZXMgY291bnQgYWdhaW5zdCB0aGlzIGxpbWl0LCBi
dXQgZGlhZ3JhbXMNCmFuZCBhIFVuaXgtc3R5bGUgbWFuIHBhZ2UgZG8gbm90
LiBBbnkgcGVyc29uIG9yIHRlYW0gbWF5IHN1Ym1pdCBvbmx5DQpvbmUgZW50
cnkgaW4gYW55IGdpdmVuIGNhdGVnb3J5LCBidXQgY2FuIHN1Ym1pdCBpbiBh
cyBtYW55IG9mIHRoZSBmb3VyDQpjYXRlZ29yaWVzIGFzIGRlc2lyZWQuDQo8
QlI+DQpUaGUgYmVzdCBmb3VyIGVudHJpZXMgaW4gZWFjaCBjYXRlZ29yeSB3
aWxsIGJlIGF3YXJkZWQgVVMkMjUwMCwgYW5kDQphc2tlZCB0byBzdWJtaXQg
ZnVsbCBkZXNpZ25zLiBQYXJ0aWNpcGFudHMgd2lsbCBiZSBzdHJvbmdseSBl
bmNvdXJhZ2VkDQp0byBwb29sIHRoZWlyIGVmZm9ydHMgZm9yIHRoZSBzZWNv
bmQgcm91bmQuIFRoZSBiZXN0IHNlY29uZC1yb3VuZA0Kc3VibWlzc2lvbiB3
aWxsIGJlIGF3YXJkZWQgYW4gYWRkaXRpb25hbCBVUyQ3NTAwLCB3aGlsZSB0
aGUgb3RoZXJzDQp3aWxsIHJlY2VpdmUgYW5vdGhlciBVUyQyNTAwIGVhY2gu
IFRoZSByZWFsIHJld2FyZCB3aWxsIGJlIHNlZWluZyB0aGUNCmRlc2lnbiBp
bXBsZW1lbnRlZCwgYW5kIGJlaW5nIGluIGEgZ29vZCBwb3NpdGlvbiB0byBi
aWQgb24gdGhlDQppbXBsZW1lbnRhdGlvbiB3b3JrLg0KPC9MST4NCg0KPExJ
PjxFTT5XaGF0IHNob3VsZCBmaXJzdC1yb3VuZCBzdWJtaXNzaW9ucyBjb250
YWluPyA8L0VNPg0KPEJSPg0KQW4gZXhhbXBsZSBvZiB3aGF0IGEgc3VibWlz
c2lvbiBzaG91bGQgY29udGFpbiwgYW5kIGhvdyBpdCBzaG91bGQgYmUNCmZv
cm1hdHRlZCBpcyBhdmFpbGFibGUgYXQ6DQo8Q0VOVEVSPg0Kc29mdHdhcmUt
Y2FycGVudHJ5LmNvZGVzb3VyY2VyeS5jb20vZGVzaWduLWNvbXBldGl0aW9u
L2V4YW1wbGUuaHRtbA0KPENFTlRFUj4NCkZpcnN0LXJvdW5kIGVudHJpZXMg
c2hvdWxkIGZvY3VzIHByaW1hcmlseSBvbiB3aGF0IHRoZSB0b29sIHdpbGwg
ZG8sDQphbmQgaG93IGl0IHdpbGwgYmUgdXNlZDogY29tbWFuZC1saW5lIG9w
dGlvbnMsIGlucHV0IGFuZCBvdXRwdXQgZmlsZQ0KZm9ybWF0cywgc2tldGNo
ZXMgb2YgV2ViIGFuZCBHVUkgaW50ZXJmYWNlcyAod2hlcmUgYXBwcm9wcmlh
dGUpLCBhbmQNCnNvIG9uLiBTZWNvbmQtcm91bmQgc3VibWlzc2lvbnMgd2ls
bCB0aGVuIGJlIGV4cGVjdGVkIHRvIGRlc2NyaWJlIGhvdw0KaXSScyBhbGwg
Z29pbmcgdG8gYmUgaW1wbGVtZW50ZWQuDQo8L0xJPg0KDQo8TEk+PEVNPldo
byB3aWxsIHRoZSBqdWRnZXMgYmU/IDwvRU0+DQo8QlI+DQo8Qj5OZWVkIHRv
IGZpcm0gdXAgdGhlIGxpc3Qgb2YganVkZ2VzIEFTQVAuPC9CPg0KPC9MST4N
Cg0KPExJPjxFTT5XaGVuIGFyZSB0aGUgZGVhZGxpbmVzPyA8L0VNPg0KPEJS
Pg0KVGhlIGRlYWRsaW5lIGZvciBmaXJzdC1yb3VuZCBzdWJtaXNzaW9ucyBp
cyBNYXJjaCAzMSwgMjAwMC4gVGhlIGZpdmUNCmJlc3QgcHJvcG9zYWxzIGlu
IGVhY2ggY2F0ZWdvcnkgd2lsbCBiZSBhbm5vdW5jZWQgb24gQXByaWwgMzAs
DQoyMDAwLiBGdWxsIHN1Ym1pc3Npb25zIGFyZSBkdWUgb24gSnVuZSAxLCAy
MDAwLCBhbmQgd2lubmVycyB3aWxsIGJlDQphbm5vdW5jZWQgb24gSnVuZSAz
MCwgMjAwMC4NCjwvTEk+DQoNCjxMST48RU0+V29uJ3QgcHJpemVzIGRpc2Nv
dXJhZ2UgY28tb3BlcmF0aW9uPyA8L0VNPg0KPEJSPg0KV2UgZG9uknQga25v
dy4gT24gdGhlIG9uZSBoYW5kLCBwZW9wbGUgbWlnaHQgd2FudCB0byBob2Fy
ZCB0aGVpcg0KYmVzdCBpZGVhczsgb24gdGhlIG90aGVyIGhhbmQsIHRoZSBi
ZXN0IGRlc2lnbnMgaW4gYm90aCByb3VuZHMgYXJlDQpnb2luZyB0byBiZSBw
dWJsaXNoZWQsIGFsb25nIHdpdGggdGhlIGp1ZGdlc5IgY29tbWVudGFyeSwg
YW5kIHdlDQp3aWxsIGJlIGVuY291cmFnaW5nIHBhcnRpY2lwYW50cyB0byBw
b29sIHRoZWlyIGVmZm9ydHMuIE1vc3Qgb2YgdGhlDQptb25leSB0aGF0IHdp
bGwgYmUgcGFpZCBvdXQgd2lsbCBnbyB0byBmdW5kIGltcGxlbWVudGF0aW9u
LCB0ZXN0aW5nLA0KYW5kIGRvY3VtZW50YXRpb247IHdlIGhvcGUgdGhhdCBw
ZW9wbGUgd2lsbCBjb2xsYWJvcmF0ZSBpbiB0aGUgZWFybHkNCnN0YWdlcywg
YW5kIHRyZWF0IHRoZSBwcml6ZXMgYXMgcmVjb2duaXRpb24gZm9yIHRoZWly
IGVmZm9ydCwgcmF0aGVyDQp0aGFuIHRyZWF0aW5nIFVTJDEwLDAwMCBhcyB0
aGVpciByZXRpcmVtZW50IGZ1bmQuDQo8L0xJPg0KDQo8L09MPg0KDQo8SDI+
RG9jdW1lbnRhdGlvbjwvSDI+DQoNCjxPTD4NCg0KPExJPjxFTT5XaGF0IGRv
Y3VtZW50YXRpb24gd2lsbCBiZSBwcm9kdWNlZD88L0VNPg0KPEJSPg0KVGhl
IFNvZnR3YXJlIENhcnBlbnRyeSBwcm9qZWN0IHdpbGwgcHJvZHVjZSBzZXZl
cmFsIGRpZmZlcmVudCBraW5kcyBvZg0KZG9jdW1lbnRhdGlvbjoNCg0KCTxP
TD4NCg0KCTxMST48RU0+RGVzaWduIGRvY3VtZW50YXRpb24uIDwvRU0+DQoJ
PEJSPg0KCUFzIHN0YXRlZCBhYm92ZSwgdGhlIGJlc3QgZGVzaWducyBpbiBl
YWNoIGNhdGVnb3J5IHdpbGwgYmUNCglwdWJsaXNoZWQsIGFsb25nIHdpdGgg
dGhlIGp1ZGdlc5IgY29tbWVudGFyeS4gVGhpcyBtYXRlcmlhbA0KCW91Z2h0
IHRvIHBsYXkgdGhlIHJvbGUgdGhhdCBtdXNpYyBjcml0aWNpc20gaGFzIHBs
YXllZCBpbiB0aGUNCglkZXZlbG9wbWVudCBvZiBtdXNpYywgYnkgZ2l2aW5n
IG5ld2NvbWVycyAoYW5kIGV4cGVyaWVuY2VkDQoJcHJvZ3JhbW1lcnMpIGJl
dHRlciBpbnNpZ2h0IGludG8gaG93IGdvb2QgZGVzaWduZXJzIHRoaW5rLg0K
CTwvTEk+DQoNCgk8TEk+PEVNPlVzZXIgZ3VpZGVzLiA8L0VNPg0KCTxCUj4N
CglUaGUgcHJvamVjdCB3aWxsIHBheSBmb3IgdGhlIGRldmVsb3BtZW50IG9m
IG1hbiBwYWdlcywgdXNlcg0KCWd1aWRlcywgb25saW5lIGhlbHAsIGFuZCBh
bGwgdGhlIG90aGVyIGRvY3VtZW50YXRpb24gbmVlZGVkIHRvDQoJdHVybiBh
IHByb2dyYW0gaW50byBhIHByb2R1Y3QuDQoJPC9MST4NCg0KCTxMST48RU0+
VGVzdCBzdWl0ZXMuIDwvRU0+DQoJPEJSPg0KCVRoZSBwcm9qZWN0IHdpbGwg
YWxzbyBwYXkgZm9yIHRoZSBkZXZlbG9wbWVudCBvZg0KCWluZHVzdHJpYWwt
c3RyZW5ndGggdGVzdCBzdWl0ZXMgZm9yIGFsbCBmb3VyIHRvb2xzLiBUaGVz
ZQ0KCXN1aXRlcyB3aWxsIGJlIHB1Ymxpc2hlZCwgYm90aCB0byBzZXJ2ZSBh
cyBhIHN0YXJ0aW5nIHBvaW50DQoJZm9yIG90aGVyIHByb2plY3RzIGFuZCB0
byBkZW1vbnN0cmF0ZSBnb29kIHByYWN0aWNlLg0KCTwvTEk+DQoNCgk8TEk+
PEVNPkNhc2Ugc3R1ZGllcy4gPC9FTT4NCgk8QlI+DQoJSXQgaXMgb2Z0ZW4g
ZWFzaWVyIHRvIHNob3cgc29tZW9uZSBob3cgdG8gZG8gc29tZXRoaW5nIHRo
YW4gdG8NCglleHBsYWluIGl0IHRvIHRoZW0uIFRoZSBTb2Z0d2FyZSBDYXJw
ZW50cnkgcHJvamVjdCB3aWxsIHBheQ0KCWZvciBjYXNlIHN0dWRpZXMgdGhh
dCBkZXNjcmliZSBob3cgdGhlc2UgdG9vbHMsIGFuZCAobW9yZQ0KCWltcG9y
dGFudGx5KSB0aGUgd29ya2luZyBwcmFjdGljZXMgdGhleSBzdXBwb3J0LCBo
YXZlIGJlZW4NCglkZXBsb3llZCBpbiBwcmFjdGljZS4gQ2hlY2tsaXN0cywg
dGVtcGxhdGVzIGZvciBmb3JtcywgYW5kDQoJb3RoZXIgZXJyYXRhIGNhbiBi
ZSBzdWJtaXR0ZWQuDQoJPC9MST4NCg0KCTwvT0w+DQoNCjwvTEk+DQoNCjxM
ST48RU0+V2hhdCBmb3JtYXQocykgd2lsbCBiZSB1c2VkPyA8L0VNPg0KPEJS
Pg0KVGhlIHByaW1hcnkgZm9ybWF0IGZvciBhbGwgZG9jdW1lbnRhdGlvbiB3
aWxsIGJlIEhUTUwuIFRoZSBwcm9qZWN0DQp3aWxsIG1pZ3JhdGUgdG8gWE1M
IHdoZW4gYW5kIGFzIGZlYXNpYmxlLg0KPC9MST4NCg0KPExJPjxFTT5XaGF0
IHJlc3RyaWN0aW9ucyBhcmUgdGhlcmUgb24gdXNpbmcgdGhlIGRvY3VtZW50
YXRpb24/PC9FTT4NCjxCUj4NCk9ubHkgdGhvc2UgdGhhdCBhbHNvIGFwcGx5
IHRvIHRoZSBzb2Z0d2FyZSwgdW5kZXIgdGhlIHRlcm1zIG9mIGl0cw0KT3Bl
biBTb3VyY2UgbGljZW5zZS4gWW91IGNhbiBjb3B5IGFuZCBkaXN0cmlidXRl
IHRoZSBkb2N1bWVudGF0aW9uIGluDQphbnkgZm9ybSwgYnV0IG9ubHkgaWYg
aXRzIGF1dGhvcihzKSBhbmQgb3JpZ2luIGFyZSBjbGVhcmx5IHNob3duLCBh
bmQNCmlmIHlvdSBpbmNsdWRlIGEgZGVzY3JpcHRpb24gb2YgaG93IHJlYWRl
cnMgY2FuIGFjY2VzcyB0aGUNCm9yaWdpbmFscy4gSW4gcGFydGljdWxhciwg
dGhlIGRvY3VtZW50YXRpb24gY2FuIGJlIHJlcHJvZHVjZWQgaW4NCmJvb2tz
LCBidXQgb25seSBpZiB0aGUgYXV0aG9ycywgb3JpZ2luLCBhbmQgbG9jYXRp
b24gb2YgdGhlIG9yaWdpbmFscw0KaXMgcHJpbnRlZCBjbGVhcmx5IG9uIGVh
Y2ggcGFnZS4NCjwvTEk+DQoNCjwvT0w+DQoNCjwvQk9EWT4NCjwvSFRNTD4N
Cg==
--168427786-691315853-945920620=:4839--


From jack@oratrix.nl  Thu Dec 23 10:24:26 1999
From: jack@oratrix.nl (Jack Jansen)
Date: Thu, 23 Dec 1999 11:24:26 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib
 getopt.py,1.7,1.8
In-Reply-To: Message by Guido van Rossum <guido@CNRI.Reston.VA.US> ,
 Wed, 22 Dec 1999 13:23:45 -0500 , <199912221823.NAA16517@eric.cnri.reston.va.us>
Message-ID: <19991223102426.CCB75370CF2@snelboot.oratrix.nl>

> Vladimir.Marangozov@inrialpes.fr:
> 
> > Yes. Besides, I still think that string-based exceptions are just
> > convenient for quick & dirty, throw-away test scripts.
> 
> They have a hard-to-understand quirk though: the id() of the string is
> used to check rather than its value, so that except "foo" doesn't
> necessarily catch raise "foo"; but due to various optimizations, this
> usually works, and people get bent out of shape when it doesn't.

I sort-of use this feature when I'm debugging: if I want to know what happens 
in an exception that is usually caught somewhere higher up in the call stack I 
simply put quotes around the exception name and the exception will happen 
uncaught. The same trick works for except: clauses.
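
A minimal sketch of the identity-versus-value quirk, written for a current
Python in which string exceptions no longer exist.  It only models the
matching rule described above; the helper caught_by_identity and the names
FOO_ERROR and lookalike are hypothetical, and the "distinct object" result
relies on CPython behaviour.

```python
def caught_by_identity(raised, handler):
    # Models the old rule for string exceptions: match on the same
    # object (identity), not on equal text.
    return raised is handler

FOO_ERROR = "foo"                    # a module-level "exception" string
lookalike = "".join(["f", "oo"])     # equal text, built at run time

same_text = (FOO_ERROR == lookalike)
matches_itself = caught_by_identity(FOO_ERROR, FOO_ERROR)
# In CPython the run-time string is a separate object, so this is False.
matches_lookalike = caught_by_identity(lookalike, FOO_ERROR)
```

Equal text with a different identity is what lets a quoted name slip past
a handler that would otherwise catch it.
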
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From harri.pasanen@trema.com  Thu Dec 23 11:44:04 1999
From: harri.pasanen@trema.com (Harri Pasanen)
Date: Thu, 23 Dec 1999 13:44:04 +0200
Subject: [Python-Dev] Re: [PSA MEMBERS] Please test new dynamic load behavior
References: <Pine.LNX.4.10.9912221200290.16305-100000@nebula.lyra.org>
Message-ID: <38620B04.7CC64485@trema.com>


Greg Stein wrote:
> 
> Hi all,
> 
> I reorganized Python's dynamic load/import code over the past few days.
> Guido provided some feedback, I did some more mods, and now it is checked
> into CVS. The new loading behavior has been tested on Linux, IRIX, and
> Solaris (and probably Windows by now).
> 

...


What was the motivation behind this modification?

Just curious,

-Harri


From Vladimir.Marangozov@inrialpes.fr  Thu Dec 23 12:12:40 1999
From: Vladimir.Marangozov@inrialpes.fr (Vladimir Marangozov)
Date: Thu, 23 Dec 1999 13:12:40 +0100 (CET)
Subject: [Python-Dev] Please test new dynamic load behavior
In-Reply-To: <Pine.LNX.4.10.9912221200290.16305-100000@nebula.lyra.org> from "Greg Stein" at Dec 22, 1999 12:11:43 PM
Message-ID: <199912231212.NAA26572@python.inrialpes.fr>

Greg Stein wrote:
> 
> Hi all,
> 
> I reorganized Python's dynamic load/import code over the past few days.
> Guido provided some feedback, I did some more mods, and now it is checked
> into CVS. The new loading behavior has been tested on Linux, IRIX, and
> Solaris (and probably Windows by now).
> 

Great work Greg!

> Here are some of the platforms that I believe need specific testing:
> 
> - NetBSD, FreeBSD, OpenBSD, ...
> - AIX
> - HP/UX
> - BeOS
> - NeXT
> - Mac
> - OS/2
> - Win16

AFAICT, the AIX version works perfectly okay.

-- 
       Vladimir MARANGOZOV          | Vladimir.Marangozov@inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252


From jim@digicool.com  Thu Dec 23 14:41:23 1999
From: jim@digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 09:41:23 -0500
Subject: [Python-Dev] str(1L) -> '1' ?
Message-ID: <38623493.E6BA6D6F@digicool.com>

In November there was an interesting discussion on comp.lang.python 
about the meaning of __str__ and __repr__.  One tidbit that came out
of this discussion was that __str__ for longs should drop the trailing 
'L'. Was there a decision on this? I'd really like this to happen.

We do a lot of work with RDBMS systems and long integers seem to
come up a lot with these systems (as do other fixed-decimal numbers,
but that's another topic ;).  For example, our latest Sybase and
Oracle support in Zope returns long integers for RDBMS types
like NUMBER(10,0).  The trailing 'L' in the string representation
is causing us some headaches.  This seems also to be an issue when
using the current standard ODBC interface with Oracle, as indicated
in a DB-SIG post today.
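
A small sketch of the behaviour being asked for, assuming a hypothetical
stand-in class (LongLike) rather than the real long type, so that it runs
on a current Python where the 'L' suffix is gone.  It shows __str__
dropping the suffix while __repr__ keeps it.

```python
class LongLike:
    """Hypothetical stand-in for a 1.5-era long integer."""

    def __init__(self, value):
        self.value = value

    def __repr__(self):
        # Unambiguous form: keeps the trailing 'L'.
        return "%dL" % self.value

    def __str__(self):
        # Display form: plain digits, safe to paste into SQL text.
        return "%d" % self.value

n = LongLike(1234567890123)
sql_fragment = "WHERE id = %s" % n    # %s goes through __str__
debug_fragment = "id=%r" % n          # %r goes through __repr__
```

With this split, code that builds SQL with %s gets plain digits, while
debugging output can still tell the value came from a long.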

Jim

--
Jim Fulton           mailto:jim@digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission.  Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.


From guido@CNRI.Reston.VA.US  Thu Dec 23 14:46:58 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 23 Dec 1999 09:46:58 -0500
Subject: [Python-Dev] str(1L) -> '1' ?
In-Reply-To: Your message of "Thu, 23 Dec 1999 09:41:23 EST."
 <38623493.E6BA6D6F@digicool.com>
References: <38623493.E6BA6D6F@digicool.com>
Message-ID: <199912231446.JAA22086@eric.cnri.reston.va.us>

[Jim F]
> In November there was an interesting discussion on comp.lang.python 
> about the meaning of __str__ and __repr__.  One tidbit that came out
> of this discussion was that __str__ for longs should drop the trailing 
> 'L'. Was there a decision on this? I'd really like this to happen.

Yes, I'd like it to happen.  I'd also like repr() of a float to return
the full precision (using the "%.17g" sprintf format).
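
As an illustration of why 17 significant digits matter, a short sketch
comparing it with a shorter format (the "%.12g" choice here is only an
example for contrast): 17 digits are enough to recover an IEEE double
exactly when the text is parsed back.

```python
x = 0.1

short = "%.12g" % x    # rounded for display
full = "%.17g" % x     # enough digits to identify the double exactly

# Parsing the 17-digit form gives back the very same float.
round_trips = (float(full) == x)
```

Here short is '0.1' and full is '0.10000000000000001'; the longer form is
uglier, but it loses no information.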

I haven't done it for lack of time -- feel free to send a patch (don't
forget the disclaimer from http://www.python.org/1.5/bugrelease.html).

We haven't decided yet what to do with the greater topic of that
discussion (or was it a different one?) -- whether the values printed
by typing a bare expression in interactive mode should use str(),
repr(), or str-special-casing-the-snot-out-of-strings().

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim@digicool.com  Thu Dec 23 14:51:14 1999
From: jim@digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 09:51:14 -0500
Subject: [Python-Dev] Fixed-decimal types
Message-ID: <386236E2.F97109D3@digicool.com>

While on the subject of RDBMS systems, a common need is to be able to
work with fixed-decimal data.  I think a standard Python fixed-decimal
type would help to make Python database interfaces a lot more robust.
I even wonder if the Python long type might be hijacked for this purpose
by adding a "scale" that indicates the number of digits to the right
of the decimal point.  For example, an expression like:

  1000000000.2500L

would create a fixed decimal number with a scale of 4.
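
One way to picture the idea is an ordinary integer plus a scale.  The
sketch below is only an assumption about how such a literal might be
read; parse_fixed is a hypothetical helper, not an existing function, and
it does no validation.

```python
def parse_fixed(literal):
    """Split text like '1000000000.2500L' into (unscaled integer, scale)."""
    text = literal.rstrip("L")
    whole, _, frac = text.partition(".")
    scale = len(frac)               # digits to the right of the point
    unscaled = int(whole + frac)    # the value with the point removed
    return unscaled, scale

unscaled, scale = parse_fixed("1000000000.2500L")
```

For the literal above this yields the integer 10000000002500 with a scale
of 4, i.e. the value is unscaled / 10**scale.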

People have built Python classes for fixed-decimal
types, but when working with RDBMS data, one often deals with
lots of data and efficiency matters.  I also suspect that adding
scale to longs wouldn't be that hard and would be a fairly natural
extension.

In any case, a "standard" (being in the standard library would
be sufficient) fixed-decimal type would probably lead to better
database interfaces that (at least more) properly handled 
fixed-decimal data.

Jim

--
Jim Fulton           mailto:jim@digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission.  Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.


From guido@CNRI.Reston.VA.US  Thu Dec 23 14:56:33 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 23 Dec 1999 09:56:33 -0500
Subject: [Python-Dev] Fixed-decimal types
In-Reply-To: Your message of "Thu, 23 Dec 1999 09:51:14 EST."
 <386236E2.F97109D3@digicool.com>
References: <386236E2.F97109D3@digicool.com>
Message-ID: <199912231456.JAA22134@eric.cnri.reston.va.us>

What would be the scale of the product of two fixed-decimal numbers?
E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to 4.00L?  There are
arguments for either.  Same question for division (harder, I think).

I like the idea of using the dd.ddL notation for this.

I have no time to implement it but would not be unwilling to accept
patches.  They would have to be accompanied with a wet signature, see
http://www.python.org/1.5/wetsign.html.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim@digicool.com  Thu Dec 23 15:00:25 1999
From: jim@digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 10:00:25 -0500
Subject: [Python-Dev] re: Open Source design competition / Python / software
 tools
References: <Pine.LNX.4.10.9912222242070.4839-200000@akbar.nevex.com>
Message-ID: <38623909.CDF41014@digicool.com>

gvwilson@nevex.com wrote:
> 
> Hi, folks.  I hope you don't mind another mail out of the blue, but I got
> notice on Saturday that the Department of Energy is giving me $860K over
> two years to support development of easier-to-use software engineering
> tools.  All of the work will be Open Source, and will be done in Python,
> with a strong emphasis on design, testing, and documentation.  The
> project's long-term objective is to encourage scientists and engineers to
> treat programs in the same way as they do other experiments, i.e. to
> calibrate, test, peer review, and so on.
> 
> To kick-start things, we're going to be holding a two-round design
> competition.  Anyone (individual or team, professional or student) can
> submit a short entry for the first round; the judges will pick four
> candidates to go forward in each of four categories, and those
> individuals or teams will be asked to submit full entries. The four
> categories are:
> 
> * an issue tracking system to replace Gnats and Bugzilla;
> 
> * a build system to replace make;
> 
> * a platform inspection and configuration system to replace autoconf;
>   and
> 
> * a testing framework to replace XUnit, Expect, and DejaGnu.
> 
> Would you be interested in participating in any way

Are these categories fixed? I see a very strong need for an 
open-source UML modeling tool. UML is extremely powerful, but current
UML tools largely suck and are very expensive.  We are contemplating
launching an open-source development effort to build UML modeling tools
using Zope or the Zope object database as a repository. A contest
like this could help to kick-start this effort, but tools to automate
requirements and design seem to be missing. This is odd, considering that
up-front activities like requirements and design have the largest impact
on software-engineering project success.

Jim

--
Jim Fulton           mailto:jim@digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    



From andy@robanal.demon.co.uk  Thu Dec 23 15:13:22 1999
From: andy@robanal.demon.co.uk (Andy Robinson)
Date: Thu, 23 Dec 1999 07:13:22 -0800 (PST)
Subject: [Python-Dev] Fixed-decimal types
Message-ID: <19991223151322.5698.qmail@web604.mail.yahoo.com>

--- Guido van Rossum <guido@CNRI.Reston.VA.US> wrote:
> What would be scale of the product of two
> fixed-decimal numbers?
> E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to
> 4.00L?  There are
> arguments for either.  Same question for division
> (harder, I think).
Most commonly one is trying to avoid rounding errors
when dealing with money - a few cents rounding error
tends to result in a few billable hours with the
accountants at the end of the year!

SQL dialects and type-safe languages would make you
specify the precision of the variable to be assigned,
so the issue does not arise for other languages.  

For the work I do, simply taking the precision of the
most precise input (4.00L) would do the trick, but your
answer (4.0000L) is purer.  We should provide a
rounding function, and in practice anyone using such a
function would round (or floor, or ceiling) to get to
the desired precision immediately.

I'm not sure on division either but I'm sure there are
precedents to look at.

On the subject of adding new types to the standard
library, what are the plans on dates and times?  Would
a cut-down mxDateTime ever be considered?  It is fully
Open Source (unlike mxODBC) and was designed for the
DBAPI.

Regards,

Andy

=====
Andy Robinson
Robinson Analytics Ltd.
------------------
My opinions are the official policy of Robinson Analytics Ltd.
They just vary from day to day.



From guido@CNRI.Reston.VA.US  Thu Dec 23 15:23:43 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 23 Dec 1999 10:23:43 -0500
Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types)
In-Reply-To: Your message of "Thu, 23 Dec 1999 07:13:22 PST."
 <19991223151322.5698.qmail@web604.mail.yahoo.com>
References: <19991223151322.5698.qmail@web604.mail.yahoo.com>
Message-ID: <199912231523.KAA22232@eric.cnri.reston.va.us>

> On the subject of adding new types to the standard
> library, what are the plans on dates and times?  Would
> a cut-down mxDateTime ever be considered?  It is fully
> Open Source (unlike mxODBC) and was designed for the
> DBAPI.

I don't know much about date/time types, or about mxDateTime.
My intuition is that there are too many ways to do it, and that being
compatible with commercial databases may not be the right way to do it
for core Python.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Thu Dec 23 15:27:59 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 23 Dec 1999 10:27:59 -0500 (EST)
Subject: [Python-Dev] str(1L) -> '1' ?
In-Reply-To: <38623493.E6BA6D6F@digicool.com>
References: <38623493.E6BA6D6F@digicool.com>
Message-ID: <14434.16255.58344.646524@weyr.cnri.reston.va.us>

Jim Fulton writes:
 > In November there was an interesting discussion on comp.lang.python 
 > about the meaning of __str__ and __repr__.  One tidbit that came out
 > of this discussion was that __str__ for longs should drop the trailing 
 > 'L'. Was there a decision on this? I'd really like this to happen.

  I liked that result as well, and thought about it just the other
day.  Luckily, you sent a note this morning and made me think about
it again.  I'll have something checked into CVS shortly.  ;)


  -Fred

--
Fred L. Drake, Jr.	  <fdrake at acm.org>
Corporation for National Research Initiatives


From Mike.Da.Silva@uk.fid-intl.com  Thu Dec 23 16:30:07 1999
From: Mike.Da.Silva@uk.fid-intl.com (Da Silva, Mike)
Date: Thu, 23 Dec 1999 16:30:07 -0000
Subject: [Python-Dev] Fixed Decimal types
Message-ID: <DBF3B37F7BF1D111B2A10000F6B14B1FDDAF86@ukhil704nts.hld.uk.fid-intl.com>

	Andy Robinson wrote:
		For the work I do, simply taking the precision of the
		most precise input (4.00L) would do the trick, but your
		answer (4.0000L) is purer.  We should provide a
		rounding function, and in practice anyone using such a
		function would round (or floor, or ceiling) to get to
		the desired precision immediately.

		I'm not sure on division either but I'm sure there are
		precedents to look at.

	The AS400 provides a useful example of the right way to do scaled
decimals.

	In the RPG programming language, all internal calculations (i.e.
multiplication, division) are performed to the maximum precision of the
intermediate result; in the multiplication example above, the intermediate
result would be 4.0000L.  When the intermediate result is assigned to the
target scaled decimal number, the decimal precision is automatically
extended or truncated to fit the target precision.  One extra wrinkle in all
of this is the option to "half-adjust" the intermediate value on assignment;
that is to apply automatic 5/4 rounding to the precision of the target.

	So, if the target field is defined as numeric(4,2), the result will
be 4.00L.
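
The assignment step described above can be sketched in present-day Python
with the standard decimal module, which postdates this thread.  This is only
an illustration of the RPG-style rule, not a proposed implementation: the
helper name assign() and its half_adjust flag are made up for the example.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

def assign(value, scale, half_adjust=False):
    """Fit `value` to a target field with `scale` digits after the point.

    Hypothetical helper: truncates by default, or applies 5/4 rounding
    when half_adjust is true.
    """
    quantum = Decimal(1).scaleb(-scale)      # scale=2 -> Decimal('0.01')
    rounding = ROUND_HALF_UP if half_adjust else ROUND_DOWN
    return value.quantize(quantum, rounding=rounding)

# The intermediate keeps full precision; the target decides the final scale.
intermediate = Decimal("2.00") * Decimal("2.00")
result = assign(intermediate, 2)

# Truncation versus half-adjust on a value that does not fit exactly.
truncated = assign(Decimal("1.005"), 2)
rounded = assign(Decimal("1.005"), 2, half_adjust=True)
```

Here the intermediate carries four decimals and the assignment brings it
back to two; the last two calls differ only in the rounding mode.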

	These are probably the kind of semantics that a scaled decimal type
would require in Python also; i.e. allow unlimited precision in intermediate
calculations, with a sensible set of rules for assignment to a variable of
different scale and precision.

	However, unlike RPG, we should probably ensure that attempts to
overflow or underflow the scale result in NaN or Overflow conditions, rather
than assuming the user is right and losing the significant digits.

	Regards,
	Mike da Silva


From jim@digicool.com  Thu Dec 23 16:37:10 1999
From: jim@digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 11:37:10 -0500
Subject: [Python-Dev] Fixed-decimal types
References: <386236E2.F97109D3@digicool.com> <199912231456.JAA22134@eric.cnri.reston.va.us>
Message-ID: <38624FB6.ED903F@digicool.com>

Guido van Rossum wrote:
> 
> What would be scale of the product of two fixed-decimal numbers?
> E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to 4.00L?  There are
> arguments for either.  Same question for division (harder, I think).

I'd be inclined to start by doing some research to see if some standard
(SQL?) defines this somewhere.  It would be nice if someone has already 
done the requirements work for us. :)

> I like the idea of using the dd.ddL notation for this.
> 
> I have no time to implement

Me neither.

> it but would not be unwilling to accept patches. 

Cool.  If no one else volunteers, then I'll try to find a way
to get this done (not necessarily by me). I think it is pretty
important.

> They would have to be accompanied with a wet signature, see
> http://www.python.org/1.5/wetsign.html.

Yup.

Jim

--
Jim Fulton           mailto:jim@digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    



From andy@robanal.demon.co.uk  Thu Dec 23 16:38:50 1999
From: andy@robanal.demon.co.uk (Andy Robinson)
Date: Thu, 23 Dec 1999 08:38:50 -0800 (PST)
Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types)
Message-ID: <19991223163850.15619.qmail@web604.mail.yahoo.com>

Sorry, should have replied to the list...

--- Andy Robinson <captainrobbo@yahoo.com> wrote:
> Date: Thu, 23 Dec 1999 08:37:18 -0800 (PST)
> From: Andy Robinson <captainrobbo@yahoo.com>
> Reply-to: andy@robanal.demon.co.uk
> Subject: Re: [Python-Dev] Date and timetypes (was:
> Fixed-decimal types)
> To: Guido van Rossum <guido@CNRI.Reston.VA.US>
> 
> --- Guido van Rossum <guido@CNRI.Reston.VA.US>
> wrote:
> > I don't know much about date/time types, or about
> > mxDateTime.
> > My intuition is that there are too many ways to do
> > it, and that being
> > compatible with commercial databases may not be
> the
> > right way to do it
> > for core Python.
> > 
> 
> OK.  Let me rephrase it.  Say we form a consensus on
> 'the right way'.  Are you amenable to some solution
> which goes back before 1970 and after 2038 going
> into
> the standard library?
> 
> And does your answer change if it involves some
> compiled code as well?  
> 
> I mention mxDateTime because it was agreed by a
> Python
> SIG, is mature and stable, and I find it very
> useful. 
> And the core type is pretty small - much of the
> helper
> stuff in the package now could be kept separate from
> the main Python distribution.  
> 
> - Andy
> 
> 


=====
Andy Robinson
Robinson Analytics Ltd.
------------------
My opinions are the official policy of Robinson Analytics Ltd.
They just vary from day to day.



From guido@CNRI.Reston.VA.US  Thu Dec 23 16:42:33 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 23 Dec 1999 11:42:33 -0500
Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types)
In-Reply-To: Your message of "Thu, 23 Dec 1999 08:38:50 PST."
 <19991223163850.15619.qmail@web604.mail.yahoo.com>
References: <19991223163850.15619.qmail@web604.mail.yahoo.com>
Message-ID: <199912231642.LAA22598@eric.cnri.reston.va.us>

> > OK.  Let me rephrase it.  Say we form a consensus on 'the right
> > way'.  Are you amenable to some solution which goes back before
> > 1970 and after 2038 going into the standard library?

No problem.

> > And does your answer change if it involves some
> > compiled code as well?

I'd rather not.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@mojam.com (Skip Montanaro)  Thu Dec 23 17:05:52 1999
From: skip@mojam.com (Skip Montanaro)
Date: Thu, 23 Dec 1999 11:05:52 -0600 (CST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: <199912212305.SAA13722@eric.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
 <14432.594.33416.600794@weyr.cnri.reston.va.us>
 <199912212305.SAA13722@eric.cnri.reston.va.us>
Message-ID: <14434.22128.639699.738932@dolphin.mojam.com>

    Guido> (The next step would be to outlaw raise with a string argument; I
    Guido> think I can't make that for 1.6.  But it would be a good idea to
    Guido> scan the standard library for string exceptions and convert all
    Guido> of them.)

Agreed.  I know Zope uses (at least, my Zope-using code uses) stuff like 

    raise 'Redirect', url

to map names onto HTTP response codes.  Makes it easier on people to
remember names instead of numeric codes.  I suspect it will take the Zopers
a while to convert to using class-based exceptions if they haven't already.
(For all I know I may be using a deprecated feature.)
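
For comparison, a class-based equivalent of the string exception above might
look like the sketch below.  It is written in modern Python syntax (the
"except ... as" form did not exist in 1.5), and the Redirect class, its
status attribute and handler() are hypothetical names for illustration, not
anything Zope is known to define.

```python
class Redirect(Exception):
    """Hypothetical class-based stand-in for the string exception 'Redirect'."""

    status = 302  # assumed HTTP response code for this name

    def __init__(self, url):
        Exception.__init__(self, url)
        self.url = url

def handler():
    # Replaces:  raise 'Redirect', url
    raise Redirect("http://www.python.org/")

try:
    handler()
except Redirect as exc:
    # Matching is by class, so it does not depend on object identity.
    location = exc.url
    code = exc.status
```

The name-to-code mapping stays readable, since the class name carries the
name and a class attribute carries the numeric code.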

Skip


From gvwilson@nevex.com  Thu Dec 23 17:24:05 1999
From: gvwilson@nevex.com (gvwilson@nevex.com)
Date: Thu, 23 Dec 1999 12:24:05 -0500 (EST)
Subject: [Python-Dev] re: Open Source design competition / Python /
 software  tools
In-Reply-To: <38623909.CDF41014@digicool.com>
Message-ID: <Pine.LNX.4.10.9912231219380.12516-100000@akbar.nevex.com>

Hi, everyone.  I'm sending my reply to Jim's message to the whole
python-dev list; I'll send follow-ups to individuals if people would
prefer.

> > * an issue tracking system to replace Gnats and Bugzilla;
> > 
> > * a build system to replace make;
> > 
> > * a platform inspection and configuration system to replace autoconf;
> >   and
> > 
> > * a testing framework to replace XUnit, Expect, and DejaGnu.

> Jim Fulton asked:
> Are these categories fixed?

For the first round, yes --- I have to prove that this model can solve
small problems before I'll be given the funding to tackle larger ones, and
I think that a UML modeling tool is definitely "large" :-).  I also have
to demonstrate uptake, and I think more people will adopt a sane
replacement for Autoconf in the next 18 months than would adopt a UML
modeler.  However, decent Open Source CASE tools are very (very) high on
my personal list --- if this works, I'd like to tackle them (along with
providing support for DDD, and a few other things like that).

Greg


From gstein@lyra.org  Thu Dec 23 18:26:44 1999
From: gstein@lyra.org (Greg Stein)
Date: Thu, 23 Dec 1999 10:26:44 -0800 (PST)
Subject: [Python-Dev] Re: Please test new dynamic load behavior
In-Reply-To: <38620B04.7CC64485@trema.com>
Message-ID: <Pine.LNX.4.10.9912231022280.16305-100000@nebula.lyra.org>

On Thu, 23 Dec 1999, Harri Pasanen wrote:
> Greg Stein wrote:
> > Hi all,
> > 
> > I reorganized Python's dynamic load/import code over the past few days.
> > Guido provided some feedback, I did some more mods, and now it is checked
> > into CVS. The new loading behavior has been tested on Linux, IRIX, and
> > Solaris (and probably Windows by now).
> 
> ...
> 
> What was the motivation behind this modification?

Harri -

With the new code structure, it is much easier to maintain Python's
loading code.

Each platform has its own file (e.g. dynload_aix.c) rather than being all
jammed together into importdl.c. This isn't a huge win by itself, but does
increase readability/maintainability. The big improvement, however, is
when you are adding support for new platforms or loading mechanisms. A new
dynload_*.c can be written and one line added to configure.in, and you're
done. No need to make importdl.c even uglier.  (actually, importdl.c no
longer contains *any* platform specific code; it has all been moved to the
dynload_*.c files)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From jim@digicool.com  Thu Dec 23 19:39:37 1999
From: jim@digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 14:39:37 -0500
Subject: [Python-Dev] Fixed-decimal types
References: <386236E2.F97109D3@digicool.com> <199912231456.JAA22134@eric.cnri.reston.va.us> <38624FB6.ED903F@digicool.com>
Message-ID: <38627A79.BF379672@digicool.com>

Jim Fulton wrote:
> 
> Guido van Rossum wrote:
> >
> > What would be scale of the product of two fixed-decimal numbers?
> > E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to 4.00L?  There are
> > arguments for either.  Same question for division (harder, I think).
> 
> I'd be inclined to start by doing some research to see if some standard
> (SQL?) defines this somewhere.  It would be nice if someone has already
> done the requirements work for us. :)

Here is what the book "SQL-99 Complete, Really" says that the SQL
standard says:

  - for addition and subtraction of two "exact" (fixed-decimal)
    numbers, the result has the maximum of the scales.

  - for multiplication of two "exact" (fixed-decimal)
    numbers, the result has the sum of the scales.

  - punts on division

  - for addition, subtraction, multiplication or division
    between "exact" (fixed point) and "approximate" (floating point)
    yields an approximate result.  This means that fixed-decimal
    coerces to float.
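
As a rough illustration of these scale rules, the sketch below uses the
decimal module from later Python releases, whose exact results happen to
follow the same max-of-scales and sum-of-scales behaviour.  The scale()
helper is a made-up name; the module itself is not what was being proposed
in this thread.

```python
from decimal import Decimal

def scale(d):
    """Number of digits to the right of the decimal point (hypothetical helper)."""
    return -d.as_tuple().exponent

a = Decimal("3.1")     # scale 1
b = Decimal("2.01")    # scale 2

total = a + b          # addition: maximum of the scales
product = a * b        # multiplication: sum of the scales

# Mixing with floating point: the decimal module refuses implicit mixing
# in arithmetic, so the coercion to float is spelled out explicitly here.
mixed = float(a) + 0.5
```

Division is left out on purpose, since the rules quoted above do not define
a result scale for it.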

I'm curious to see who else chips in with examples from other systems.

Jim

--
Jim Fulton           mailto:jim@digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    



From jim@digicool.com  Thu Dec 23 19:43:41 1999
From: jim@digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 14:43:41 -0500
Subject: [Python-Dev] Fixed Decimal types
References: <DBF3B37F7BF1D111B2A10000F6B14B1FDDAF86@ukhil704nts.hld.uk.fid-intl.com>
Message-ID: <38627B6D.447A9553@digicool.com>

"Da Silva, Mike" wrote:
> 
>         Andy Robinson wrote:
>                 For the work I do, simply taking the precision of the
>                 most precise input (4.00L) would do the trick, but your
>                 answer (4.0000L) is purer.  We should provide a
>                 rounding function, and in practice anyone using such a
>                 function would round (or floor, or ceiling) to get to
>                 the desired precision immediately.
> 
>                 I'm not sure on division either but I'm sure there are
>                 precedents to look at.
> 
>         The AS400 provides a useful example of the right way to do scaled
> decimals.
> 
>         In the RPG programming language, all internal calculations (i.e.
> multiplication, division) are performed to the maximum precision of the
> intermediate result (in the multiplication example below), the intermediate
> result would be 4.0000L.  When the intermediate result is assigned to the
> target scaled decimal number, the decimal precision is automatically
> extended or truncated to fit the target precision.  One extra wrinkle in all
> of this is the option to "half-adjust" the intermediate value on assignment;
> that is to apply automatic 5/4 rounding to the precision of the target.

Yee ha! This is great input. Anyone have any other examples of what
any other systems do? Anyone got a PL/I manual handy? ;)

>         So, if the target field is defined as numeric(4,2), the result will
> be 4.00L.

Since Python doesn't have typed variables, this is not an issue
internally, but would be an issue when binding to external databases.

>         These are probably the kind of semantics that a scaled decimal type
> would require in Python also; i.e. allow unlimited precision in intermediate
> calculations, with a sensible set of rules for assignment to a variable of
> different scale and precision.
> 
>         However, unlike RPG, we should probably ensure that attempts to
> overflow or underflow the scale result in NaN or Overflow conditions, rather
> than assuming the user is right and losing the significant digits.

Since this would be based on infinite-precision numbers, I don't
think that this would be an issue.

Jim

--
Jim Fulton           mailto:jim@digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    



From guido@CNRI.Reston.VA.US  Thu Dec 23 19:44:36 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 23 Dec 1999 14:44:36 -0500
Subject: [Python-Dev] Fixed-decimal types
In-Reply-To: Your message of "Thu, 23 Dec 1999 14:39:37 EST."
 <38627A79.BF379672@digicool.com>
References: <386236E2.F97109D3@digicool.com> <199912231456.JAA22134@eric.cnri.reston.va.us> <38624FB6.ED903F@digicool.com>
 <38627A79.BF379672@digicool.com>
Message-ID: <199912231944.OAA23337@eric.cnri.reston.va.us>

Jim Fulton wrote:

>   - for addition and subtraction of two "exact" (fixed-decimal)
>     numbers, the result has the maximum of the scales.

One could argue that this is incorrect: if "3.1" means that I know the
value to one decimal of precision, and "2.01" means that I know that
value to two decimals of precision, stating the result of their sum as
"5.11" suggests that I know the result to two decimals of precision,
which is of course false: because I only knew one decimal of precision
for one of the operands, I only know (at most!) one decimal of
precision for the result.
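
A small sketch of this alternative reading, again using the later decimal
module purely for illustration: the sum is computed exactly and then cut
back to the scale of the less precise operand.  The function name
add_known_digits() and the choice of round-half-even are assumptions made
for the example.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def add_known_digits(x, y):
    """Add, then keep only as many decimals as the less precise operand has."""
    # Exponents are negative for fractional digits, so max() picks the
    # operand with the fewest decimals.
    exponent = max(x.as_tuple().exponent, y.as_tuple().exponent)
    quantum = Decimal(1).scaleb(exponent)
    return (x + y).quantize(quantum, rounding=ROUND_HALF_EVEN)

result = add_known_digits(Decimal("3.1"), Decimal("2.01"))
```

Under this rule the sum is reported with one decimal rather than two.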

Not arguing for this interpretation, just indicating that doing fixed
precision arithmetic right is hard.  I'm waiting for Tim Peters'
contribution, but he's on vacation so it may be a while.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm@hypernet.com  Thu Dec 23 20:48:56 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Thu, 23 Dec 1999 15:48:56 -0500
Subject: [Python-Dev] Fixed Decimal types
In-Reply-To: <38627B6D.447A9553@digicool.com>
Message-ID: <1266141247-31971518@hypernet.com>

Jim Fulton wrote:
> "Da Silva, Mike" wrote:

[AS400 RPG rules...]

> Yee ha! This is great input. Anyone have any other examples of
> what any other systems do? Anyone got a PL/I manual handy. ;)

From memory of IBM COBOL and SQL, the rules for 
intermediates seem similar to what Mike outlines. In both 
cases, the target is pre-specified, and I think by default you 
get auto-rounding.

Tim's BCD class seems to always return the higher precision 
on an arithmetic op, although the intermediate is full precision.
 
>> However, unlike RPG, we should probably ensure 
>> that attempts to overflow or underflow the scale 
>> result in NaN or Overflow conditions, rather
>> than assuming the user is right and losing 
>> the significant digits.
 
> Since this would be based on infinite-precision numbers, I don't
> think that this would be an issue.

It's an issue if the result of an arithmetic op is other than "full" 
precision. The issue certainly comes up when you e.g. talk to 
a DB, and it might be better to have it come up sooner rather 
than later.
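
One way to make the problem surface early is to check the fit explicitly,
as in the sketch below.  It uses the later decimal module, and fit() with
its (precision, scale) arguments is a hypothetical helper modelled on a
numeric(p,s) column, not an existing API.

```python
from decimal import Decimal, ROUND_HALF_UP

def fit(value, precision, scale):
    """Round `value` to `scale` decimals; complain if it needs too many digits."""
    rounded = value.quantize(Decimal(1).scaleb(-scale), rounding=ROUND_HALF_UP)
    if len(rounded.as_tuple().digits) > precision:
        raise OverflowError(
            "%s does not fit numeric(%d,%d)" % (value, precision, scale))
    return rounded

ok = fit(Decimal("4.0000"), 4, 2)        # fits numeric(4,2)

try:
    fit(Decimal("123.456"), 4, 2)        # needs five digits after rounding
    overflowed = False
except OverflowError:
    overflowed = True
```

The significant digits are never silently dropped; the caller gets an error
at the point of assignment instead of when the value reaches the database.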

- Gordon


From jim@digicool.com  Thu Dec 23 22:18:37 1999
From: jim@digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 17:18:37 -0500
Subject: [Python-Dev] re: Open Source design competition / Python /software
 tools
References: <Pine.LNX.4.10.9912231219380.12516-100000@akbar.nevex.com>
Message-ID: <38629FBD.3B8F47D4@digicool.com>

gvwilson@nevex.com wrote:
> 
> Hi, everyone.  I'm sending my reply to Jim's message to the whole
> python-dev list; I'll send follow-ups to individuals if people would
> prefer.
> 
> > > * an issue tracking system to replace Gnats and Bugzilla;
> > >
> > > * a build system to replace make;
> > >
> > > * a platform inspection and configuration system to replace autoconf;
> > >   and
> > >
> > > * a testing framework to replace XUnit, Expect, and DejaGnu.
> 
> > Jim Fulton asked:
> > Are these categories fixed?
> 
> For the first round, yes 

OK.

>--- I have to prove that this model can solve
> small problems before I'll be given the funding to tackle larger ones, and
> I think that a UML modeling tool is definitely "large" :-). 

Well, since you gave a rationale ..... :)

<speech>
Isn't the Open Source community especially good at large problems?
Note that I'm thinking more in terms of an open source UML community
of tools, based around an existing repository rather than on a single 
monolithic tool.  I envision a community of diagramming and other small
tools orbiting Zope or ZODB. The hardest part of a UML tool is the
repository, and I think we've mostly got that.

I think that what the Open Source community desperately needs 
are tools for managing and sharing the most important artifacts
in the development process.
</speech>

Jim

--
Jim Fulton           mailto:jim@digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    



From gstein@lyra.org  Fri Dec 24 00:09:29 1999
From: gstein@lyra.org (Greg Stein)
Date: Thu, 23 Dec 1999 16:09:29 -0800 (PST)
Subject: [Python-Dev] re: Open Source design competition / Python /software
 tools
In-Reply-To: <38629FBD.3B8F47D4@digicool.com>
Message-ID: <Pine.LNX.4.10.9912231605030.412-100000@nebula.lyra.org>

On Thu, 23 Dec 1999, Jim Fulton wrote:
> gvwilson@nevex.com wrote:
>...
> >--- I have to prove that this model can solve
> > small problems before I'll be given the funding to tackle larger ones, and
> > I think that a UML modeling tool is definitely "large" :-). 
> 
> Well, since you gave rational ..... :)
> 
> <speech>
> Isn't the Open Source community especially good at large problems?

Very true, I agree, but part of Greg's problem is "proving" that to the
DoE. Somebody has said those four problems are sufficient to do so,
(probably) because they are reasonably constrained to allow completion
within a specified timeframe.

> Note that I'm thinking more in terms of an open source UML community
> of tools, based around an existing repository rather than on a single 
> monolithic tool.  I envision a community of diagramming and other small
> tools orbiting Zope or ZODB. The hardest part of a UML tool is the
> repository, and I think we've mostly got that.

Greg's proposal is quite specific. "A community" isn't, so it might not
help to create a proof to the DoE (otherwise, they could look at the Zope
community, or other communities!).

Jim: there isn't anything stopping or impeding the creation of an Open
Source community for UML modeling. This DoE competition won't affect
that...

Happy Holidays,
-g

-- 
Greg Stein, http://www.lyra.org/


From jim@digicool.com  Fri Dec 24 00:27:53 1999
From: jim@digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 19:27:53 -0500
Subject: [Python-Dev] re: Open Source design competition / Python
 /softwaretools
References: <Pine.LNX.4.10.9912231605030.412-100000@nebula.lyra.org>
Message-ID: <3862BE09.9AF62090@digicool.com>

Greg Stein wrote:
> 
(snip)
> Jim: there isn't anything stopping or impeding the creation of an Open
> Source community for UML modeling.

Of course not.

> This DoE competition won't affect that...

Perhaps it could help it.
 
> Happy Holidays,

You too.

Jim

--
Jim Fulton           mailto:jim@digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    



From ping@lfw.org  Fri Dec 24 08:55:28 1999
From: ping@lfw.org (Ka-Ping Yee)
Date: Fri, 24 Dec 1999 00:55:28 -0800 (PST)
Subject: [Python-Dev] re: Open Source design competition / Python /
 software tools
In-Reply-To: <Pine.LNX.4.10.9912222242070.4839-200000@akbar.nevex.com>
Message-ID: <Pine.LNX.4.10.9912240049360.655-100000@skuld.lfw.org>

On Wed, 22 Dec 1999 gvwilson@nevex.com wrote:
> To kick-start things, we're going to be holding a two-round design
> competition.  Anyone (individual or team, professional or student) can
> submit a short entry for the first round; the judges will pick four
> candidates to go forward in each of four categories, and those
> individuals or teams will be asked to submit full entries. The four
> categories are:
> 
> * an issue tracking system to replace Gnats and Bugzilla;

Hi there.

At ILM we've been using a system that i hacked up quickly in Python
called "Roundup".  It has a number of interesting properties that
have made it really useful to us, and arguably better than any of
the existing open-source bug-tracking things out there that i know
of.  It is not just a Web app; it lives between the Web and e-mail,
because we do so much of our communication that way.

For example, each request item gets its own virtual mailing list,
updated on the fly without the need for explicit subscription (if
you cc: somebody while discussing the bug, they get subscribed).
Empirically i've discovered that unsubscription is actually
unnecessary (!) because conversation will stop on a topic when it
gets resolved or when it ceases to be interesting.  These are
fine-grained discussion lists on a per-topic level.

This is just to let you know i'm interested.  I'm currently asking
for permission to open-source Roundup; if it can't be done, or
doesn't happen quickly enough, i'll just have to take a weekend and
rewrite the thing.  There were a few things i wanted to fix anyway.


-- ?!ng

"You should either succeed gloriously or fail miserably.  Just getting
by is the worst thing you can do."
    -- Larry Smith


From Vladimir.Marangozov@inrialpes.fr  Fri Dec 24 12:07:05 1999
From: Vladimir.Marangozov@inrialpes.fr (Vladimir Marangozov)
Date: Fri, 24 Dec 1999 13:07:05 +0100 (CET)
Subject: [Python-Dev] Exceptions
In-Reply-To: <199912221823.NAA16517@eric.cnri.reston.va.us> from "Guido van Rossum" at Dec 22, 1999 01:23:45 PM
Message-ID: <199912241207.NAA18783@python.inrialpes.fr>

Guido van Rossum wrote:
> 
> Vladimir.Marangozov@inrialpes.fr:
> 
> > Yes. Besides, I still think that string-based exceptions are just
> > convenient for quick & dirty, throw-away test scripts.
> 
> They have a hard-to-understand quirk though: the id() of the string is
> used to check rather than its value, so that except "foo" doesn't
> necessarily catch raise "foo"; but due to various optimizations, this
> usually works, and people get bent out of shape when it doesn't.

Which brings 2 important questions:

1. In the long run, which one is better -- compare and check exceptions by
   reference (by name) or by value?

   (currently, this is done by reference on predefined object types:
    strings, classes or instances)

   I'd say, exceptions have to be compared (caught) by value, i.e. use
   "e1 == e2" instead of "e1 is e2".

2. Should we limit the exception "types"?

   I'd say, no. My Pythonic view of things says that we raise "objects",
   be they classes, instances, strings or, why not, ints.

   However, if one wants to put some order in the "unordered set" of exceptions
   s/he uses, then classes are the way to do it, because classes were given some
   nice properties, like inheritance, that allow one to group and logically organize
   the objects we throw and catch as exceptions (+ other bonus properties coming
   from classes).

   Note that conceptually, when we say "strings and ints", we have in mind
   "string instances and int instances", whose "classes" are written in C.
   Once there are String and Int classes of some sort as first-class objects,
   we'll fall back to the terminology: Exceptions can be classes or instances.

If point 1 and (optionally) point 2 are implemented, the hard-to-understand quirk
wouldn't be an issue and string-based exceptions would have a legitimate reason to
stay and live.
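
A minimal sketch of the distinction in question 1, using plain strings
rather than real raise/except statements. The `catches` helper is
hypothetical, and the fact that `"".join(...)` yields a distinct string
object is a CPython implementation detail assumed here for illustration.

```python
# Two strings with the same value but (in CPython) different identities:
# joining several pieces builds a new object that is not interned.
raised = "foo"
handler = "".join(["f", "oo"])


def catches(handler_obj, raised_obj, by_value):
    """Hypothetical matching rule for an except clause.

    by_value=True  compares with ==  (the proposal in question 1)
    by_value=False compares with is  (the id()-based behaviour)
    """
    if by_value:
        return handler_obj == raised_obj
    return handler_obj is raised_obj


matched_by_value = catches(handler, raised, by_value=True)
matched_by_identity = catches(handler, raised, by_value=False)
```

With the by-value rule the handler matches; with the identity rule it
only matches when both names happen to refer to the same string object,
which is the quirk being discussed.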

> Since you have to give your exception a name, how hard is it to say
> 
> class MyError(Exception): pass
> 
> rather than
> 
> MyError = "MyError"
> 
> ?

You know what I think about "names"...  I may have defined my exception conventions
and be interested in catching an exception named 404, implying that "a 404 bobo"
occurred deeply in my code ("deeply in my code" meaning for example: database 4,
service 0, customer group 4, or just a standard HTTP "Code 404 - Not Found".)
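
For comparison only, here is how a "404" convention might look in the
class-based style, using inheritance for the grouping mentioned under
question 2. The class names `AppError` and `NotFound` and the `code`
attribute are hypothetical; this is not what is being proposed above,
which is to raise the bare value itself.

```python
class AppError(Exception):
    """Hypothetical base class grouping all application errors."""

    def __init__(self, code):
        Exception.__init__(self, code)
        self.code = code


class NotFound(AppError):
    """A '404' in class form; caught by handlers for AppError too."""

    def __init__(self):
        AppError.__init__(self, 404)


def code_of(exc_class):
    """Raise the given exception class and report the code that was caught."""
    try:
        raise exc_class()
    except AppError as err:   # inheritance makes this catch NotFound as well
        return err.code


caught_code = code_of(NotFound)
```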

I'm pushing this to the extreme to catapult your thoughts into the next
millennium :) and to emphasize the importance of discussing and answering
questions 1) and 2) above objectively.

-- 
       Vladimir MARANGOZOV          | Vladimir.Marangozov@inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252


From mal@lemburg.com  Fri Dec 24 11:03:37 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 24 Dec 1999 12:03:37 +0100
Subject: [Python-Dev] str(1L) -> '1' ?
References: <38623493.E6BA6D6F@digicool.com> <199912231446.JAA22086@eric.cnri.reston.va.us>
Message-ID: <38635309.2AEFF18D@lemburg.com>

Guido van Rossum wrote:
> 
> [Jim F]
> > In November there was an interesting discussion on comp.lang.python
> > about the meaning of __str__ and __repr__.  One tidbit that came out
> > of this discussion was that __str__ for longs should drop the trailing
> > 'L'. Was there a decision on this? I'd really like this to happen.
> 
> Yes, I'd like it to happen.  I'd also like repr() of a float to return
> the full precision (using the "%.17g" sprintf format).

While we're at it: how about adding a PyLong_AsString() API
to the C interface ? I currently use PyObject_Str() in mxODBC
and then slice off the 'L' -- not very elegant. A PyLong_AsString()
API would much better suit the task.

Merry Christmas,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                     7 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Fri Dec 24 11:11:29 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 24 Dec 1999 12:11:29 +0100
Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types)
References: <19991223163850.15619.qmail@web604.mail.yahoo.com> <199912231642.LAA22598@eric.cnri.reston.va.us>
Message-ID: <386354E1.DA560F42@lemburg.com>

Guido van Rossum wrote:
> 
> > > OK.  Let me rephrase it.  Say we form a consensus on 'the right
> > > way'.  Are you amenable to some solution which goes back before
> > > 1970 and after 2038 going into the standard library?
> 
> No problem.
> 
> > > And does your answer change if it involves some
> > > compiled code as well?
> 
> I'd rather not.

As far as mxDateTime goes, I'd rather not see it in the core
distribution. Including the mx stuff in a separate PythonPowerTools
distribution would be cool though. For a start in this direction
see e.g.:

     http://startship.skyport.net/~lemburg/PPowerTools-0.2.zip

Note that I'll wrap all my mx extensions into a new mx package
which will come in several flavours next year. There will no
longer be separate packages due to the various naming
collisions and to enable intra-mx-package dependencies.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                     7 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From andy@robanal.demon.co.uk  Fri Dec 24 12:22:29 1999
From: andy@robanal.demon.co.uk (=?iso-8859-1?q?Andy=20Robinson?=)
Date: Fri, 24 Dec 1999 04:22:29 -0800 (PST)
Subject: [Python-Dev] Fixed Decimal types
Message-ID: <19991224122229.23506.qmail@web606.mail.yahoo.com>

> >> However, unlike RPG, we should probably ensure 
> >> that attempts to overflow or underflow the scale 
> >> result in NaN or Overflow conditions, rather
> >> than assuming the user is right and losing 
> >> the significant digits.
>  
> > Since this would be based on infinite-precision
> numbers, I don't
> > think that this would be an issue.


Three very general observations before I disappear for
Christmas:

(1) I think there is great mileage in combining the
fixed-decimal concept with Martin Fowler's Quantity
pattern, so that a variable could be defined as not
just two decimal places but also (say) "GBP" or "USD",
and it would be an error to add the two.  Same applies
for adding metres, kilograms and other quantities. 
There has also been discussion that the 'type' of a
quantity should determine what math should apply.

(2) If Python is going to be used increasingly in
eCommerce, it should be good at dealing with money -
maybe not in the core language, but we should aim for
one standard package.  

(3) We have a python-finance list
(python-finance@egroups.com), recently generalized to
cover business systems, which is a good place to
discuss this if anyone wants to.  There are people
there who have time, would love to prototype something
(indeed some work started in this area 3 months back),
and would use it at work too.  This would be an ideal
first target for that group - or indeed for a
finance-sig.  I'll pursue this in the New Year.
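
As a rough illustration of point (1), a quantity that carries both a
two-decimal amount and a currency tag might look like the sketch below.
The `Money` class is hypothetical, and it leans on the `decimal` module
from later Python releases as a stand-in for the fixed-decimal type under
discussion, which is an assumption made purely for the example.

```python
from decimal import Decimal


class Money:
    """Hypothetical quantity: a two-decimal amount tagged with a currency."""

    def __init__(self, amount, currency):
        # Keep exactly two decimal places, as in the example in point (1).
        self.amount = Decimal(amount).quantize(Decimal("0.01"))
        self.currency = currency

    def __add__(self, other):
        if not isinstance(other, Money):
            return NotImplemented
        if other.currency != self.currency:
            # Adding GBP to USD is an error, not a silent conversion.
            raise TypeError("cannot add %s to %s" % (other.currency, self.currency))
        return Money(self.amount + other.amount, self.currency)

    def __repr__(self):
        return "Money(%r, %r)" % (str(self.amount), self.currency)


total = Money("1.50", "GBP") + Money("2.25", "GBP")
```

The same shape would apply to metres or kilograms, with the unit taking
the place of the currency code.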

Merry Christmas,

Andy

=====
Andy Robinson
Robinson Analytics Ltd.
------------------
My opinions are the official policy of Robinson Analytics Ltd.
They just vary from day to day.



From jack@oratrix.nl  Fri Dec 24 12:34:28 1999
From: jack@oratrix.nl (Jack Jansen)
Date: Fri, 24 Dec 1999 13:34:28 +0100
Subject: [Python-Dev] Fixed Decimal types
In-Reply-To: Message by =?iso-8859-1?q?Andy=20Robinson?=
 <captainrobbo@yahoo.com> ,
 Fri, 24 Dec 1999 04:22:29 -0800 (PST) , <19991224122229.23506.qmail@web606.mail.yahoo.com>
Message-ID: <19991224123428.5BA9F370CF2@snelboot.oratrix.nl>

> (1) I think there is great mileage in combining the
> fixed-decimal concept with Martin Fowler's Quantity
> pattern, so that a variable could be defined as not
> just two decimal places but also (say) "GBP" or "USD",
> and it would be an error to add the two.  Same applies
> for adding metres, kilograms and other quantities. 
> There has also been discussion that the 'type' of a
> quantity should determine what math should apply.

Isn't this something that is ideally suited for implementation in a Python 
module, based on a core implementation of fixed decimal numbers?
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From gstein@lyra.org  Fri Dec 24 20:05:22 1999
From: gstein@lyra.org (Greg Stein)
Date: Fri, 24 Dec 1999 12:05:22 -0800 (PST)
Subject: [Python-Dev] Fixed Decimal types
In-Reply-To: <19991224123428.5BA9F370CF2@snelboot.oratrix.nl>
Message-ID: <Pine.LNX.4.10.9912241204290.412-100000@nebula.lyra.org>

On Fri, 24 Dec 1999, Jack Jansen wrote:
> > (1) I think there is great mileage in combining the
> > fixed-decimal concept with Martin Fowler's Quantity
> > pattern, so that a variable could be defined as not
> > just two decimal places but also (say) "GBP" or "USD",
> > and it would be an error to add the two.  Same applies
> > for adding metres, kilograms and other quantities. 
> > There has also been discussion that the 'type' of a
> > quantity should determine what math should apply.
> 
> Isn't this something that is ideally suited for implementation in a Python 
> module, based on a core implementation of fixed decimal numbers?

I'd agree with Jack here.

The "simple" change of a scale for the Long values is nice. Starting to
lump in features like this begins to get a little messier...
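
To make the "simple" part concrete, here is a minimal sketch that reads
"a scale for the Long values" as an integer paired with a count of
decimal places. That reading, the `Fixed` class and its method names are
assumptions; the actual proposal is not quoted in this message.

```python
class Fixed:
    """Hypothetical fixed-decimal value: unscaled integer plus a scale.

    Fixed(150, 2) stands for 1.50; Fixed(25, 1) stands for 2.5.
    """

    def __init__(self, unscaled, scale):
        self.unscaled = unscaled
        self.scale = scale

    def __add__(self, other):
        # Bring both operands to the larger scale, then add the integers.
        scale = max(self.scale, other.scale)
        a = self.unscaled * 10 ** (scale - self.scale)
        b = other.unscaled * 10 ** (scale - other.scale)
        return Fixed(a + b, scale)

    def __str__(self):
        sign = "-" if self.unscaled < 0 else ""
        digits = str(abs(self.unscaled)).rjust(self.scale + 1, "0")
        if self.scale == 0:
            return sign + digits
        return sign + digits[:-self.scale] + "." + digits[-self.scale:]


fixed_sum = Fixed(150, 2) + Fixed(25, 1)
```

Units, currencies and type-dependent arithmetic would then sit in a
separate Python module on top of something this small.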

Happy Holidays,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein@lyra.org  Fri Dec 24 20:13:50 1999
From: gstein@lyra.org (Greg Stein)
Date: Fri, 24 Dec 1999 12:13:50 -0800 (PST)
Subject: [Python-Dev] str(1L) -> '1' ?
In-Reply-To: <38635309.2AEFF18D@lemburg.com>
Message-ID: <Pine.LNX.4.10.9912241211460.412-100000@nebula.lyra.org>

On Fri, 24 Dec 1999, M.-A. Lemburg wrote:
> Guido van Rossum wrote:
> > [Jim F]
> > > In November there was an interesting discussion on comp.lang.python
> > > about the meaning of __str__ and __repr__.  One tidbit that came out
> > > of this discussion was that __str__ for longs should drop the trailing
> > > 'L'. Was there a decision on this? I'd really like this to happen.
> > 
> > Yes, I'd like it to happen.  I'd also like repr() of a float to return
> > the full precision (using the "%.17g" sprintf format).
> 
> While we're at it: how about adding a PyLong_AsString() API
> to the C interface ? I currently use PyObject_Str() in mxODBC
> and then slice off the 'L' -- not very elegant. A PyLong_AsString()
> API would much better suit the task.

Fred just checked in a change yesterday. PyObject_Str() on a Long no
longer includes the 'L'.

You're going to need to update your code :-)
[ I've got some here and there to fix, too, with the idiom:
     if type(v) is type(1L): return str(v)[:-1]
  ]

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From mal@lemburg.com  Sun Dec 26 22:29:28 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 26 Dec 1999 23:29:28 +0100
Subject: [Python-Dev] str(1L) -> '1' ?
References: <Pine.LNX.4.10.9912241211460.412-100000@nebula.lyra.org>
Message-ID: <386696C8.6EBBF428@lemburg.com>

Greg Stein wrote:
> 
> On Fri, 24 Dec 1999, M.-A. Lemburg wrote:
> > While we're at it: how about adding a PyLong_AsString() API
> > to the C interface ? I currently use PyObject_Str() in mxODBC
> > and then slice off the 'L' -- not very elegant. A PyLong_AsString()
> > API would much better suit the task.
> 
> Fred just checked in a change yesterday. PyObject_Str() on a Long no
> longer includes the 'L'.

Ah, ok... scanning the patches: they don't provide an externed
C interface... I would like to have such a beast if possible
(basically, the new long_format() as PyLong_AsString()).

> You're going to need to update your code :-)
> [ I've got some here and there to fix, too, with the idiom:
>      if type(v) is type(1L): return str(v)[:-1]
>   ]

Your above example will effectively divide the long value by 10
which will probably break things in very subtle ways... hmm, this
change ought to be made *very* visible to people upgrading to
1.6, IMHO.

I'll fix mxODBC to only truncate the string value iff
the 'L' is present.
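
The difference between the two idioms can be sketched on plain strings.
The helper name `strip_long_suffix` is hypothetical and this is not the
mxODBC code, only an illustration of the guarded slice.

```python
def strip_long_suffix(text):
    """Drop a trailing 'L' if there is one; otherwise return text unchanged."""
    if text.endswith("L"):
        return text[:-1]
    return text


old_style = strip_long_suffix("1234L")   # str() output that still has the 'L'
new_style = strip_long_suffix("1234")    # str() output after the change

# The unconditional slice silently drops the last digit instead:
unguarded = "1234"[:-1]
```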

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                     5 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From andy@robanal.demon.co.uk  Mon Dec 27 10:43:17 1999
From: andy@robanal.demon.co.uk (Andy Robinson)
Date: Mon, 27 Dec 1999 10:43:17 GMT
Subject: [Python-Dev] Fixed Decimal types
In-Reply-To: <Pine.LNX.4.10.9912241204290.412-100000@nebula.lyra.org>
References: <Pine.LNX.4.10.9912241204290.412-100000@nebula.lyra.org>
Message-ID: <38674259.5377973@post.demon.co.uk>

On Fri, 24 Dec 1999 12:05:22 -0800 (PST), you wrote:

>On Fri, 24 Dec 1999, Jack Jansen wrote:
>> > (1) I think there is great mileage in combining the
>> > fixed-decimal concept with Martin Fowler's Quantity
>> > pattern, so that a variable could be defined as not
>> > just two decimal places but also (say) "GBP" or "USD",
>> > and it would be an error to add the two.  Same applies
>> > for adding metres, kilograms and other quantities.
>> > There has also been discussion that the 'type' of a
>> > quantity should determine what math should apply.
>>
>> Isn't this something that is ideally suited for implementation in a Python
>> module, based on a core implementation of fixed decimal numbers?
>
>I'd agree with Jack here.
>
Me too - I thought I said that in point 2, but in retrospect I didn't
say it clearly enough :-)


- Andy


From gstein@lyra.org  Mon Dec 27 11:31:29 1999
From: gstein@lyra.org (Greg Stein)
Date: Mon, 27 Dec 1999 03:31:29 -0800 (PST)
Subject: [Python-Dev] str(1L) -> '1' ?
In-Reply-To: <386696C8.6EBBF428@lemburg.com>
Message-ID: <Pine.LNX.4.10.9912270330180.412-100000@nebula.lyra.org>

On Sun, 26 Dec 1999, M.-A. Lemburg wrote:
> Greg Stein wrote:
> > On Fri, 24 Dec 1999, M.-A. Lemburg wrote:
> > > While we're at it: how about adding a PyLong_AsString() API
> > > to the C interface ? I currently use PyObject_Str() in mxODBC
> > > and then slice off the 'L' -- not very elegant. A PyLong_AsString()
> > > API would much better suit the task.
> > 
> > Fred just checked in a change yesterday. PyObject_Str() on a Long no
> > longer includes the 'L'.
> 
> Ah, ok... scanning the patches: they don't provide an externed
> C interface... I would like to have such a beast if possible
> (basically, the new long_format() as PyLong_AsString()).

What's wrong with PyObject_Str()? I don't see a need for Yet Another Entry
Point.

> > You're going to need to update your code :-)
> > [ I've got some here and there to fix, too, with the idiom:
> >      if type(v) is type(1L): return str(v)[:-1]
> >   ]
> 
> Your above example will effectively divide the long value by 10
> which will probably break things in very subtle ways... hmm, this

Yah :-(  Not a lot of fun, but I think for the best.

> change ought to be made *very* visible to people upgrading to
> 1.6, IMHO.

Yes.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From mal@lemburg.com  Mon Dec 27 12:51:36 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 27 Dec 1999 13:51:36 +0100
Subject: [Python-Dev] str(1L) -> '1' ?
References: <Pine.LNX.4.10.9912270330180.412-100000@nebula.lyra.org>
Message-ID: <386760D8.E897FADF@lemburg.com>

Greg Stein wrote:
> 
> On Sun, 26 Dec 1999, M.-A. Lemburg wrote:
> > Greg Stein wrote:
> > > On Fri, 24 Dec 1999, M.-A. Lemburg wrote:
> > > > While we're at it: how about adding a PyLong_AsString() API
> > > > to the C interface ? I currently use PyObject_Str() in mxODBC
> > > > and then slice off the 'L' -- not very elegant. A PyLong_AsString()
> > > > API would much better suit the task.
> > >
> > > Fred just checked in a change yesterday. PyObject_Str() on a Long no
> > > longer includes the 'L'.
> >
> > Ah, ok... scanning the patches: they don't provide an externed
> > C interface... I would like to have such a beast if possible
> > (basically, the new long_format() as PyLong_AsString()).
> 
> What's wrong with PyObject_Str()? I don't see a need for Yet Another Entry
> Point.

What's wrong with a rich C API :-) ?

The long_format function would be very useful for programs
interacting with other software at C level. Making it
external would give the programmer the ability to pass
long string representations in any base to other programs,
which is very useful for e.g. database interaction or
crypto software.
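
What "long string representations in any base" amounts to can be sketched
in Python as below. This is an illustration of the conversion only; the
function name `format_in_base` and its limits (bases 2 to 36, lowercase
digits) are assumptions and say nothing about how long_format() is
written in C.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def format_in_base(value, base):
    """Return the digits of an integer in the given base (2..36)."""
    if not 2 <= base <= 36:
        raise ValueError("base must be between 2 and 36")
    if value == 0:
        return "0"
    sign = "-" if value < 0 else ""
    value = abs(value)
    out = []
    while value:
        # Peel off the least significant digit each time round.
        value, remainder = divmod(value, base)
        out.append(DIGITS[remainder])
    return sign + "".join(reversed(out))


as_hex = format_in_base(255, 16)
as_binary = format_in_base(-10, 2)
```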

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                     4 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From bkc@murkworks.com  Mon Dec 27 22:04:25 1999
From: bkc@murkworks.com (Brad Clements)
Date: Mon, 27 Dec 1999 17:04:25 -0500
Subject: [Python-Dev] Re: [PSA MEMBERS] Re: Please test new dynamic load behavior
In-Reply-To: <Pine.LNX.4.10.9912231022280.16305-100000@nebula.lyra.org>
References: <38620B04.7CC64485@trema.com>
Message-ID: <199912272204.RAA26173@anvil.murkworks.com>

On 23 Dec 99, at 10:26, Greg Stein wrote:

> > > I reorganized Python's dynamic load/import code over the past few days.
> > > Guido provided some feedback, I did some more mods, and now it is checked
> > > into CVS. The new loading behavior has been tested on Linux, IRIX, and
> > > Solaris (and probably Windows by now).


FYI, I downloaded the import stuff from CVS and used it in my port of 
Python to NetWare. Good timing, as I was just tackling dynamic 
loading on NetWare when I saw your message.

The new scheme is much better, and works for me.

Though I do need to add some special "un-import" code similar to what 
BEOS does. 


Brad Clements,                bkc@murkworks.com   (315)268-1000
http://www.murkworks.com                          (315)268-9812 Fax
netmeeting: ils://ils.murkworks.com               AOL-IM: BKClements


From skip@mojam.com (Skip Montanaro)  Tue Dec 28 21:41:33 1999
From: skip@mojam.com (Skip Montanaro)
Date: Tue, 28 Dec 1999 15:41:33 -0600
Subject: [Python-Dev] Better text processing support in py2k?
Message-ID: <199912282141.PAA31426@dolphin.mojam.com>

It just occurred to me as I was replying to a request on the main list, that
Python's text handling capabilities could be a bit better than they are.
This will probably not come as a revelation to many of you, but I finally
put it together with the standard argument against beefing things up:

    One fix would be to add regular expressions to the language core and
    have special syntax for them, as Perl has done. However, I don't like
    this solution because Python is a general-purpose language, and regular
    expressions are used for the single application domain of text
    processing. For other application domains, regular expressions may be of
    no interest, and you might want to remove them to save memory and code
    size.

and the observation that Python does support some builtin objects and syntax
that are fairly specific to some much more restricted application domains
than text processing.

I stole the above quote from Andrew Kuchling's Python Warts page, which I
also happened to read earlier today.

What AMK says makes perfect sense until you examine some of the other things
that are in the language, like the Ellipsis object and complex numbers.  If
I recall correctly both were added as a result of the NumPy package
development.

I have nothing against ellipses or complex numbers.  They are fine first
class objects that should remain in the language. But I have never used
either one in my day-to-day work.  On the other hand, I read files and
manipulate them with regular expressions all the time.  I rather suspect
that more people use Python for some sort of text processing than any other
single application domain.  Python should be good at it.

While I don't want to turn Python into Perl, I would like to see it do a
better job of what most people probably use the language for.  Here is a
very short list of things I think need attention:

    1. When using something like the simple file i/o idiom

       for line in f.readlines():
           dofunstuff(line)

       the programmer should not have to care how big the file is.  It
       should just work in a reasonably efficient manner without gobbling up
       all of memory.  I realize this may require some change to the syntax
       of the common idiom.

    2. The re module needs to be sped up, if not to catch up with Perl, then
       to catch up with the deprecated regex module.  Depending how far
       people want to go with things, adding some language syntax to support
       regular expressions might be in order.  I don't see that as
       compelling as adding complex numbers however.  Another possibility,
       now that Barry Warsaw has opened the floodgates, is to add regular
       expression methods to strings.

    3. I've not yet used it, but I am told the pattern matching in
       Marc-Andre Lemburg's mxTextTools
       (http://starship.python.net/crew/lemburg/) is both powerful and
       efficient (though it certainly appears complex).  Perhaps it deserves
       consideration for incorporation into the core Python distribution.

I'm sure other people will come up with other suggestions.

Skip Montanaro | http://www.mojam.com/
skip@mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...


From akuchlin@mems-exchange.org  Tue Dec 28 22:00:11 1999
From: akuchlin@mems-exchange.org (Andrew M. Kuchling)
Date: Tue, 28 Dec 1999 17:00:11 -0500 (EST)
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <199912282141.PAA31426@dolphin.mojam.com>
References: <199912282141.PAA31426@dolphin.mojam.com>
Message-ID: <14441.13035.802146.730160@amarok.cnri.reston.va.us>

Skip Montanaro writes:
>What AMK says makes perfect sense until you examine some of the other things
>that are in the language, like the Ellipsis object and complex numbers.  If
>I recall correctly both were added as a result of the NumPy package
>development.

True, but note that you can compile Python with WITHOUT_COMPLEX
defined to remove complex numbers.

>    1. When using something like the simple file i/o idiom
>       for line in f.readlines():
>           dofunstuff(line)
>       the programmer should not have to care how big the file is.

What about 'for line in fileinput.input()', which already exists?
(Hmmm... if you have an already open file object, I don't think you
can pass it to fileinput.input(); maybe that should be fixed.)
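
A small sketch of the fileinput idiom, written against the module as it
exists in later Python releases (the `files=` keyword and use as a
context manager are assumed to be available). The helper names and the
throwaway temporary file are there only to make the example
self-contained. Note that it passes file names, not an open file object.

```python
import fileinput
import os
import tempfile


def count_lines(paths):
    """Walk the named files one line at a time and count the lines."""
    total = 0
    with fileinput.input(files=paths) as stream:
        for line in stream:
            total += 1
    return total


def demo():
    # Write a small temporary file, count its lines, then clean up.
    fd, path = tempfile.mkstemp(text=True)
    try:
        with os.fdopen(fd, "w") as out:
            out.write("one\ntwo\nthree\n")
        return count_lines([path])
    finally:
        os.remove(path)


demo_total = demo()
```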

On a vaguely related note, since there are many things like parser
generators and XML stuff and mxTextTools, I've been speculating about
a text processing topic guide.  If you know of Python packages related
to text processing, please send me a private e-mail with a link.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
Constraints often boost creativity.
    -- Jim Hugunin, 11 Feb 1999


From skip@mojam.com (Skip Montanaro)  Tue Dec 28 22:26:53 1999
From: skip@mojam.com (Skip Montanaro)
Date: Tue, 28 Dec 1999 16:26:53 -0600 (CST)
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <14441.13035.802146.730160@amarok.cnri.reston.va.us>
References: <199912282141.PAA31426@dolphin.mojam.com>
 <14441.13035.802146.730160@amarok.cnri.reston.va.us>
Message-ID: <14441.14637.682862.999776@dolphin.mojam.com>

    Andrew> True, but note that you can compile Python with WITHOUT_COMPLEX
    Andrew> defined to remove complex numbers.

That's true, but that wasn't my point.  I'm not arguing for or against space
efficiency, just that the rather timeworn argument about not doing
anything special to support text processing because Python is a general
purpose language is a red herring.

    >> 1. When using something like the simple file i/o idiom
    >> for line in f.readlines():
    >>   dofunstuff(line)
    >> the programmer should not have to care how big the file is.

    Andrew> What about 'for line in fileinput.input()', which already
    Andrew> exists?  (Hmmm... if you have an already open file object, I
    Andrew> don't think you can pass it to fileinput.input(); maybe that
    Andrew> should be fixed.)

Well, a couple reasons jump to mind:

   1. fileinput.FileInput isn't particularly efficient.  At its heart, its
      __getitem__ method makes a simple readline() call instead of buffering
      some amount of readlines(sizehint) bytes.  This can be fixed, but I'm
      not sure what would happen to its semantics.

   2. As you pointed out, it's not all that general.

My point, not at all well stated, is that the programmer shouldn't have to
worry (much?) about the conditions under which he does file i/o.   Right
now, if I know the file is small(ish), I can do

    for line in f.readlines():
        dofunstuff(line)

but I have to know that the file won't be big, because readlines() will
behave badly (perhaps even generate a MemoryError exception) if the file is
large.  In that case, I have to fall back to the safer (and slower)

    line = f.readline()
    while line:
        dofunstuff(line)
        line = f.readline()

or the more efficient, but more cumbersome

    lines = f.readlines(sizehint)
    while lines:
        for line in lines:
            dofunstuff(line)
        lines = f.readlines(sizehint)

That's three separate idioms the programmer has to be aware of when writing
code to read a text file based upon the perceived need for speed, memory
usage and desired clarity:

    fast/memory-intensive/clear
    slow/memory-conserving/not-as-clear
    fast/memory-conserving/fairly-muddy

Any particular reason that the readline method can't return an iterator that
supports __getitem__ and buffers input?  (Again, remember this is for py2k,
so the potential breakage such a change might cause is a consideration, but
not a showstopper.)
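
One possible shape for such an object, as a sketch only: a wrapper that
supports __getitem__ for sequential access and refills itself with
readlines(sizehint). The class name `BufferedLines` and the default
sizehint are assumptions, and nothing here is part of the file object
itself.

```python
import io


class BufferedLines:
    """Hypothetical wrapper: for-loop friendly, buffered, memory-conserving."""

    def __init__(self, f, sizehint=8192):
        self.f = f
        self.sizehint = sizehint
        self.buffer = []     # lines from the most recent readlines() call
        self.pos = 0         # position of the next line within the buffer
        self.next_index = 0  # index the for loop is expected to ask for

    def __getitem__(self, index):
        if index != self.next_index:
            raise RuntimeError("lines can only be read in order")
        if self.pos >= len(self.buffer):
            # Buffer exhausted: fetch roughly sizehint more bytes of lines.
            self.buffer = self.f.readlines(self.sizehint)
            self.pos = 0
            if not self.buffer:
                raise IndexError("end of file")  # ends the for loop
        line = self.buffer[self.pos]
        self.pos += 1
        self.next_index += 1
        return line


collected = []
for line in BufferedLines(io.StringIO("a\nb\nc\n"), sizehint=2):
    collected.append(line)
```

The loop reads like the first idiom while doing the work of the third.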

    Andrew> On a vaguely related note, since there are many things like
    Andrew> parser generators and XML stuff and mxTextTools, I've been
    Andrew> speculating about a text processing topic guide.  If you know of
    Andrew> Python packages related to text processing, please send me a
    Andrew> private e-mail with a link.

This sounds like a good idea to me.

Skip Montanaro | http://www.mojam.com/
skip@mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...


From andy@robanal.demon.co.uk  Wed Dec 29 08:34:43 1999
From: andy@robanal.demon.co.uk (=?iso-8859-1?q?Andy=20Robinson?=)
Date: Wed, 29 Dec 1999 00:34:43 -0800 (PST)
Subject: [Python-Dev] Better text processing support in py2k?
Message-ID: <19991229083443.27817.qmail@web6005.mail.yahoo.com>

--- Skip Montanaro <skip@mojam.com> wrote:
>     fast/memory-intensive/clear
>     slow/memory-conserving/not-as-clear
>     fast/memory-conserving/fairly-muddy
> 
> Any particular reason that the readline method can't
> return an iterator that
> supports __getitem__ and buffers input?  (Again,
> remember this is for py2k,
> so the potential breakage such a change might cause
> is a consideration, but
> not a showstopper.)

Why not generalize fileinput to do buffering instead?

More generally, Java has the notion of 'stackable
streams' - e.g. construct a 'BufferedFile' around a
'File', maybe construct a 'Line-oriented file' around
that etc.  Each one takes a file-like object as an
argument to the constructor.  Things you might want to
do:
- buffering
- international encoding conversions
- line delimiters other than CR/LF/CRLF
- read/write Python objects (i.e. use pickle/marshal)
- easy interfaces to parsers

This took me a couple of hours to get used to (and at
the time I thought 'Yuk!' when I first saw four
nested constructors), but gives you very precise
control and a lot of versatility when handling files. 
It's an idiom Python does not use much but maybe it
should.

I'd argue that maybe some enhancements to fileinput.py
- adding some streams to provide building blocks for
these operations - would get us the power you want and
a lot more versatility besides.
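
The stacking idiom can be sketched in Python as a pair of small wrappers,
each taking a file-like object in its constructor. The class names
`DecodingReader` and `StrippingReader` are hypothetical; they are not
part of fileinput or of any existing module.

```python
import io


class DecodingReader:
    """Wraps a byte stream and yields decoded text lines."""

    def __init__(self, stream, encoding):
        self.stream = stream
        self.encoding = encoding

    def __iter__(self):
        for raw_line in self.stream:
            yield raw_line.decode(self.encoding)


class StrippingReader:
    """Wraps a text-line stream and removes the line terminators."""

    def __init__(self, stream):
        self.stream = stream

    def __iter__(self):
        for line in self.stream:
            yield line.rstrip("\r\n")


# Nested constructors, innermost first: bytes, then text, then clean lines.
raw = io.BytesIO(b"caf\xe9\r\nna\xefve\n")
stacked_lines = list(StrippingReader(DecodingReader(raw, "latin-1")))
```

Each layer knows about only one concern, so buffering, other delimiters
or unpickling could be added as further wrappers in the same way.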


=====
Andy Robinson
Robinson Analytics Ltd.
------------------
My opinions are the official policy of Robinson Analytics Ltd.
They just vary from day to day.



From mal@lemburg.com  Wed Dec 29 16:55:21 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 29 Dec 1999 17:55:21 +0100
Subject: [Python-Dev] Better text processing support in py2k?
References: <19991229083443.27817.qmail@web6005.mail.yahoo.com>
Message-ID: <386A3CF9.8AF0EA60@lemburg.com>

Andy Robinson wrote:
> 
> --- Skip Montanaro <skip@mojam.com> wrote:
> >     fast/memory-intensive/clear
> >     slow/memory-conserving/not-as-clear
> >     fast/memory-conserving/fairly-muddy
> >
> > Any particular reason that the readline method can't
> > return an iterator that
> > supports __getitem__ and buffers input?  (Again,
> > remember this is for py2k,
> > so the potential breakage such a change might cause
> > is a consideration, but
> > not a showstopper.)
> 
> Why not generalize fileinput to do buffering instead?
> 
> More generally, Java has the notion of 'stackable
> streams' - e.g. construct a 'BufferedFile' around a
> 'File', maybe construct a 'Line-oriented file' around
> that etc.  Each one takes a file-like object as an
> argument to the constructor.  Things you might want to
> do:
> - buffering
> - international encoding conversions
> - line delimiters other than CR/LF/CRLF
> - read/write Python objects (i.e. use pickle/marshal)
> - easy interfaces to parsers

If all goes well we'll have something like this
in Python 1.6, at least for the encoding/decoding
part of file reading and writing. You basically take
a file object and then wrap some StreamCodecs around
it to get the functionality you need. Very simple
and very intuitive.
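
As a sketch of the wrapping step, the example below uses the codecs
module as it exists in later Python releases (codecs.getreader and
codecs.getwriter). Treating that as the interface meant here is an
assumption, since the design was still open when this was written.

```python
import codecs
import io

# Reading: wrap a byte stream so that read() hands back decoded text.
raw_in = io.BytesIO("gr\u00fc\u00df dich\n".encode("utf-8"))
reader = codecs.getreader("utf-8")(raw_in)
decoded_text = reader.read()

# Writing: wrap a byte stream so that write() accepts text and encodes it.
raw_out = io.BytesIO()
writer = codecs.getwriter("latin-1")(raw_out)
writer.write("caf\u00e9")
encoded_bytes = raw_out.getvalue()
```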

> This took me a couple of hours to get used to (and at
> the time I thought 'Yuk!' when I first saw four
> nested constructors), but gives you very precise
> control and a lot of versatility when handling files.
> It's an idiom Python does not use much but maybe it
> should.
> 
> I'd argue that maybe some enhancements to fileinput.py
> - adding some streams to provide building blocks for
> these operations - would get us the power you want and
> a lot more versatility besides.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                            Get ready to party !
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From bckfnn@pipmail.dknet.dk  Wed Dec 29 18:51:52 1999
From: bckfnn@pipmail.dknet.dk (Finn Bock)
Date: Wed, 29 Dec 1999 18:51:52 GMT
Subject: [Python-Dev] zipfile.py
In-Reply-To: <3857B97E.3684224F@interet.com>
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com>
Message-ID: <386a582d.6762574@pipmail.dknet.dk>

James C. Ahlstrom wrote:

>  ftp://ftp.interet.com/pub/pylib.html

I feel that it smells a bit too much like a tool and too little like a general
programming API.

- It can only add disk files. The ability to write data to a zip entry through 
  a file-like object or from a string would make it more like an API, IMHO
-  Some kind of access to the TOC entry fields (date, size, compressed
  size etc) also seems like a nice feature.
- The data for an entry must be available in memory. Could be a problem 
  for huge files, but most likely not in practical use.

I admit that I am fond of the api from java.util.zip.ZipFile and
java.util.zip.ZipOutputStream.

Regards,
Finn Bock


From tim_one@email.msn.com  Thu Dec 30 06:08:58 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 30 Dec 1999 01:08:58 -0500
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <199912282141.PAA31426@dolphin.mojam.com>
Message-ID: <000001bf528c$5cbdb9a0$a02d153f@tim>

[Skip Montanaro, wants nicer text facilities]
> ...
> I rather suspect that more people use Python for some sort of
> text processing than any other single application domain.

Hmm.  You're probably right, but I'm an exception.

> Python should be good at it.

And I guess I'm an exception mostly *because* Perl is better at easy text
crunching and Icon is better at hard text-crunching -- that is, I use the
right tool for the job <wink>.

> While I don't want to turn Python into Perl, I would like to see
> it do a better job of what most people probably use the language
> for.  Here is a very short list of things I think need attention:
>
>     1. [*A* clear way to do memory- and time-efficient textfile
>         input]

I agree, but unsure how to fix it.  The best way to write this now is

    # f is some open file object.
    while 1:
        lines = f.readlines(BUFSIZE)
        if not lines:
            break
        for line in lines:
            process(line)

and it's not something anyone figures out on their own -- or enjoys typing
or explaining afterwards.
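
For what it's worth, the loop can be hidden behind a helper.  The sketch
below assumes a Python with generators (which 1.5.2 does not have);
chunked_lines and its bufsize default are hypothetical names, and
io.StringIO stands in for an open file.

```python
import io

def chunked_lines(f, bufsize=64 * 1024):
    """Yield lines from f, reading roughly bufsize bytes' worth at a time."""
    while True:
        # readlines(hint) returns whole lines totalling about `hint` bytes,
        # or an empty list at end of file.
        lines = f.readlines(bufsize)
        if not lines:
            break
        for line in lines:
            yield line

f = io.StringIO("alpha\nbeta\ngamma\n")
collected = [line.rstrip("\n") for line in chunked_lines(f, bufsize=8)]
```

The caller then writes a plain "for line in chunked_lines(f)" and never
sees the buffering.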

Perl gets its line-at-a-time speed by peeking and poking C FILE structs
directly in compiler- and platform-specific ways -- ways that vendors
*should* have done in their own fgets implementations, but almost never do.
I have no idea whether it works well with Perl's nascent notions of
threading, but in the absence of that "the system" doesn't know Perl is
cheating (i.e., as far as libc+friends are concerned, Perl *is* reading one
line at a time -- even mixing in C-level ungetc calls works (well, sometimes
<0.1 wink -- they don't always peek and poke enough fields>)).

The Python QIO extension module is much easier to port but less compatible
(it doesn't use stdio, so QIO-opened files don't play well with others) and
slower (although that's likely repairable -- he's got two passes over the
buffer where one hairier pass should suffice).

>     2. The re module needs to be sped up, if not to catch up with
>        Perl, then to catch up with the deprecated regex module.

The irony here is that the re engine is very often unboundedly faster than
the regex engine -- provided you're chewing over large strings.  Some tests
/F ran showed that the length-independent *overhead* of invoking re is about
10x higher than for regex.  Presumably the bulk of that is due to re.py,
i.e. that you get to the re engine via going thru Python layers on your way
in and out, while regex was pure C.

In any case, /F is working on a new engine (for Unicode), and I believe he
has this all well in hand.

> Depending how far people want to go with things, adding some
> language syntax to support regular expressions might be in order.
> ...
>     3. I've not yet used it, but I am told the pattern matching in
>        Marc-Andre Lemburg's mxTextTools
>       (http://starship.python.net/crew/lemburg/)
>        is both powerful and efficient (though it certainly appears
>        complex).  Perhaps it deserves consideration for
>        incorporation into the core Python distribution.

It's not complex, it's complicated -- and *that's* what makes it un-Pythonic
<wink>.  Tony Ibbs has written a friendly wrapper around mxTextTools that
suppresses much of the non-essential complication.  OTOH, if you go into
this with a regexp mindset, it will run much slower than a real regexp
package, because the bulk of the latter is devoted to doing optimization;
mxTextTools is WYSIWYG (it screams if you code to its strengths, but crawls
if you e.g. try to implement naive backtracking).

You should go to the REBOL site and look at the description of REBOL's PARSE
verb in the FAQ ... mumble, mumble ... at

    http://www.rebol.com/faq.html#11550948

Here's an example pulled from that page (this is a REBOL code fragment):

    digit: charset "0123456789"
    expr: [term ["+" | "-"] expr | term]
    term: [factor ["*" | "/"] term | factor]
    factor: [primary "**" factor | primary]
    primary: [value | "(" expr ")"]
    value: [digit value | digit]

    parse "1 + 2 ** 9" expr

There hasn't been a pattern scheme this clean, convenient or powerful since
SNOBOL4.  It exploits REBOL's Forth-like (lack of!) syntax, and
Smalltalk-like penchant for passing around thunks (anonymous closures --
"[...]" in REBOL builds a lexically-scoped entity called "a block", which
can be treated as code (executed) or data (manipulated like a Python list)
at will).

Now the example doesn't show this, but you can freely mix computations into
the middle of the patterns; only *some* of the words in the blocks have
special meaning to PARSE.  The fragment above is already way beyond what can
be accomplished with regexps, but that's just the start of it.  Perl too is
slamming in more & more ways to get user code to interact with its regexp
engine.

So REBOL has a *very* nice approach to this; I believe it's unreasonably
clumsy to mimic in Python primarily because of forward references (note e.g.
that the block attached to "expr" above refers to "term" before the latter
has been bound -- but the stuff inside [...] is just a closure so that
doesn't matter -- it only matters that term gets bound before expr is
*executed*).  I hit a similar snag years ago when trying to mimic SNOBOL4's
approach in Python.

Perl's endless abuse of regexps is making that language more absurd by the
month.

The other major approach to mixing patterns with computation is due to Icon,
another language where a regexp mindset is fatal.  On a whim, I whipped up
the attached, which illustrates a bit of the Icon approach in Pythonic terms
(but without language support for generators, the *heart* of it can't really
be captured).  Here's an example of how this could be used to implement (the
simplest form of) string.split:

def mysplit(str):
    s = Searcher(str)
    white = CharSet(" \t\n")
    result = []
    s.many(white)            # consume initial whitespace
    while s.notmany(white):  # consume non-whitespace
        result.append(s.get_match())
        s.many(white)
    return result

>>> mysplit("   \t Hey,   that's\tpretty\n\n neat!  ")
['Hey,', "that's", 'pretty', 'neat!']
>>>

The primary thing to note is that there's no seam between analyzing the
string and doing computation on the partial results -- "the program is the
pattern".  This is what Icon does to perfection, Perl is moving toward, and
REBOL is arriving at from a different direction.  It's The Future <0.9
wink>.

Without generators it's difficult to work backtracking into the Searcher
class, but, as above, in my experience the backtracking feature of regexps
is rarely *needed*!  For example, at various points "split" wants to suck up
all the whitespace characters, and that's *it* -- the backtracking
possibility in the regexp \s+ is often a bug just waiting for unexpected
*context* to trigger it.  A hairy regexp is pure hell; but what simpler
regexps can do doesn't require all that funky regexp machinery.

BTW, the mxTextTools engine could be used to get blazing implementations of
the primary Searcher methods (it excels at simple analysis).  OTOH, making
lots of calls to analyze short strings is slow.  The only clean solutions to
that are Perl's and Icon's (build everything into one language so the
compiler can optimize stuff away), and REBOL's (make no distinction between
code and data, so that code can be analyzed & optimized at runtime -- and
build the entire implementation around making closures and calls
supernaturally fast).

the-less-you-use-regexps-the-less-you-miss-'em<wink>-ly y'rs  - tim

class CharSet:
    def __init__(self, seq):
        self.seq = seq
        d = {}
        for ch in seq:
            d[ch] = 1
        self.haskey = d.has_key

    def __call__(self, ch):
        return self.haskey(ch)

    def __add__(self, other):
        if isinstance(other, CharSet):
            other = other.seq
        return CharSet(self.seq + other)

def _normalize_index(i, n):
    assert n >= 0
    if i >= 0:
        return min(i, n)
    elif n == 0:
        return 0
    # want smallest q s.t. i + q*n >= 0
    # <->  q*n >= -i
    # <->  q >= -i/n
    # so q = ceiling(-i/n) = -floor(i/n)
    return i - (i/n)*n

class Searcher:
    def __init__(self, str, lo=0, hi=None):
        """Create object to search in str[lo:hi].

        lo defaults to 0.
        hi defaults to len(str).
        len(str) is repeatedly added to negative lo or hi until
        reaching a number >= 0.
        If lo > hi, a uselessly empty slice will be searched.
        The search cursor is initialized to lo.
        """

        self.s = str
        self.lo = _normalize_index(lo, len(str))
        if hi is None:
            self.hi = len(str)
        else:
            self.hi = _normalize_index(hi, len(str))
        if self.lo > self.hi:
            self.hi = self.lo
        self.i = self.lo
        self.lastmatch = None, None

    def any(self, charset, consume=1):
        """Try to match single character in charset.

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i = self.i
        if i < self.hi and charset(self.s[i]):
            if consume:
                self.__consume(i+1)
            return 1
        return 0

    def notany(self, charset, consume=1):
        """Try to match single character not in charset.

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i = self.i
        if i < self.hi and not charset(self.s[i]):
            if consume:
                self.__consume(i+1)
            return 1
        return 0

    def many(self, charset, consume=1):
        """Try to match one or more characters in charset.

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i, n, s = self.i, self.hi, self.s
        j = i
        while j < n and charset(s[j]):
            j = j+1
        if i < j:
            if consume:
                self.__consume(j)
            return 1
        return 0

    def notmany(self, charset, consume=1):
        """Try to match one or more characters not in charset.

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i, n, s = self.i, self.hi, self.s
        j = i
        while j < n and not charset(s[j]):
            j = j+1
        if i < j:
            if consume:
                self.__consume(j)
            return 1
        return 0

    def match(self, str, consume=1):
        """Try to match string "str".

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i = self.i
        j = i + len(str)
        if j <= self.hi and self.s[i:j] == str:
            if consume:
                self.__consume(j)
            return 1
        return 0

    def get_str(self):
        """Return subject string."""
        return self.s

    def get_lo(self):
        """Return low slice bound."""
        return self.lo

    def get_hi(self):
        """Return high slice bound."""
        return self.hi

    def get_pos(self):
        """Return current value of search cursor."""
        return self.i

    def get_match_indices(self):
        """Return slice indices of last "consumed" match."""
        return self.lastmatch

    def get_match(self):
        """Return last "consumed" matching substring."""
        i, j = self.lastmatch
        if i is None:
            raise ValueError("no match to return!")
        return self.s[i:j]

    def set_pos(self, pos, consume=1):
        """Set search cursor to new value.  No return value.

        If optional arg "consume" is true, the last match is set to
        the slice between pos and the current cursor position.
        """

        p = _normalize_index(pos, len(self.s))
        if not self.lo <= p <= self.hi:
            raise ValueError("pos out of bounds: " + `pos`)
        if consume:
            self.__consume(p)
        else:
            self.i = p

    def move_pos(self, incr, consume=1):
        """Move the cursor by incr characters.  No return value.

        If the new value is outside the slice bounds, it's clipped.
        If optional arg "consume" is true, the last match is set to
        the slice between the old and new cursor positions.
        """

        newi = self.i + incr
        if newi < self.lo:
            newi = self.lo
        elif newi > self.hi:
            newi = self.hi
        if consume:
            self.__consume(newi)
        else:
            self.i = newi

    def __consume(self, newi):
        i, j = self.i, newi
        if i > j:
            i, j = j, i
        self.lastmatch = i, j
        self.i = newi


From tim_one@email.msn.com  Thu Dec 30 06:09:14 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 30 Dec 1999 01:09:14 -0500
Subject: [Python-Dev] Fixed-decimal types
In-Reply-To: <199912231944.OAA23337@eric.cnri.reston.va.us>
Message-ID: <000201bf528c$657c3080$a02d153f@tim>

[Guido]
> ...
> Not arguing for this interpretation, just indicating that doing
> fixed precision arithmetic right is hard.

It's not so much hard as it is arbitrary.  The floating-point world is
standardized now, but the fixed-point world remains a mish-mash of
incompatible legacy schemes carried across generations of products for no
reason other than product-specific compatibility.  So despite that
fixed-point has a specialty audience, whatever rules Python chooses will
leave it incompatible with much of that audience's (mixed!) expectations.

If fixed-point is needed, and my FixedPoint.py isn't good enough (all other
fixed point pkgs I've seen for Python were braindead), then it should be
implemented such that developers can control both rounding and precision
propagation.  I'll attach suitable kernels; they haven't been tested but any
bugs discovered will be trivial to fix (there are no difficulties here, but
typos are likely); the kernels supply the bulk of what's required, whether
implemented in Python or C; various packages can wrap them to supply
whatever policies they like; see FixedPoint.py for exact string<->FixedPoint
and exact float->FixedPoint conversions; and that's the end of my
involvement in fixed-point <wink>.

Python should certainly *not* add a "scale factor" to its current long
implementation; fixed-point should be a distinct type, as scale-factor
fiddling is clumsy and pervasive (long arithmetic is challenging enough to
get correct and quick without this obfuscating distraction; and by leaving
scale factors out of it, it's much easier to plug in alternative bigint
implementations (like GMP)).

One other point:  some people are going to want BCD (binary-coded decimal),
which suffers the same mish-mash of legacy policies, but with a different
data representation.  The point is that many commercial applications spend
much more time doing I/O conversions than arithmetic, and BCD accepts slow
arithmetic (in the absence of special HW support) in return for fast scaling
& I/O conversion.

Forgetting the database-heads for a moment, decimal *floating*-point is what
calculators do, so that's what "real people" are most comfortable with.  The
IEEE-854 std (IEEE-754's younger and friendlier brother) specifies that
completely.  Add a means to boost "global" precision (a la REXX), and it's a
powerful tool even for experts (benefits approximating those of unbounded
rational arithmetic but with bounded & user-controllable expense).
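
To make the "boost global precision" idea concrete, here is a small
sketch.  It assumes the decimal module found in present-day Python, which
did not exist when this was written; its context object plays the role of
the REXX-style precision setting, and one_seventh is a hypothetical helper.

```python
from decimal import Decimal, localcontext

def one_seventh(digits):
    """Return 1/7 as a decimal float carrying `digits` significant digits."""
    with localcontext() as ctx:
        ctx.prec = digits  # precision applies to every operation in the block
        return Decimal(1) / Decimal(7)

short = one_seventh(6)
long_ = one_seventh(20)
```

The same expression yields more digits simply because the surrounding
precision was raised; the arithmetic code itself does not change.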

can-never-have-too-many-numeric-types-but-always-have-
    too-few-literal-notations-ly y'rs  - tim


# Kernels for fixed-point decimal arithmetic.

# _add, _sub, _mul, _div all have arglist
#     n1, p1, n2, p2, p, round=DEFAULT_ROUND
# n1 and n2 are longs; p1, p2 and p ints >= 0.
# The inputs are exactly n1/10**p1 and n2/10**p2.
#
# The return value is the integer n such that n/10**p is the best
# approximation to the infinite-precision result.  In other words, p1
# and p2 are the input precisions and p is the desired output
# precision, where precision is the # of digits *after* the decimal
# point.
#
# What "best approximation" means is determined by the round function.
# In many cases rounding isn't required, but when it is
#     round(top, bot)
# is returned.  top and bot are longs, with bot > 0 guaranteed.  The
# infinite-precision result is top/bot.  round must return an integer
# (long) approximation to top/bot, using whichever rounding discipline
# you want.  By default, IEEE round-to-nearest/even is used; see the
# _roundXXX functions for examples of suitable rounding functions.
#
# Note:  The only code here that knows we're working in decimal is
# function _tento; simply change the "10L" in that to do fixed-point
# arithmetic in some other base.
#
# Example:
#
# >>> r7 = _div(1L, 0, 7L, 0, 20)  # 1/7
# >>> r7
# 14285714285714285714L
# >>> r5 = _div(1L, 0, 5L, 0, 20)  # 1/5
# >>> r5
# 20000000000000000000L
# >>> sum = _add(r7, 20, r5, 20, 20)  # 1/7 + 1/5 = 12/35
# >>> sum
# 34285714285714285714L
# >>> _mul(sum, 20, 35L, 0, 20)
# 1199999999999999999990L
# >>> _mul(sum, 20, 35L, 0, 18)
# 12000000000000000000L
# >>> _mul(sum, 20, 35L, 0, 0)
# 12L
# >>>

###################################################################
# Sample rounding functions.
###################################################################

# Round to minus infinity.

def _roundminf(top, bot):
    assert bot > 0
    return top / bot

# Round to plus infinity.

def _roundpinf(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    if r:
        q = q + 1
    return q

# IEEE nearest/even rounding (closest integer; in case of tie closest
# even integer).

def _roundne(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    c = cmp(r << 1, bot)
    # c < 0 <-> r < bot/2, etc
    if c > 0 or (c == 0 and (q & 1) == 1):
        q = q + 1
    return q

# "Add a half and chop" rounding (remainder < 1/2 toward 0; remainder
# >= half away from 0).

def _roundhalf(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    c = cmp(r << 1, bot)
    # c < 0 <-> r < bot/2, etc
    if c > 0 or (c == 0 and q >= 0):
        q = q + 1
    return q

# Round toward 0 (throw away remainder).

def _roundchop(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    if r and q < 0:
        q = q + 1
    return q

###################################################################
# Kernels for + - * /.
###################################################################

DEFAULT_ROUND = _roundne

def _add(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    # (n1/10**p1 + n2/10**p2) * 10**p ==
    # (n1*10**(max-p1) + n2*10**(max-p2))/10**max * 10**p
    max = p1    # until proven otherwise
    if p1 < p2:
        n1 = n1 * _tento(p2 - p1)
        max = p2
    elif p2 < p1:
        n2 = n2 * _tento(p1 - p2)
    n3 = n1 + n2
    p3 = p - max
    if p3 > 0:
        n3 = n3 * _tento(p3)
    elif p3 < 0:
        n3 = round(n3, _tento(-p3))
    return n3

def _sub(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    return _add(n1, p1, -n2, p2, p, round)

def _mul(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    # (n1/10**p1 * n2/10**p2) * 10**p ==
    # (n1*n2)/10**(p1+p2) * 10**p
    n3 = n1 * n2
    p3 = p - p1 - p2
    if p3 > 0:
        n3 = n3 * _tento(p3)
    elif p3 < 0:
        n3 = round(n3, _tento(-p3))
    return n3

def _div(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    if n2 == 0:
        raise ZeroDivisionError("scaled integer")
    # ((n1/10**p1) / (n2/10**p2)) * 10**p ==
    # (n1/n2) * 10**(p2-p1+p)
    p3 = p2 - p1 + p
    if p3 > 0:
        n1 = n1 * _tento(p3)
    elif p3 < 0:
        n2 = n2 * _tento(-p3)
    if n2 < 0:
        n1 = -n1
        n2 = -n2
    return round(n1, n2)

def _tento(i, _cache={}):
    assert i >= 0
    try:
        return _cache[i]
    except KeyError:
        answer = _cache[i] = 10L ** i
        return answer
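
# The kernels above are written for Python 1.5 (long literals, cmp, and
# "/" as floor division on integers).  As a hedged sketch of how the
# division kernel might look on a present-day Python 3, where ints are
# unbounded and "//" or divmod give floor division; round_ne and
# fixed_div are hypothetical names used for this illustration only:

```python
def round_ne(top, bot):
    """Round top/bot to the nearest integer, ties going to the even one."""
    assert bot > 0
    q, r = divmod(top, bot)       # exact value is q + r/bot, 0 <= r < bot
    twice = 2 * r
    if twice > bot or (twice == bot and q % 2 == 1):
        q += 1
    return q

def fixed_div(n1, p1, n2, p2, p, round=round_ne):
    """Return n with n/10**p approximating (n1/10**p1) / (n2/10**p2)."""
    if n2 == 0:
        raise ZeroDivisionError("scaled integer")
    shift = p2 - p1 + p
    if shift > 0:
        n1 *= 10 ** shift
    elif shift < 0:
        n2 *= 10 ** -shift
    if n2 < 0:                    # keep the denominator positive for round
        n1, n2 = -n1, -n2
    return round(n1, n2)

r7 = fixed_div(1, 0, 7, 0, 20)    # 1/7 to 20 places
half = fixed_div(5, 1, 1, 0, 0)   # 0.5 rounded to 0 places (tie -> even)
```

# The power-of-ten cache is dropped here for brevity; the rounding
# function is still passed in, so the policy stays with the caller.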


From fredrik@pythonware.com  Thu Dec 30 11:05:45 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 30 Dec 1999 12:05:45 +0100
Subject: [Python-Dev] Better text processing support in py2k?
References: <000001bf528c$5cbdb9a0$a02d153f@tim>
Message-ID: <012d01bf52b5$d6a3cb00$f29b12c2@secret.pythonware.com>

Tim Peters is back from his vacation:
> > While I don't want to turn Python into Perl, I would like to see
> > it do a better job of what most people probably use the language
> > for.  Here is a very short list of things I think need attention:
> >
> >     1. [*A* clear way to do memory- and time-efficient textfile
> >         input]
> 
> I agree, but unsure how to fix it.  The best way to write this now is
> 
>     # f is some open file object.
>     while 1:
>         lines = f.readlines(BUFSIZE)
>         if not lines:
>             break
>         for line in lines:
>             process(line)
> 
> and it's not something anyone figures out on their own -- or enjoys typing
> or explaining afterwards.
> 
> Perl gets its line-at-a-time speed by peeking and poking C FILE structs
> directly in compiler- and platform-specific ways -- ways that vendors
> *should* have done in their own fgets implementations, but almost never do.
> I have no idea whether it works well with Perl's nascent notions of
> threading, but in the absence of that "the system" doesn't know Perl is
> cheating (i.e., as far as libc+friends are concerned, Perl *is* reading one
> line at a time -- even mixing in C-level ungetc calls works (well, sometimes
> <0.1 wink -- they don't always peek and poke enough fields>)).
> 
> The Python QIO extension module is much easier to port but less compatible
> (it doesn't use stdio, so QIO-opened files don't play well with others) and
> slower (although that's likely repairable -- he's got two passes over the
> buffer where one hairier pass should suffice).

we have something called SIO which uses memory mapping
where possible, and just a more aggressive read-ahead for
other cases.  on a windows box, a traditional while/readline
loop runs 3-5 times faster than before.  with SRE instead of
re, a while/readline/match loop runs up to 10 times faster
than before.

note that this is without *any* changes to the Python
source code...

> >     2. The re module needs to be sped up, if not to catch up with
> >        Perl, then to catch up with the deprecated regex module.
> 
> The irony here is that the re engine is very often unboundedly faster than
> the regex engine -- provided you're chewing over large strings.  Some tests
> /F ran showed that the length-independent *overhead* of invoking re is about
> 10x higher than for regex.  Presumably the bulk of that is due to re.py,
> i.e. that you get to the re engine via going thru Python layers on your way
> in and out, while regex was pure C.

I've attached some old benchmarks.  I think the current code
base is a bit faster, but you get the idea.

> In any case, /F is working on a new engine (for Unicode), and I believe he
> has this all well in hand.

with a little luck, the new module will replace both pcre
and regex...

not to mention that it's fairly easy to write your own front-
end to the matching engine -- the expression parser and the
compiler are both written in good old python.

</F>

$ python sre_bench.py
          0     5    50   250  1000  5000 25000
----- ----- ----- ----- ----- ----- ----- -----
search for Python|Perl in Perl ->
sre8  0.007 0.008 0.010 0.010 0.020 0.073 0.349
sre16 0.007 0.007 0.008 0.010 0.020 0.075 0.353
re    0.097 0.097 0.101 0.103 0.118 0.175 0.480
regex 0.007 0.007 0.009 0.020 0.059 0.271 1.320

search for (Python|Perl) in Perl ->
sre8  0.007 0.007 0.007 0.010 0.020 0.074 0.344
sre16 0.007 0.007 0.008 0.010 0.020 0.074 0.347
re    0.110 0.104 0.111 0.115 0.125 0.184 0.559
regex 0.006 0.006 0.009 0.019 0.057 0.285 1.432

search for Python in Python ->
sre8  0.007 0.007 0.007 0.011 0.021 0.072 0.387
sre16 0.007 0.007 0.008 0.010 0.022 0.082 0.365
re    0.107 0.097 0.105 0.102 0.118 0.175 0.511
regex 0.009 0.008 0.010 0.018 0.036 0.139 0.708

search for .*Python in Python ->
sre8  0.008 0.007 0.008 0.011 0.021 0.079 0.379
sre16 0.008 0.008 0.008 0.011 0.022 0.075 0.402
re    0.102 0.108 0.119 0.183 0.400 1.545 7.284
regex 0.013 0.019 0.072 0.318 1.231 8.035 45.366

search for .*Python.* in Python ->
sre8  0.008 0.008 0.008 0.011 0.021 0.080 0.383
sre16 0.008 0.008 0.008 0.011 0.021 0.079 0.395
re    0.103 0.108 0.119 0.184 0.418 1.685 8.378
regex 0.013 0.020 0.073 0.326 1.264 9.961 46.511

search for .*(Python) in Python ->
sre8  0.007 0.008 0.008 0.011 0.021 0.077 0.378
sre16 0.007 0.008 0.008 0.011 0.021 0.077 0.444
re    0.108 0.107 0.134 0.240 0.637 2.765 13.395
regex 0.026 0.112 3.820 87.322 (skipped)

search for .*P.*y.*t.*h.*o.*n.* in Python ->
sre8  0.010 0.010 0.014 0.031 0.093 0.419 2.212
sre16 0.010 0.011 0.014 0.030 0.093 0.419 2.292
re    0.112 0.121 0.195 0.521 1.747 8.298 40.877
regex 0.026 0.048 0.248 1.148 4.550 24.720 ...

(searching for patterns in padded strings; sre8
is the sre engine compiled for 8-bit characters,
sre16 is the same engine compiled for 16-bit
characters)


From mal@lemburg.com  Thu Dec 30 11:52:50 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 30 Dec 1999 12:52:50 +0100
Subject: [Python-Dev] Better text processing support in py2k?
References: <000001bf528c$5cbdb9a0$a02d153f@tim>
Message-ID: <386B4792.A551022A@lemburg.com>

Tim Peters wrote:
> 
> [Skip Montanaro, wants nicer text facilities]
> > While I don't want to turn Python into Perl, I would like to see
> > it do a better job of what most people probably use the language
> > for.  Here is a very short list of things I think need attention:
> >
> >     1. [*A* clear way to do memory- and time-efficient textfile
> >         input]
>
> ...
> 
> The Python QIO extension module is much easier to port but less compatible
> (it doesn't use stdio, so QIO-opened files don't play well with others) and
> slower (although that's likely repairable -- he's got two passes over the
> buffer where one hairier pass should suffice).

What is QIO ?
 
> > Depending how far people want to go with things, adding some
> > language syntax to support regular expressions might be in order.
> > ...
> >     3. I've not yet used it, but I am told the pattern matching in
> >        Marc-Andre Lemburg's mxTextTools
> >       (http://starship.python.net/crew/lemburg/)
> >        is both powerful and efficient (though it certainly appears
> >        complex).  Perhaps it deserves consideration for
> >        incorporation into the core Python distribution.
> 
> It's not complex, it's complicated -- and *that's* what makes it un-Pythonic
> <wink>.  Tony Ibbs has written a friendly wrapper around mxTextTools that
> suppresses much of the non-essential complication.  OTOH, if you go into
> this with a regexp mindset, it will run much slower than a real regexp
> package, because the bulk of the latter is devoted to doing optimization;
> mxTextTools is WYSIWYG (it screams if you code to its strengths, but crawls
> if you e.g. try to implement naive backtracking).

All true. mxTextTools provides the tools, not the magic. But this
is also its strength: you can optimize the hell out of your particular
parsing requirement without having to think about how the RE optimizer
works.

> You should go to the REBOL site and look at the description of REBOL's PARSE
> verb in the FAQ ... mumble, mumble ... at
> 
>     http://www.rebol.com/faq.html#11550948
> 
> Here's an example pulled from that page (this is a REBOL code fragment):
> 
>     digit: charset "0123456789"
>     expr: [term ["+" | "-"] expr | term]
>     term: [factor ["*" | "/"] term | factor]
>     factor: [primary "**" factor | primary]
>     primary: [value | "(" expr ")"]
>     value: [digit value | digit]
> 
>     parse "1 + 2 ** 9" expr
> 
> There hasn't been a pattern scheme this clean, convenient or powerful since
> SNOBOL4.  It exploits REBOL's Forth-like (lack of!) syntax, and
> Smalltalk-like penchant for passing around thunks (anonymous closures --
> "[...]" in REBOL builds a lexically-scoped entity called "a block", which
> can be treated as code (executed) or data (manipulated like a Python list)
> at will).

Looks nice indeed, but how does executable code fit into
that definition ? (mxTextTools allows you to write your own
parsing elements in Python, BTW; it should be possible to
use those mechanisms to achieve a similar integration.)
 
> ...
>
> BTW, the mxTextTools engine could be used to get blazing implementations of
> the primary Searcher methods (it excels at simple analysis).  OTOH, making
> lots of calls to analyze short strings is slow.

That's why mxTextTools converts these search idioms into byte codes
which it executes at C level. Some future version will even "precompile"
the tuple input and then omit the type checks during the search...
that should give another noticeable speedup. Note that recursion
etc. can be done at C level too -- Python function calls are not
needed.

> The only clean solutions to
> that are Perl's and Icon's (build everything into one language so the
> compiler can optimize stuff away), and REBOL's (make no distinction between
> code and data, so that code can be analyzed & optimized at runtime -- and
> build the entire implementation around making closures and calls
> supernaturally fast).

Just for kicks, here is the mysplit() function using mxTextTools:

from mx.TextTools import *

table = (
    # Match all whitespace
    (None,AllInSet,whitespace_set,+1),
    # Match and tag all non-whitespace
    ('text',AllInSet + AppendMatch,nonwhitespace_set,+1),
    # Loop until EOF
    (None,EOF,Here,-2),
    )

def mysplit(text):

    return tag(text,table)[1]

The timings:
 mysplit: 5.84 sec.
 string.split: 3.62 sec.

Note that you can customize the above to split text at any
character set you like, not just whitespace... without
compiling or writing C code. The function mx.TextTools.setsplit()
provides this functionality as a pure C function.
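
For comparison, splitting at an arbitrary character set can be sketched
in plain Python as below. This is not how mxTextTools does it;
split_on_set is a hypothetical helper written for a present-day
Python 3, and it will be far slower than a C version.

```python
def split_on_set(text, separators):
    """Split text at runs of any character found in separators."""
    seps = frozenset(separators)
    pieces = []
    start = None                  # index where the current piece began
    for i, ch in enumerate(text):
        if ch in seps:
            if start is not None:
                pieces.append(text[start:i])
                start = None
        elif start is None:
            start = i
    if start is not None:         # flush a trailing piece
        pieces.append(text[start:])
    return pieces

words = split_on_set("   \t Hey,   that's\tpretty\n\n neat!  ", " \t\n")
fields = split_on_set("a,b;;c", ",;")
```

Like the tagging table above, it makes a single pass over the text and
never backtracks.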

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                            Get ready to party !
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From jim@interet.com  Thu Dec 30 14:21:36 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Thu, 30 Dec 1999 09:21:36 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <386a582d.6762574@pipmail.dknet.dk>
Message-ID: <386B6A70.3C9A0042@interet.com>

Finn Bock wrote:
> 
> James C. Ahlstrom wrote:
> 
> >  ftp://ftp.interet.com/pub/pylib.html
> 
> I feel that it smells a bit too much like a tool and too little like a general
> programming API.

It was meant to be an API except for writepy(), which is clearly a tool.
 
> - It can only add disk files. The ability to write data to a zip entry through
>   a file-like object or from a string would make it more like an API, IMHO

I could add a method
     writestr(self, string, year, month, day, hour, minute, second, ...)
There are a lot of fields required which usually come from the file.
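
For illustration only, here is a minimal sketch of what a string-writing call
could look like. It is written against the zipfile module found in
present-day Python's standard library, which is an assumption here and not
the pylib code under discussion; the helper name write_string_entry is
hypothetical. The date fields that normally come from the file on disk are
passed in explicitly.

```python
import io
import zipfile


def write_string_entry(archive, name, data, date_time):
    """Store `data` under `name`, stamping the entry with `date_time`.

    `date_time` is a (year, month, day, hour, minute, second) tuple, i.e.
    the fields that would otherwise be read from a file on disk.
    """
    info = zipfile.ZipInfo(name, date_time=date_time)
    archive.writestr(info, data)


# Build an archive in memory rather than on disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    write_string_entry(archive, "hello.txt", "hello world\n",
                       (1999, 12, 30, 9, 21, 36))

# Read the entry back, including its table-of-contents fields.
with zipfile.ZipFile(buffer) as archive:
    entry = archive.getinfo("hello.txt")
    print(entry.date_time, entry.file_size, archive.read("hello.txt"))
```

Passing the timestamp as one tuple keeps the signature short; the remaining
per-entry fields could be set as attributes on the info object.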

> -  Some kind of access to the TOC entry fields (date, size, compressed
>   size etc) also seems like a nice feature.

This access is provided directly by self.TOC, and the fields are
documented.

> - The data for an entry must be available in memory. Could be a problem
>   for huge files, but most likely not in practical use.

I agree, but adding loops will make it slower.  What do others think?
 
> I admit that I am fond of the api from java.util.zip.ZipFile and
> java.util.zip.ZipOutputStream.

I don't know this API.  If writestr() is not sufficient, what
API would you like?

JimA


From bckfnn@pipmail.dknet.dk  Thu Dec 30 19:14:14 1999
From: bckfnn@pipmail.dknet.dk (Finn Bock)
Date: Thu, 30 Dec 1999 19:14:14 GMT
Subject: [Python-Dev] zipfile.py
In-Reply-To: <386B6A70.3C9A0042@interet.com>
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <386a582d.6762574@pipmail.dknet.dk> <386B6A70.3C9A0042@interet.com>
Message-ID: <386baec9.2867733@pipmail.dknet.dk>

[I wrote]

> - It can only add disk files. The ability to write data to a zip entry through
>   a file-like object or from a string would make it more like an API, IMHO

[JimA wrote]

>I could add a method
>     writestr(self, string, year, month, day, hour, minute, second, ...)
>There are a lot of fields required which usually come from the file.

Something like that seems fine to me. 

[I wrote]

> -  Some kind of access to the TOC entry fields (date, size, compressed
>   size etc) also seems like a nice feature.

[JimA answers]

>This access is provided directly by self.TOC, and the fields are
>documented.

Good enough. My bad, I was looking for getter methods. (me being a java dude)

[I wrote]

> I admit that I am fond of the api from java.util.zip.ZipFile and
> java.util.zip.ZipOutputStream.

[JimA asks]

>I don't know this API.  If writestr() is not sufficient, what
>API would you like?

This is only meant as a source for inspiration, certainly not as a request for
change. writestr would answer my complaint nicely. Below, only one ZipEntry can
be actively read or written to at a time. All the small details of performance
and implementation complexity are ignored. 

class ZipFile:
    def getEntry(self, name):
        ...
        self.activeentry = ZipEntry(name)
        return self.activeentry

class ZipEntry:
    # Enough methods and fields to fake file-ness to casual users like me.
    def __init__(self, name): ...
    def write(self, str): ...
    def writelines(self, list): ...
    def read(self, size=None): ...
    def readlines(self, sizehint=-1): ...

    def seek(self, offset): ...
    def flush(self): ...
    def close(self): ...

    def getSize(self): ...
    def getCompressedSize(self): ...
    def getFlags(self): ...


regards,
finn


From tim_one@email.msn.com  Fri Dec 31 03:35:18 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 30 Dec 1999 22:35:18 -0500
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <386B4792.A551022A@lemburg.com>
Message-ID: <000001bf5340$0fb20300$e12d153f@tim>

[M.-A. Lemburg]
> What is QIO ?

See DejaNews (I don't save URLs).  "Quick" line-oriented text input adapted
from INN.  Someone rewrote that as a Python extension module.

>>     http://www.rebol.com/faq.html#11550948

> Looks nice indeed, but how does executable code fit into
> that definition ?

See the URL above I didn't save <wink>.  PARSE's "pattern" argument is a
block.  Blocks can be (& often are) nested.  Whether any given block is code
or data is all the same to REBOL, so passing nested code blocks in PARSE's
pattern argument is easy.  Because blocks are lexically scoped, assignments
(etc) inside a block are (well, can be) visible to its context; etc.  It's a
very Lispish approach.  REBOL is essentially Scheme under the covers, but
with syntax much more like Forth's (whitespace-separated strings of
arbitrary non-whitespace characters, with few pre-assigned meanings or
restrictions -- in fact, it's impossible for a compiler to determine where a
REBOL function call begins or ends!  can't be known until runtime).

> (mxTextTools allows you to write your own parsing elements
> in Python, BTW; it should be possible to use those mechanisms
> to achieve a similar integration.)

It can't capture the flavor -- although I don't know that it needs to
<wink>.  There's no distinction between "the pattern language" and "the
computational language" in REBOL or Icon, and it's hard to explain what a
maddening distinction that can be once you've lived without it.  mxTextTools
embedding would feel more like Icon, where the matching engine is fully
exposed to the programmer (REBOL hides it, allowing only "approved"
interactions).

>> OTOH, making lots of calls to analyze short strings is slow.

> That's why mxTextTools converts these search idioms into byte
> codes which it executes at C level. Some future version will
> even "precompile" the tuple input and then omit the type checks
> during the search...that should give another noticeable speedup.
> Note that recursion etc. can be done at C level too -- Python
> function calls are not needed.

That's also the curse of having distinct languages; e.g., Python already had
recursion, but you needed to reimplement it in a different way with
different syntax and different rules in your pattern language.  In Icon etc,
there's no difference between a recursive pattern and a recursive function,
except in *what* it computes.  The machinery is all the same, and both more
powerful and easier to learn because of that.

> ...
> Just for kicks, here is the mysplit() function using mxTextTools:
>
> from mx.TextTools import *
>
> table = (
>     # Match all whitespace
>     (None,AllInSet,whitespace_set,+1),
>     # Match and tag all non-whitespace
>     ('text',AllInSet + AppendMatch,nonwhitespace_set,+1),
>     # Loop until EOF
>     (None,EOF,Here,-2),
>     )
>
> def mysplit(text):
>
>     return tag(text,table)[1]
>
> The timings:
>  mysplit: 5.84 sec.
>  string.split: 3.62 sec.
>
> Note that you can customize the above to split text at any
> character set you like, not just whitespace... without
> compiling or writing C code.

That's equally true of the example I posted <wink>.  Now what if I wanted to
stop splitting right after I find a keyword, recognized as such because it's
a key in some passed-in dictionary?  In my example, I make an obvious local
code change, from

    while s.notmany(white):  # consume non-whitespace
        result.append(s.get_match())
        s.many(white)

to

    while s.notmany(white):  # consume non-whitespace
        word = s.get_match()
        result.append(word)
        if dictionary.has_key(word):
            break
        s.many(white)

What does it do to your example?  Or what if the target string isn't "a
string" (the code I posted only assumes the "str" object responds to
indexing and slicing -- any buffer object is fine -- so my example doesn't
change at all)?  Or what if you need to pass the tokens on as they're found,
pipeline style?  Etc.  This is why I do complex string processing in Icon
<0.9 wink>.
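
To make the fragments above concrete, here is a self-contained sketch of a
scanner with many(), notmany() and get_match() methods. The class body is a
hypothetical reconstruction, assumed for illustration and not the code
posted earlier in the thread; the keywords parameter and the default
whitespace string are likewise assumptions. It is written for present-day
Python, so the dictionary test uses "in" rather than has_key().

```python
class Searcher:
    """Toy cursor over a string; an assumed design, for illustration."""

    def __init__(self, data):
        self.data = data
        self.pos = 0
        self.match = data[0:0]

    def _scan(self, charset, want_member):
        # Advance while membership in `charset` equals `want_member`.
        start = self.pos
        while (self.pos < len(self.data)
               and (self.data[self.pos] in charset) == want_member):
            self.pos += 1
        self.match = self.data[start:self.pos]
        return self.pos > start

    def many(self, charset):
        """Consume one or more characters that are in `charset`."""
        return self._scan(charset, True)

    def notmany(self, charset):
        """Consume one or more characters that are not in `charset`."""
        return self._scan(charset, False)

    def get_match(self):
        """Return the slice consumed by the last successful scan."""
        return self.match


def mysplit(data, keywords=(), white=" \t\n\r\f\v"):
    """Split on whitespace, stopping right after the first keyword."""
    s = Searcher(data)
    result = []
    s.many(white)                # skip leading whitespace
    while s.notmany(white):      # consume non-whitespace
        word = s.get_match()
        result.append(word)
        if word in keywords:
            break
        s.many(white)
    return result


print(mysplit("  alpha beta  stop gamma ", {"stop": 1}))
```

The early exit is an ordinary if/break in the caller's own loop, which is
the point being argued: control flow stays in plain Python.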

OTOH, at what it does well, mxTextTools runs quicker than Icon.  Its biggest
problem has always been that e.g. nobody knows what the hell

     (None,EOF,Here,-2),

*means* at first glance -- or third <wink>.

an-extreme-on-the-transparency-vs-speed-curve-ly y'rs  - tim


From mal@lemburg.com  Fri Dec 31 11:18:57 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 31 Dec 1999 12:18:57 +0100
Subject: [Python-Dev] Better text processing support in py2k?
References: <000001bf5340$0fb20300$e12d153f@tim>
Message-ID: <386C9121.E9D9DC01@lemburg.com>

Tim Peters wrote:
> 
> [M.-A. Lemburg]
> > What is QIO ?
> 
> See DejaNews (I don't save URLs).  "Quick" line-oriented text input adapted
> from INN.  Someone rewrote that as a Python extension module.

Ok, thanks.
 
> >>     http://www.rebol.com/faq.html#11550948
> 
> > Looks nice indeed, but how does executable code fit into
> > that definition ?
> 
> See the URL above I didn't save <wink>.  PARSE's "pattern" argument is a
> block.  Blocks can be (& often are) nested.  Whether any given block is code
> or data is all the same to REBOL, so passing nested code blocks in PARSE's
> pattern argument is easy.  Because blocks are lexically scoped, assignments
> (etc) inside a block are (well, can be) visible to its context; etc.  It's a
> very Lispish approach.  REBOL is essentially Scheme under the covers, but
> with syntax much more like Forth's (whitespace-separated strings of
> arbitrary non-whitespace characters, with few pre-assigned meanings or
> restrictions -- in fact, it's impossible for a compiler to determine where a
> REBOL function call begins or ends!  can't be known until runtime).

If I understand the concept correctly, I think Python could do
pretty much the same thing. The bummer is of course the need
for new keywords and byte codes (although these could be
split out into a separate text scanning engine). Using Python
function calls would slow down things to an extent that would
render the added functionality useless, well IMHO anyways ;-)

> > (mxTextTools allows you to write your own parsing elements
> > in Python, BTW; it should be possible to use those mechanisms
> > to achieve a similar integration.)
> 
> It can't capture the flavor -- although I don't know that it needs to
> <wink>.  There's no distinction between "the pattern language" and "the
> computational language" in REBOL or Icon, and it's hard to explain what a
> maddening distinction that can be once you've lived without it.  mxTextTools
> embedding would feel more like Icon, where the matching engine is fully
> exposed to the programmer (REBOL hides it, allowing only "approved"
> interactions).

Of course it's hard for a Turing Machine to capture the flavor
of any high level language :-) When you're programming
the mxTextTools Tagging Engine directly you feel like writing
assembler... but things are moving in the right direction:
Tony Ibbs has a nice meta-language and M.C. Fletcher his
SimpleParse to cover up these insufficiencies.
 
> >> OTOH, making lots of calls to analyze short strings is slow.
> 
> > That's why mxTextTools converts these search idioms into byte
> > codes which it executes at C level. Some future version will
> > even "precompile" the tuple input and then omit the type checks
> > during the search...that should give another noticeable speedup.
> > Note that recursion etc. can be done at C level too -- Python
> > function calls are not needed.
> 
> That's also the curse of having distinct languages; e.g., Python already had
> recursion, but you needed to reimplement it in a different way with
> different syntax and different rules in your pattern language.  In Icon etc,
> there's no difference between a recursive pattern and a recursive function,
> except in *what* it computes.  The machinery is all the same, and both more
> powerful and easier to learn because of that.

Agreed.
 
> > ...
> > Just for kicks, here is the mysplit() function using mxTextTools:
> >
> > from mx.TextTools import *
> >
> > table = (
> >     # Match all whitespace
> >     (None,AllInSet,whitespace_set,+1),
> >     # Match and tag all non-whitespace
> >     ('text',AllInSet + AppendMatch,nonwhitespace_set,+1),
> >     # Loop until EOF
> >     (None,EOF,Here,-2),
> >     )
> >
> > def mysplit(text):
> >
> >     return tag(text,table)[1]
> >
> > The timings:
> >  mysplit: 5.84 sec.
> >  string.split: 3.62 sec.
> >
> > Note that you can customize the above to split text at any
> > character set you like, not just whitespace... without
> > compiling or writing C code.
> 
> That's equally true of the example I posted <wink>.  Now what if I wanted to
> stop splitting right after I find a keyword, recognized as such because it's
> a key in some passed-in dictionary?  In my example, I make an obvious local
> code change, from
> 
>     while s.notmany(white):  # consume non-whitespace
>         result.append(s.get_match())
>         s.many(white)
> 
> to
> 
>     while s.notmany(white):  # consume non-whitespace
>         word = s.get_match()
>         result.append(word)
>         if dictionary.has_key(word):
>             break
>         s.many(white)
> 
> What does it do to your example? 

You'd replace the 'text' tagobj with a callable object and
write AllInSet + CallTag as command. The Tagging Engine will
then call the object with arguments (taglist,text,l,r,subtags)
and let it decide what to do.

In your example it would check the dictionary and raise an
exception in case a keyword is found to stop any further
scanning. If it's not a keyword, it would simply append
the found string to the taglist and return None.

Here's the code:

from mx.TextTools import *

import exceptions

stoplist = {'abc':1, 'def':1}

class KeywordFound(exceptions.StandardError):
    def __init__(self, taglist):
        self.taglist = taglist

def callable(taglist,text,l,r,subtags):

    taglist.append(text[l:r])
    if stoplist.has_key(text[l:r]):
        raise KeywordFound(taglist)

table = (
    # Match all whitespace
    (None,AllInSet,whitespace_set,+1),
    # Match and tag all non-whitespace
    (callable,AllInSet + CallTag,nonwhitespace_set,+1),
    # Loop until EOF
    (None,EOF,Here,-2),
    )

def mysplitex(text):

    try:
        return tag(text,table)[1]
    except KeywordFound,data:
        return data.taglist

> Or what if the target string isn't "a
> string" (the code I posted only assumes the "str" object responds to
> indexing and slicing -- any buffer object is fine -- so my example doesn't
> change at all)? 

The current version only handles string objects, but I am
already beginning to convert all the APIs in mxTextTools to
"s#" or "t#" style (can't decide which to use... "s#" is great
for processing raw data, while "t#" more closely refers to
text processing).

> Or what if you need to pass the tokens on as they're found,
> pipeline style?  Etc.  This is why I do complex string processing in Icon
> <0.9 wink>.

You can have all that extra magic via callable tag objects
or callable matching functions. It's not exactly nice to
write, but I'm sure that a meta-language could do the 
conversions for you.
 
> OTOH, at what it does well, mxTextTools runs quicker than Icon.  Its biggest
> problem has always been that e.g. nobody knows what the hell
> 
>      (None,EOF,Here,-2),
> 
> *means* at first glance -- or third <wink>.

The structure of those tag tables is very simple:

(tagobject, command, argument[, jump offset in case of failure
                            [, jump offset in case of success]])
                               
Please remember that this is byte code, not some higher level
abstraction. The design is very much inverted from what you'd
usually do: design a nice language and then try to find a suitable
set of byte codes to make it work as intended.
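
As a rough illustration of how such a table can be read, here is a toy
interpreter in plain Python. It is a sketch under stated assumptions and not
the mxTextTools engine: commands are plain strings, the "set" argument is a
predicate function, tags are simple (tagobject, left, right) tuples, a
missing failure offset is taken to mean "the whole match fails", and a
missing success offset is taken to be +1. The names run_table and toy_split
are hypothetical.

```python
def run_table(text, table):
    """Interpret a toy tag table; return a list of (tagobject, left, right)."""
    taglist = []
    pos = 0      # current position in the text
    index = 0    # current table entry; running off the end means "done"
    while 0 <= index < len(table):
        entry = table[index]
        tagobj, command, argument = entry[:3]
        on_failure = entry[3] if len(entry) > 3 else None
        on_success = entry[4] if len(entry) > 4 else 1
        start = pos
        if command == "AllInSet":
            # Match one or more characters accepted by the predicate.
            while pos < len(text) and argument(text[pos]):
                pos += 1
            matched = pos > start
        elif command == "EOF":
            matched = pos >= len(text)
        else:
            raise ValueError("unknown command: %r" % (command,))
        if matched:
            if tagobj is not None:
                taglist.append((tagobj, start, pos))
            index += on_success
        elif on_failure is None:
            return None          # no failure offset given: give up
        else:
            index += on_failure
    return taglist


table = (
    # Match all whitespace; on failure just move on to the next entry
    (None, "AllInSet", str.isspace, +1),
    # Match and tag all non-whitespace
    ("text", "AllInSet", lambda ch: not ch.isspace(), +1),
    # Loop until EOF: on failure jump two entries back
    (None, "EOF", None, -2),
)


def toy_split(text):
    return [text[left:right] for _tag, left, right in run_table(text, table)]


print(toy_split("  ab cd"))
```

Read this way, the last entry says: if the end of the text has not been
reached, go back two entries and scan for whitespace again.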

Anyway, I'll keep focussing on the speed aspect of mxTextTools;
others can focus on abstractions, so that eventually everybody
will be happy :-)

Happy New Year,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                            Get ready to party !
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From tim_one@email.msn.com  Fri Dec 31 22:53:49 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Fri, 31 Dec 1999 17:53:49 -0500
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <012d01bf52b5$d6a3cb00$f29b12c2@secret.pythonware.com>
Message-ID: <000701bf53e1$e7119760$472d153f@tim>

[Fredrik Lundh, whose very nice eMatter book is on sale until
  the end of the 20th century (as real people think of it),
  although the eMatter distribution scheme has lots of problems
  [just an editorial note from a bot who has to-- for unknown
   reasons Fatbrain "is working on" --delete the Fatbrain
   registry tree and reregister the book almost every time he
   tries to open it <wink>
  ]
]

> we have something called SIO which uses memory mapping
> where possible, and just a more aggressive read-ahead for
> other cases.  on a windows box, a traditional while/readline
> loop runs 3-5 times faster than before.  with SRE instead of
> re, a while/readline/match loop runs up to 10 times faster
> than before.
>
> note that this is without *any* changes to the Python
> source code...

If so, there's potential for significantly more speed.  Python does its
line-at-a-time input with a character-at-a-time macro-in-a-loop, the same
way naive vendors (read "almost all vendors") implement fgets.  It's
replacing that inner loop with direct peeking into the FILE buffer that gets
Perl its dramatic speed -- despite that Perl has fancier input functionality
(the oft-requested automagical "input record separator").  So it sounds like
the Perl trick is orthogonal to SIO's tricks; Perl isn't doing mmaps or
read-aheads or anything else fancy under the covers -- it only optimizes the
inner loop!
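
The difference between the two inner loops can be sketched in Python. This
is only an analogy, assumed for illustration: the real trick happens in C
against the stdio FILE buffer, and the names readline_char_at_a_time and
BufferPeekingReader are hypothetical. The first function asks the stream for
one byte per iteration; the second searches an already-filled buffer for the
newline in a single call.

```python
import io


def readline_char_at_a_time(stream):
    """Build one line by pulling a single byte per call, like a getc() loop."""
    chunks = []
    while True:
        ch = stream.read(1)
        if not ch:
            break
        chunks.append(ch)
        if ch == b"\n":
            break
    return b"".join(chunks)


class BufferPeekingReader:
    """Find line ends by scanning a whole buffer at once."""

    def __init__(self, stream, bufsize=8192):
        self.stream = stream
        self.bufsize = bufsize
        self.buffer = b""

    def readline(self):
        while True:
            end = self.buffer.find(b"\n")    # one scan over the buffer
            if end >= 0:
                line = self.buffer[:end + 1]
                self.buffer = self.buffer[end + 1:]
                return line
            chunk = self.stream.read(self.bufsize)
            if not chunk:                    # end of input: flush the rest
                line, self.buffer = self.buffer, b""
                return line
            self.buffer += chunk


data = b"one\ntwo\nthree"
slow = io.BytesIO(data)
fast = BufferPeekingReader(io.BytesIO(data), bufsize=4)
for _ in range(4):
    print(readline_char_at_a_time(slow), fast.readline())
```

Both readers return the same lines; only the amount of per-character work in
the loop differs.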

> ...
> with a little luck, the new module will replace both pcre
> and regex...

If something more tangible than luck would help to make this come true, feel
free to mention it <wink>.

> not to mention that it's fairly easy to write your own front-
> end to the matching engine -- the expression parser and the
> compiler are both written in good old python.

Ah, good news / bad news.  Perl refugees aren't accustomed to "precompiling"
regexp objects, so write code that will cause regexps to get recompiled over
& over.  Even if you cache the results under the covers, the overhead of the
Python call to the regexp compiler will likely take as long as the engine
takes to search.

Personally, in such cases, I think they should learn how to use the language
<0.5 wink>.
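
A small sketch of the two habits, using the re module from present-day
Python (an assumption; SRE itself is not shown here, and the function names
are hypothetical). The first version compiles the pattern once and reuses
the object; the second hands the pattern string to the module on every call,
so each call pays at least for a lookup of the cached compiled pattern.

```python
import re

# Compiled once, when the module is loaded.
WORD = re.compile(r"\w+")


def count_words_precompiled(lines):
    """Reuse one compiled pattern object for every line."""
    return sum(len(WORD.findall(line)) for line in lines)


def count_words_recompiling(lines):
    """Pass the pattern string each time; the module must look it up again."""
    return sum(len(re.findall(r"\w+", line)) for line in lines)


lines = ["the quick brown fox", "jumps"]
print(count_words_precompiled(lines), count_words_recompiling(lines))
```

The results are identical; the difference is only in the per-call overhead,
which matters most when each string being searched is short.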


From tim_one@email.msn.com  Fri Dec 31 22:53:56 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Fri, 31 Dec 1999 17:53:56 -0500
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <386C9121.E9D9DC01@lemburg.com>
Message-ID: <000901bf53e1$eb4248c0$472d153f@tim>

>> This is why I do complex string processing in Icon <0.9 wink>.

[MAL]
> You can have all that extra magic via callable tag objects
> or callable matching functions. It's not exactly nice to
> write, but I'm sure that a meta-language could do the
> conversions for you.

That wasn't my point:  I do it in Icon because it *is* "exactly nice to
write", and doesn't require any yet-another meta-language.  It's all
straightforward, in a way that separate schemes pasted together can never be
(simply because they *are* "separate schemes pasted together" <wink>).

The point of my Python examples wasn't that they could do something
mxTextTools can't do, but that they were *Python* examples:  every variation
I mentioned (or that you're likely to think of) was easy to handle for any
Python programmer because the "control flow" and "data type" etc aspects
could be handled exactly the way they always are in *non* pattern-matching
Python code too, rather than recoded in pattern-scheme-specific different
ways (e.g., where I had a vanilla "if/break", you set up a special
exception to tickle the matching engine).

I'm not attacking mxTextTools, so don't feel compelled to defend it --
people using regexps in those examples are dead in the water.  mxTextTools
is very good at what it does; if we have a real disagreement, it's probably
that I'm less optimistic about the prospects for higher-level wrappers
(e.g., MikeF's SimpleParse is much slower than "a real" BNF parsing system
(ARBNFPS), in part because he isn't doing all the optimizations ARBNFPS
does, but also in part because ARBNFPS uses an underlying engine more
optimized to its specific task than mxTextTools' more-general engine *can*
be).  So I don't see mxTextTools as being the answer to everything -- and if
you hadn't written it, you would agree with that on first glance <wink>.

> Anyway, I'll keep focussing on the speed aspect of mxTextTools;
> others can focus on abstractions, so that eventually everybody
> will be happy :-)

You and I will be, anyway <wink>.


From petrilli at amber.org  Wed Dec  1 18:39:00 1999
From: petrilli at amber.org (Christopher Petrilli)
Date: Wed, 1 Dec 1999 12:39:00 -0500
Subject: [Python-Dev] Another 1.6 wish
In-Reply-To: <199912011732.MAA10419@eric.cnri.reston.va.us>; from guido@CNRI.Reston.VA.US on Wed, Dec 01, 1999 at 12:32:08PM -0500
References: <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com> <199912011732.MAA10419@eric.cnri.reston.va.us>
Message-ID: <19991201123900.A7419@trump.amber.org>

Guido van Rossum [guido at CNRI.Reston.VA.US] wrote:
> > >We had some discussion a while back about enabling thread support by
> > >default, if the underlying OS supports it obviously.  
> 
> I agree with this.  MacOS seems to be the only OS without threads
> these days.

I believe the new GUISI package has pthread-API compatible threads
implemented, which talk to the underlying ThreadManager.  With MacOSX
being impending before 1.6 (i.e. early 2000), I'd say this is a good
way to go.  Threads are VERY useful for a lot of problem domains.

Chris
-- 
| Christopher Petrilli
| petrilli at amber.org


From guido at CNRI.Reston.VA.US  Wed Dec  1 18:54:53 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 12:54:53 -0500
Subject: [Python-Dev] Another 1.6 wish
In-Reply-To: Your message of "Wed, 01 Dec 1999 12:39:00 EST."
             <19991201123900.A7419@trump.amber.org> 
References: <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com> <199912011732.MAA10419@eric.cnri.reston.va.us>  
            <19991201123900.A7419@trump.amber.org> 
Message-ID: <199912011754.MAA10465@eric.cnri.reston.va.us>

> > I agree with this.  MacOS seems to be the only OS without threads
> > these days.
> 
> I believe the new GUISI package has pthread-API compatible threads
> implemented, which talk to the underlying ThreadManager.  With MacOSX
> being impending before 1.6 (i.e. early 2000), I'd say this is a good
> way to go.  Threads are VERY useful for a lot of problem domains.

What's GUISI?  The son of GUSI?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido at CNRI.Reston.VA.US  Wed Dec  1 18:55:19 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 12:55:19 -0500
Subject: [Python-Dev] Another 1.6 wish
In-Reply-To: Your message of "Wed, 01 Dec 1999 12:32:08 EST."
             <199912011732.MAA10419@eric.cnri.reston.va.us> 
References: <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com>  
            <199912011732.MAA10419@eric.cnri.reston.va.us> 
Message-ID: <199912011755.MAA10476@eric.cnri.reston.va.us>

> > Also, we have a couple minor convenience functions for Python in an 
> > MSDEV environment, an exposure of OutputDebugString for writing to 
> > the DevStudio log window and a means of tripping DevStudio C/C++ layer
> > breakpoints from Python code (currently experimental).  The msvcrt 
> > module seems like a likely candidate for these, would these be 
> > welcome additions?
> 
> Sure -- send patches.

I hadn't seen Mark Hammond's response -- I take it back.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido at CNRI.Reston.VA.US  Wed Dec  1 19:15:26 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 13:15:26 -0500
Subject: [Python-Dev] Another 1.6 wish
In-Reply-To: Your message of "Sat, 20 Nov 1999 11:04:28 +1100."
             <005f01bf32ea$d0b82b90$0501a8c0@bobcat> 
References: <005f01bf32ea$d0b82b90$0501a8c0@bobcat> 
Message-ID: <199912011815.NAA10506@eric.cnri.reston.va.us>

> This is really a pointer to the fact that some or all of the win32api
> should be moved into the core - registry access is the thing people
> most want, but there are plenty of other useful things that people
> regularly use...
> 
> Guido objects to the coding style, but hopefully that won't be a big
> issue.  IMO, the coding style isn't "bad" - it is just more an "MS"
> flavour than a "Python" flavour - presumably people reading the code
> will have some experience with Windows, so it won't look completely
> foreign to them.  The good thing about taking it "as-is" is that it
> has been fairly well bashed on over a few years, so is really quite
> stable.  The final "coding style" issue is that there are no "doc
> strings" - all documentation is embedded in C comments, and extracted
> using a tool called "autoduck" (similar to "autodoc").  However, I'm
> sure we can arrange something there, too.

That's a good summary of the status quo.  I would appreciate it if
win32all could become part of the core.  However the coding style
issues need to be addressed (I also believe that it needs to be
compiled in C++ mode).  One concern that Mark doesn't mention is that
there are some safety issues -- you can abuse some of the calls to
cause segfaults, whether intentional or by mistake, and that's not a
good thing.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido at CNRI.Reston.VA.US  Wed Dec  1 19:55:40 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 13:55:40 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Your message of "Wed, 24 Nov 1999 09:43:57 EST."
             <383BF9AD.E183FB98@interet.com> 
References: <Pine.LNX.4.10.9911231549120.10639-100000@nebula.lyra.org>  
            <383BF9AD.E183FB98@interet.com> 
Message-ID: <199912011855.NAA10662@eric.cnri.reston.va.us>

> I would like to argue that on Windows, import of dynamic libraries is
> broken.  If a file something.pyd is imported, then sys.path is searched
> to find the module.  If a file something.dll is imported, the same thing
> happens.  But Windows defines its own search order for *.dll files which
> Python ignores.  I would suggest that this is wrong for files named
> *.dll, but OK for files named *.pyd.

I think you misunderstand some of the issues.

Python cannot import every .dll file.  Only .dll files that conform to
the convention for Python extension modules can be imported.  (The
convention is that it must export an init<module> function.)

On most other platforms, shared libraries must have a specific
extension (e.g. .so on most Unix).  Python allows you to drop such a
file into any directory where it looks for modules, and it will then
direct the dynamic load support to load that specific file.

This seems logical -- Python extensions must live in directories that
Python searches (Python must do its own search because the search
order is significant).

On Windows, Python uses the same strategy.  The only modification is
that it is allowed to give the file a different extension, namely
.pyd, to indicate that this really is a Python extension and not a
regular DLL.  This was mostly introduced because it is apparently
common to have an existing DLL "foo.dll" and write a Python wrapper
for it that is also called "foo".  Clearly, two files foo.dll are too
confusing, so we let you name the wrapper foo.pyd.  But because the
file format is essentially that of a DLL, we don't *require* this
renaming; some ways of creating DLLs in the first place may make it
difficult to do.
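
The search described above can be sketched as follows. This is a simplified
model, assumed for illustration and not the interpreter's actual import
code: the helper name find_extension is hypothetical, and trying ".pyd"
before ".dll" within one directory is an assumption. What it shows is that
the first directory on the search path containing a candidate file wins,
which is why the order of that path is significant.

```python
import os
import tempfile

# Suffixes tried in each directory; this order is assumed for illustration.
EXTENSION_SUFFIXES = (".pyd", ".dll")


def find_extension(name, search_path):
    """Return the first matching extension file on `search_path`, or None."""
    for directory in search_path:
        for suffix in EXTENSION_SUFFIXES:
            candidate = os.path.join(directory, name + suffix)
            if os.path.isfile(candidate):
                return candidate
    return None


with tempfile.TemporaryDirectory() as first, \
        tempfile.TemporaryDirectory() as second:
    # An existing DLL in the earlier directory, a wrapper in the later one.
    open(os.path.join(first, "foo.dll"), "w").close()
    open(os.path.join(second, "foo.pyd"), "w").close()
    found = find_extension("foo", [first, second])
    print(os.path.basename(found))
```

In a real import, the file found this way would still have to export the
module's init function before it could be loaded as an extension.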

> A SysAdmin should be able to install and maintain *.dll as she has
> been trained to do.  This makes maintaining Python installations
> simpler and more un-surprising.

I don't see that a SysAdmin needs to do much DLL management.  This is
up to installer scripts.  Anyway how hard can it be for a SysAdmin to
leave DLLs in specific directories alone?

> I have no solution to the backward compatibility problem.  But the
> code is only a couple lines.  A LoadLibrary() call does its own
> path searching.

But at what point should this LoadLibrary() call be called?  The
import statement contains no clue that a DLL is requested -- the
sys.path search reveals that.

I claim that there is nothing wrong with the current strategy.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bwarsaw at cnri.reston.va.us  Wed Dec  1 20:01:12 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Wed, 1 Dec 1999 14:01:12 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS log messages with diffs
References: <199911161700.MAA02716@eric.cnri.reston.va.us>
	<14389.31511.706588.20840@anthem.cnri.reston.va.us>
Message-ID: <14405.28792.184298.298597@anthem.cnri.reston.va.us>

>>>>> "BAW" == Barry A Warsaw <bwarsaw at cnri.reston.va.us> writes:

    BAW> There was a suggestion to start augmenting the checkin emails
    BAW> to include the diffs of the checkin.  This would let you keep
    BAW> a current snapshot of the tree without having to do a direct
    BAW> `cvs update'.

The voting has stopped, with the "yeah" vote slightly ahead of the
"nay" vote.  We'll go with context diffs, and we'll be implementing
Greg Stein's approach with the xml-checkins list: truncating diffs to
H number of lines at the top and T number of lines at the bottom, so
as not to overwhelm incoming email.
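
The truncation rule can be sketched in a few lines. This is an illustration
under stated assumptions, not the script used for the checkin list: the
function name truncate_diff, the default values and the wording of the
marker line are all hypothetical.

```python
def truncate_diff(lines, head=3, tail=2):
    """Keep the first `head` and last `tail` lines of an over-long diff."""
    if len(lines) <= head + tail:
        return list(lines)       # short enough: pass through unchanged
    omitted = len(lines) - head - tail
    marker = "[...%d lines suppressed...]" % omitted
    return list(lines[:head]) + [marker] + list(lines[len(lines) - tail:])


diff = ["line %d" % n for n in range(1, 9)]
for line in truncate_diff(diff):
    print(line)
```

Slicing the tail from len(lines) - tail rather than from -tail keeps the
function correct when tail is zero.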

I'll try to get this going sometime today (no promises).  You'll
likely see a number of tests coming through python-checkins in the
meantime.  I'll send a message out when it's done.

-Barry


From da at ski.org  Wed Dec  1 20:34:56 1999
From: da at ski.org (David Ascher)
Date: Wed, 1 Dec 1999 11:34:56 -0800 (Pacific Standard Time)
Subject: [Python-Dev] Re: [C++-SIG] Python calling C++ issues
In-Reply-To: <14405.25141.297349.76968@gargle.gargle.HOWL>
Message-ID: <Pine.WNT.4.04.9912011132130.222-100000@rigoletto.ski.org>

On Wed, 1 Dec 1999, Geoffrey Furnish wrote:

[...]

> Well, like I said above, I haven't analyzed your posts for technical
> details, so I can't say whether you made avoidable mistakes.  But I
> definitely do agree with you that it is roughly 100 times harder than
> it needs to be, to use Python from C++.  The charter of this sig is to 
> fix that, by developing the additional software that would allow
> Python's compiled interface to be exploited from C++ "with ease".
> 
> The first and most basic issue, is compiling Python so it initializes
> C++ global objects correctly.  There is a patch on the sig's www site
> to help with that.

Any opinions from this esteemed body re: integrating said patch in the
main tree?

--david


From jim at interet.com  Wed Dec  1 20:47:14 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Wed, 01 Dec 1999 14:47:14 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9911231549120.10639-100000@nebula.lyra.org>  
	            <383BF9AD.E183FB98@interet.com> <199912011855.NAA10662@eric.cnri.reston.va.us>
Message-ID: <38457B42.85552AC@interet.com>

Guido van Rossum wrote:
> 
> > I would like to argue that on Windows, import of dynamic libraries is
> > broken.  If a file something.pyd is imported, then sys.path is searched
> > to find the module.  If a file something.dll is imported, the same thing
> > happens.  But Windows defines its own search order for *.dll files which
> > Python ignores.  I would suggest that this is wrong for files named
> > *.dll, but OK for files named *.pyd.
> 
> I think you misunderstand some of the issues.
> 
> Python cannot import every .dll file.  Only .dll files that conform to
> the convention for Python extension modules can be imported.  (The
> convention is that it must export an init<module> function.)

Of course I meant that the test is LoadLibrary(module) followed
by GetProcAddress(h, "init" + module).  Both must succeed.

> This seems logical -- Python extensions must live in directories that
> Python searches (Python must do its own search because the search
> order is significant).

The PYTHONPATH search path is what I am trying to get away
from.  If I eliminate PYTHONPATH I still can not use the
Windows DLL search path (which is superior) because DLLs
are searched on PYTHONPATH too; thus my post.  I don't believe
it is important for Python module.dll to be located on PYTHONPATH.

> > A SysAdmin should be able to install and maintain *.dll as she has
> > been trained to do.  This makes maintaining Python installations
> > simpler and more un-surprising.
> 
> I don't see that a SysAdmin needs to do much DLL management.  This is
> up to installer scripts.  Anyway how hard can it be for a SysAdmin to
> leave DLLs in specific directories alone?

The problem is maintaining PYTHONPATH plus having DLL's on a
non-standard search path.  Yes, PythonDev[:] and professional
SysAdmins can do it.  But it is not as simple as it could be.
Someone has to write the install scripts.  And what if something
doesn't work?  Think of Python being used as a teaching language
for the 8th grade.  Think of the 8th grade teacher trying to get
all this right.  The only thing that works is simplicity.

> But at what point should this LoadLibrary() call be called?  The
> import statement contains no clue that a DLL is requested -- the
> sys.path search reveals that.

Just after built-in and frozen modules.

> I claim that there is nothing wrong with the current strategy.

Thank you for thoughtfully considering and commenting at length
on this issue.  Let's ignore it for the moment.  The other
problems with PYTHONPATH are more pressing.  But if those
issues are solved, this one will stick out.

JimA


From da at ski.org  Wed Dec  1 20:59:44 1999
From: da at ski.org (David Ascher)
Date: Wed, 1 Dec 1999 11:59:44 -0800 (Pacific Standard Time)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <38457B42.85552AC@interet.com>
Message-ID: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org>

On Wed, 1 Dec 1999, James C. Ahlstrom wrote:

> > This seems logical -- Python extensions must live in directories that
> > Python searches (Python must do its own search because the search
> > order is significant).
> 
> The PYTHONPATH search path is what I am trying to get away
> from.  If I eliminate PYTHONPATH I still can not use the
> Windows DLL search path (which is superior) because DLLs
> are searched on PYTHONPATH too; thus my post.  I don't believe
> it is important for Python module.dll to be located on PYTHONPATH.

Why is the DLL search path superior?  

In my experience, the DLL search path (PATH for short) is problematic
because it requires either using the System control panel or modifying
autoexec.bat, both of which can have massive systemic effects completely
unrelated to Python if a mistake is made during the modification.

On UNIX, the equivalent to Windows' PATH is typically LD_LIBRARY_PATH,
although I think there are significant variations in how that works across
platforms.  Most beginning unix users have no idea how to modify their
LD_LIBRARY_PATH, as they typically don't understand the configuration
mechanisms on Unix (system vs. user-specific, login vs. shell-specific,
different shell configuration languages, etc.).

I know it's not what you had in mind, but have you tried doing something
like:

  import sys, os, string
  sys.path.extend(string.split(os.environ['PATH'], ';'))

--david


From gmcm at hypernet.com  Wed Dec  1 21:19:13 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Wed, 1 Dec 1999 15:19:13 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org>
References: <38457B42.85552AC@interet.com>
Message-ID: <1268042932-41354568@hypernet.com>

David Ascher wrote:
> On Wed, 1 Dec 1999, James C. Ahlstrom wrote:
> 
> > > This seems logical -- Python extensions must live in
> > > directories that Python searches (Python must do its own
> > > search because the search order is significant).
> > 
> > The PYTHONPATH search path is what I am trying to get away
> > from.  If I eliminate PYTHONPATH I still can not use the
> > Windows DLL search path (which is superior) because DLLs are
> > searched on PYTHONPATH too; thus my post.  I don't believe it
> > is important for Python module.dll to be located on PYTHONPATH.
> 
> Why is the DLL search path superior?  
> 
> In my experience, the DLL search path (PATH for short) 

Make that:
 [ os.path.dirname(sys.executable),
   os.getcwd(),
   win32api.GetSystemDirectory(),
   os.path.join(win32api.GetSystemDirectory(), '../SYSTEM'), 
   win32api.GetWindowsDirectory()
 ] + string.split(os.environ['PATH'], ';')

> is
> problematic because it requires either using the System control
> panel or modifying autoexec.bat, both of which can have massive
> systemic effects completely unrelated to Python if a mistake is
> made during the modification.

Hear, hear!

[snip]


- Gordon


From jim at interet.com  Wed Dec  1 21:36:04 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Wed, 01 Dec 1999 15:36:04 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org>
Message-ID: <384586B4.48905B32@interet.com>

David Ascher wrote:

> Why is the DLL search path superior?
> 
> In my experience, the DLL search path (PATH for short) is problematic
> because it requires either using the System control panel or modifying
> autoexec.bat, both of which can have massive systemic effects completely
> unrelated to Python if a mistake is made during the modification.

I agree that altering PATH is problematic.  So is altering PYTHONPATH
and for exactly the same reason.  That is why I think PYTHONPATH is
a bad idea.

The reason the DLL search path is superior is that it is not just PATH.
It defines a path which includes the install directory of the
application plus the system directories, and this path is discovered
at runtime.  So it is not necessary to set a global PYTHONPATH, nor
make registry entries, nor do anything at all.  It Just Works.

The Windows DLL search path is:

1) The directory of the executable program.  That means you can just
   throw all your DLL's in with the *.exe's, and it all Just Works.

2) The current directory.  Also useful.

3) The Windows system directory (call GetSystemDirectory() to get this).
4) The Windows directory (call GetWindowsDirectory() to get this).

   These two directories are used for system files.  Think of /sbin,
   /bin.  Windows apps usually throw some of their DLL's here,
   especially if they are of general interest.

5) The directories in PATH.  This is relatively useless, and AFAIK it
   is seldom used in a real installation.  It is a left-over from DOS.
   That is also why it appears last.
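
A minimal sketch of the five-step order listed above, written in
present-day Python rather than anything that existed in 1999.  The
function names dll_search_dirs and find_dll are hypothetical, and the
directories are passed in as plain arguments instead of being obtained
from GetSystemDirectory() or GetWindowsDirectory(), so this only
illustrates the ordering and is not what Windows itself executes.

```python
import os


def dll_search_dirs(exe_dir, cwd, system_dir, windows_dir, path_value, sep=";"):
    """Return the search directories in the order described above.

    All arguments are supplied by the caller; nothing is queried from
    the operating system in this sketch.
    """
    dirs = [exe_dir, cwd, system_dir, windows_dir]
    # Step 5: the PATH entries come last; empty entries are skipped.
    dirs.extend(entry for entry in path_value.split(sep) if entry)
    return dirs


def find_dll(name, dirs):
    """Return the first existing file called `name` in `dirs`, or None."""
    for directory in dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

Because the executable's directory is first, a DLL placed next to the
*.exe shadows a same-named DLL in the system directories.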

> On UNIX, the equivalent to Windows' PATH is typically LD_LIBRARY_PATH,
> although I think there are significant variations in how that works across
> platforms.  Most beginning unix users have no idea how to modify their
> LD_LIBRARY_PATH, as they typically don't understand the configuration
> mechanisms on Unix (system vs. user-specific, login vs. shell-specific,
> different shell configuration languages, etc.).

I agree.

> 
> I know it's not what you had in mind, but have you tried doing something
> like:
> 
>   import sys, os, string
>   sys.path.extend(string.split(os.environ['PATH'], ';'))

Adding PATH (or anything else) to PYTHONPATH is making it worse.  Have
you tried "import sys; print sys.path" on Windows?  It is junk.

JimA


From jim at interet.com  Wed Dec  1 21:44:00 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Wed, 01 Dec 1999 15:44:00 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <38457B42.85552AC@interet.com> <1268042932-41354568@hypernet.com>
Message-ID: <38458890.BCB36FE2@interet.com>

Gordon McMillan wrote:

> Make that:
>  [ os.path.dirname(sys.executable),
>    os.getcwd(),
>    win32api.GetSystemDirectory(),
>    os.path.join(win32api.GetSystemDirectory(), '../SYSTEM'),
>    win32api.GetWindowsDirectory()
>  ] + string.split(os.environ['PATH'], ';')

Very nice!  "../SYSTEM" needed on NT I guess.

JimA


From fredrik at pythonware.com  Wed Dec  1 21:56:16 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 1 Dec 1999 21:56:16 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org> <384586B4.48905B32@interet.com>
Message-ID: <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com>

James C. Ahlstrom <jim at interet.com> wrote:
> Adding PATH (or anything else) to PYTHONPATH is making it worse.  Have
> you tried "import sys; print sys.path" on Windows?  It is junk.

not on my machine.

it would help if you stopped assuming that everyone
has the same problems as you have.  we've
distributed several python apps on windows, and
frankly, I don't understand what you're talking
about.

</F>


From jim at interet.com  Wed Dec  1 22:26:37 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Wed, 01 Dec 1999 16:26:37 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org> <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com>
Message-ID: <3845928D.C0462322@interet.com>

Fredrik Lundh wrote:

> > you tried "import sys; print sys.path" on Windows?  It is junk.
> 
> not on my machine.

On my Windows machine I get:

['', '.', 'N:/prd/winlease/vest', '.\\DLLs', '.\\lib',
  '.\\lib\\plat-win', '.\\lib\\lib-tk', 'f:\\bin']

PYTHONPATH is N:/prd/winlease/vest.
os.path.dirname(sys.executable) is F:/bin.
The others are junk.  What do you get?  Did
you change sys.path from the default?

> it would help if you stopped assuming that everyone
> has the same problems as you have.  we've
> distributed several python apps on windows, and
> frankly, I don't understand what you're talking
> about.

We distribute our app by freezing all *.py files
into a DLL, and we don't set PYTHONPATH on the
target machine.  The files are located with the
executable file and are found there.  This works
fine and we don't have a problem with it.

It would help me a lot if you could describe how you
distribute your app.  Do you set PYTHONPATH on the
target machine?

JimA


From da at ski.org  Wed Dec  1 22:41:31 1999
From: da at ski.org (David Ascher)
Date: Wed, 1 Dec 1999 13:41:31 -0800 (Pacific Standard Time)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <384586B4.48905B32@interet.com>
Message-ID: <Pine.WNT.4.04.9912011251250.254-100000@rigoletto.ski.org>

On Wed, 1 Dec 1999, James C. Ahlstrom wrote:

> > In my experience, the DLL search path (PATH for short) is problematic
> > because it requires either using the System control panel or modifying
> > autoexec.bat, both of which can have massive systemic effects completely
> > unrelated to Python if a mistake is made during the modification.
> 
> I agree that altering PATH is problematic.  So is altering PYTHONPATH
> and for exactly the same reason.  That is why I think PYTHONPATH is
> a bad idea.

I see.  Thanks for the explanation. I didn't know the complete story of
the "Windows DLL search path".  BTW, I think a huge difference b/w
PYTHONPATH and PATH is the system-wide nature of PATH, vs. the
Python-restricted nature of PYTHONPATH.

--david


From mhammond at skippinet.com.au  Wed Dec  1 23:29:38 1999
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Thu, 2 Dec 1999 09:29:38 +1100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <Pine.WNT.4.04.9912011251250.254-100000@rigoletto.ski.org>
Message-ID: <009c01bf3c4b$8f119090$0501a8c0@bobcat>

> I see.  Thanks for the explanation. I didn't know the complete story of
> the "Windows DLL search path".  BTW, I think a huge difference b/w
> PYTHONPATH and PATH is the system-wide nature of PATH, vs. the
> Python-restricted nature of PYTHONPATH.

And more to the point - and the critical distinction - is that
PYTHONPATH is actually specific to the Python _app_, not just Python
on the machine.

Sure - the standard Python installation puts a "default" PYTHONPATH
suitable for general purpose development - but any distributed
application _can_ define their own PYTHONPATH that is independent of
any other Python systems or applications.  People have been doing this
for years, including MS :-)

Sorry Jim, but count this as another vote against it - which isn't to
argue that the current system is perfect, simply (IMO) better than the
Windows path and DLL search order.

Mark.


From guido at CNRI.Reston.VA.US  Thu Dec  2 00:00:21 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 18:00:21 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Your message of "Wed, 01 Dec 1999 16:26:37 EST."
             <3845928D.C0462322@interet.com> 
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org> <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com>  
            <3845928D.C0462322@interet.com> 
Message-ID: <199912012300.SAA10861@eric.cnri.reston.va.us>

> Fredrik Lundh wrote:
> 
> > > you tried "import sys; print sys.path" on Windows?  It is junk.
> > 
> > not on my machine.
> 
> On my Windows machine I get:
> 
> ['', '.', 'N:/prd/winlease/vest', '.\\DLLs', '.\\lib',
>   '.\\lib\\plat-win', '.\\lib\\lib-tk', 'f:\\bin']
> 
> PYTHONPATH is N:/prd/winlease/vest.
> os.path.dirname(sys.executable) is F:/bin.
> The others are junk.  What do you get?  Did
> you change sys.path from the default?

You must not have used the standard Python installer; if you had used
it you wouldn't have had this problem (and perhaps we wouldn't have
had this discussion).

The problem is that you apparently have installed python.exe in
f:\bin.  "Modern" Python versions execute some code at startup that
comes up with a suitable value for sys.path; the Windows version of
this code is in PC/getpathp.c -- I recommend that you study it.  This
code tries to find the Python install directory by looking for a
"landmark" file relative to the executable path, and then adds a bunch
of directory entries to the path relative to the install directory.
If it fails, it defaults to "." for the install directory.  The
entries '.\\DLLs', '.\\lib', '.\\lib\\plat-win', '.\\lib\\lib-tk' are
all a result of this failing.
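
A rough sketch of the landmark search described above, in present-day
Python.  This is not a transcription of PC/getpathp.c: the landmark
name lib/os.py, the walk up through parent directories, and the
function names find_install_dir and default_sys_path are assumptions
made for illustration.

```python
import os

# Assumed landmark file; the real value lives in PC/getpathp.c.
LANDMARK = os.path.join("lib", "os.py")

# Sub-directories appended relative to the install directory.
SUBDIRS = ("DLLs", "lib", os.path.join("lib", "plat-win"),
           os.path.join("lib", "lib-tk"))


def find_install_dir(executable, landmark=LANDMARK):
    """Look for `landmark` starting at the executable's directory.

    Falls back to "." when no ancestor directory contains it.
    """
    directory = os.path.dirname(os.path.abspath(executable))
    while True:
        if os.path.isfile(os.path.join(directory, landmark)):
            return directory
        parent = os.path.dirname(directory)
        if parent == directory:  # reached the filesystem root
            return "."
        directory = parent


def default_sys_path(install_dir):
    """Build the path entries relative to the install directory."""
    return [os.path.join(install_dir, sub) for sub in SUBDIRS]
```

When the search fails, default_sys_path(".") yields the relative
entries (.\DLLs, .\lib and so on, with Windows separators) that show
up in the sys.path quoted above.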

As long as this works, there is no need for the user (or anyone) to
ever set the PYTHONPATH variable -- that variable is only needed to
add directories in front of sys.path for stuff that getpathp.c doesn't
know about (e.g. PIL, Numeric, etc.).  With packagized versions of
those modules, even that won't be necessary, because the packages will
be dropped in the Python install directory (typically C:\Program
Files\Python).

I believe that most of your desire to get rid of PYTHONPATH comes from
your insistence on bypassing the default installer.  There's probably a
way to install your app in such a way that the getpathp.c algorithm
actually succeeds?  There's also a separate env variable, PYTHONHOME,
which overrides the Python install directory; if getpathp.c sees that
it is set, it will bypass the search relative to the executable's
path.

I take blame for not documenting all this well enough.  However I wish
you stopped criticizing the design -- I think the design is quite
solid.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido at CNRI.Reston.VA.US  Thu Dec  2 00:09:43 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 18:09:43 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Your message of "Wed, 01 Dec 1999 14:47:14 EST."
             <38457B42.85552AC@interet.com> 
References: <Pine.LNX.4.10.9911231549120.10639-100000@nebula.lyra.org> <383BF9AD.E183FB98@interet.com> <199912011855.NAA10662@eric.cnri.reston.va.us>  
            <38457B42.85552AC@interet.com> 
Message-ID: <199912012309.SAA10873@eric.cnri.reston.va.us>

> > This seems logical -- Python extensions must live in directories that
> > Python searches (Python must do its own search because the search
> > order is significant).
> 
> The PYTHONPATH search path is what I am trying to get away
> from.  If I eliminate PYTHONPATH I still can not use the
> Windows DLL search path (which is superior) because DLLs
> are searched on PYTHONPATH too; thus my post.  I don't believe
> it is important for Python module.dll to be located on PYTHONPATH.

But I do.

First of all, I'm not sure whether you're talking here about sys.path
or PYTHONPATH.  As I explained in a previous post, you should normally
not have to set PYTHONPATH at all.  Let's assume you really meant
sys.path.

Let's assume sys.path is [A, B].  Let's assume there's a foo.py and a
foo.dll.  If foo.py lives in A and foo.dll lives in B, then import foo
should load foo.py.  If it's the other way around, it should load
foo.dll.  If we were to use the default DLL search path, there's no
way that we can get this behavior: either you have to look for a DLL
first, which means there's no way for foo.py to override foo.dll, or
you have to look for a DLL last, and then there's no way for a foo.dll
to override foo.py.  It is desirable that both overrides are possible:
we want to be able to have foo.dll override foo.py, because perhaps
foo.py should only be used when for some reason foo.dll can't be
loaded (say foo.py does the same thing only slower); but we also want
to be able to have foo.py override foo.dll (by simply placing it in a
directory that's earlier on the path) e.g. in a situation where the
dll version does something undesirable and we want to create a safe
substitute.  (Deleting files is not always an option.)
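
To make the [A, B] argument concrete, here is a small sketch in
present-day Python.  The function name find_module_file and the
suffix order tried inside a single directory are assumptions for
illustration; the only point is that the directory order in the path
decides which of foo.py and foo.dll wins.

```python
import os

# Assumed order in which suffixes are tried within one directory.
SUFFIXES = (".pyd", ".dll", ".py")


def find_module_file(name, path):
    """Return the first file for `name` found along `path`, or None.

    Directories are visited in order, so an earlier directory always
    overrides a later one regardless of the file type found there.
    """
    for directory in path:
        for suffix in SUFFIXES:
            candidate = os.path.join(directory, name + suffix)
            if os.path.isfile(candidate):
                return candidate
    return None
```

With foo.py in A and foo.dll in B, a path of [A, B] yields A's foo.py
and a path of [B, A] yields B's foo.dll.  A fixed "DLLs first" or
"DLLs last" rule cannot give both results.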

> The problem is maintaining PYTHONPATH plus having DLL's on a
> non-standard search path.

I've commented already that PYTHONPATH maintenance is probably a red
herring due to your non-standard install.  I'm not sure what the
problem is with having a DLL on a non-std path?

> Yes, PythonDev[:] and professional
> SysAdmins can do it.  But it is not as simple as it could be.
> Someone has to write the install scripts.

The distutil-sig (a.k.a. Greg Ward :-) is taking care of this as we
speak.

> And what if something
> doesn't work?  Think of Python being used as a teaching language
> for the 8th grade.  Think of the 8th grade teacher trying to get
> all this right.  The only thing that works is simplicity.

We will provide an installer that Just Works [tm].

> > But at what point should this LoadLibrary() call be called?  The
> > import statement contains no clue that a DLL is requested -- the
> > sys.path search reveals that.
> 
> Just after built-in and frozen modules.

See my long comment above.

> > I claim that there is nothing wrong with the current strategy.
> 
> Thank you for thoughtfully considering and commenting at length
> on this issue.  Lets ignore it for the moment.  The other
> problems with PYTHONPATH are more pressing.  But if those
> issues are solved, this one will stick out.

And those other issues should be resolved in a different way than what
you have been proposing.  See other post.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido at CNRI.Reston.VA.US  Thu Dec  2 00:11:28 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 18:11:28 -0500
Subject: [Python-Dev] Re: [C++-SIG] Python calling C++ issues
In-Reply-To: Your message of "Wed, 01 Dec 1999 11:34:56 PST."
             <Pine.WNT.4.04.9912011132130.222-100000@rigoletto.ski.org> 
References: <Pine.WNT.4.04.9912011132130.222-100000@rigoletto.ski.org> 
Message-ID: <199912012311.SAA10888@eric.cnri.reston.va.us>

> > The first and most basic issue, is compiling Python so it initializes
> > C++ global objects correctly.  There is a patch on the sig's www site
> > to help with that.
> 
> Any opinions from this esteemed body re: integrating said patch in the
> main tree?

I presume you meant me :-)

I'll give it a try tonight.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jeremy at cnri.reston.va.us  Thu Dec  2 00:24:06 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Wed, 1 Dec 1999 18:24:06 -0500 (EST)
Subject: [Python-Dev] copies of python-dev messages between 11/24 and 12/01
Message-ID: <14405.44566.832799.96438@goon.cnri.reston.va.us>

It looks like there has been some mail glitch that resulted in no
digests being sent between 11/26 and 12/01 and no messages being
archived between 11/24 and 12/01.  Does anyone keep a personal archive
that has those messages?  I'd like to read them.

Jeremy


From guido at CNRI.Reston.VA.US  Thu Dec  2 00:28:14 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 18:28:14 -0500
Subject: [Python-Dev] copies of python-dev messages between 11/24 and 12/01
In-Reply-To: Your message of "Wed, 01 Dec 1999 18:24:06 EST."
             <14405.44566.832799.96438@goon.cnri.reston.va.us> 
References: <14405.44566.832799.96438@goon.cnri.reston.va.us> 
Message-ID: <199912012328.SAA12879@eric.cnri.reston.va.us>

> It looks like there has been some mail glitch that resulted in no
> digests being sent between 11/26 and 12/01 and no messages being
> archived between 11/24 and 12/01.  Does anyone keep a personal archive
> that has those messages?  I'd like to read them.

I do :-)

I'll provide Jeremy with an archive.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bwarsaw at cnri.reston.va.us  Thu Dec  2 05:24:03 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Wed, 1 Dec 1999 23:24:03 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS log messages with diffs
References: <199911161700.MAA02716@eric.cnri.reston.va.us>
	<14389.31511.706588.20840@anthem.cnri.reston.va.us>
Message-ID: <14405.62563.345566.500106@anthem.cnri.reston.va.us>

Okay folks, I think I've got the diff thing working now.  The trick
(for you CVS heads) was that you can't do a `cvs diff' while you're
executing a loginfo script.  Lock contention (repeat after me: "I Love
CVS!").  Anyway, let's see how you all like it.

Note that based on a suggestion by Greg Stein, seconded by GvR, I do
not send out the entire diff of every file (which could potentially be
huge).  I send out 20 lines from the head of the diff and 20 lines
from the tail, and suppress everything in between.  Those numbers can
be easily tweaked, and I'm not sure what the ideal is.  Let's see what
the emails look like when stuff starts getting checked in.
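
For the curious, the head/tail truncation can be sketched as follows.
This is an illustration in present-day Python, not the actual loginfo
script; the function name truncate_diff and the wording of the marker
line are made up for the example.

```python
def truncate_diff(lines, head=20, tail=20):
    """Keep the first `head` and last `tail` lines of a diff.

    Diffs that already fit within head + tail lines are returned
    unchanged; otherwise a single marker line replaces the middle.
    """
    lines = list(lines)
    if len(lines) <= head + tail:
        return lines
    omitted = len(lines) - head - tail
    marker = "[...%d lines suppressed...]" % omitted
    # Slice from an explicit index so that tail=0 keeps no tail lines.
    return lines[:head] + [marker] + lines[len(lines) - tail:]
```

A 100-line diff comes out as 41 lines: 20 from the head, the marker,
and 20 from the tail.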

Enjoy,
-Barry


From jack at oratrix.nl  Thu Dec  2 12:00:45 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Thu, 02 Dec 1999 12:00:45 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order 
In-Reply-To: Message by Guido van Rossum <guido@CNRI.Reston.VA.US> ,
	     Wed, 01 Dec 1999 18:09:43 -0500 , <199912012309.SAA10873@eric.cnri.reston.va.us> 
Message-ID: <19991202110045.96F33370CF2@snelboot.oratrix.nl>

On the Mac I've introduced "magic cookies" into sys.path, which allow you to 
do interesting searches (like searching for a DLL or PYC-resource in the 
application itself) at known places in the import process.

There isn't a cookie for "search along the standard MacOS dll search path" 
(which is somewhat similar to the Windows dll search path) because I haven't 
seen a reason for it, but there's nothing to stop it. And if you'd insert that 
cookie it would be perfectly clear (at least, it should be) that only dll 
modules will be found in that step, not .py modules.

Actually I'm so happy with the magic cookie scheme that I've advocated at 
various times in the past that something similar also be used for determining 
where builtin modules and frozen modules appear in sys.path...
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From guido at CNRI.Reston.VA.US  Thu Dec  2 12:59:34 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 02 Dec 1999 06:59:34 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Your message of "Thu, 02 Dec 1999 12:00:45 +0100."
             <19991202110045.96F33370CF2@snelboot.oratrix.nl> 
References: <19991202110045.96F33370CF2@snelboot.oratrix.nl> 
Message-ID: <199912021159.GAA13732@eric.cnri.reston.va.us>

> On the Mac I've introduced "magic cookies" into sys.path, which
> allow you to do interesting searches (like searching for a DLL or
> PYC-resource in the application itself) at known places in the
> import process.

> There isn't a cookie for "search along the standard MacOS dll search
> path" (which is somewhat similar to the Windows dll search path)
> because I haven't seen a reason for it, but there's nothing to stop
> it. And if you'd insert that cookie it would be perfectly clear (at
> least, it should be) that only dll modules will be found in that
> step, not .py modules.

> Actually I'm so happy with the magic cookie scheme that I've
> advocated at various times in the past that something similar also
> be used for determining where builtin modules and frozen modules
> appear in sys.path...

I see the magic cookies as a poor man's (but more compatible!) version
of a chain of importers as advocated by Greg Stein and other imputil
fans.  I like the idea, except that I think that the chain should be
manipulatable more easily than the current imputil implementation.
(I'll have more comments on Greg's comments later, when I've actually
read them through.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gstein at lyra.org  Thu Dec  2 13:09:40 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 2 Dec 1999 04:09:40 -0800 (PST)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <199912021159.GAA13732@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912020404500.18236-100000@nebula.lyra.org>

On Thu, 2 Dec 1999, Guido van Rossum wrote:
>...
> I see the magic cookies as a poor man's (but more compatible!) version
> of a chain of importers as advocated by Greg Stein and other imputil
> fans.  I like the idea, except that I think that the chain should be
> manipulatable more easily than the current imputil implementation.
> (I'll have more comments on Greg's comments later, when I've actually
> read them through.)

Anything in sys.path that is not a string pointing to a directory is not
very compatible. My current proposal keeps the existing semantics for
sys.path (the proposal adds functionality thru other mechanisms, rather
than changing/interfering with existing ones).

I look forward to your comments! I'll definitely provide new solutions
where you find problems :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From fredrik at pythonware.com  Thu Dec  2 13:53:03 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 2 Dec 1999 13:53:03 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <19991202110045.96F33370CF2@snelboot.oratrix.nl>  <199912021159.GAA13732@eric.cnri.reston.va.us>
Message-ID: <005e01bf3cc4$2e338c50$f29b12c2@secret.pythonware.com>

Guido van Rossum <guido at CNRI.Reston.VA.US> wrote:
> > Actually I'm so happy with the magic cookie scheme that I've
> > advocated at various times in the past that something similar also
> > be used for determining where builtin modules and frozen modules
> > appear in sys.path...
> 
> I see the magic cookies as a poor man's (but more compatible!) version
> of a chain of importers as advocated by Greg Stein and other imputil
> fans.  I like the idea, except that I think that the chain should be
> manipulatable more easily than the current imputil implementation.

I know this has been asked before, but cannot recall
any of the arguments against it: how about replacing
Jack's magic cookies with importer objects?

(in other words, if a path item is a string, import as
usual.  otherwise, ask the importer for a code object
or maybe better, a module object).
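
a rough sketch of that idea in present-day Python.  everything here
is hypothetical: the get_module method, the DictImporter class and
the import_from_dir callback (standing in for the usual directory
search) are invented for the example and are not part of imputil or
any existing interface.

```python
import types


class DictImporter:
    """Hypothetical importer serving modules from a dict of sources."""

    def __init__(self, sources):
        self.sources = sources

    def get_module(self, name):
        source = self.sources.get(name)
        if source is None:
            return None
        module = types.ModuleType(name)
        exec(source, module.__dict__)
        return module


def import_from_path(name, path, import_from_dir):
    """Walk `path`; strings are directories, anything else an importer.

    `import_from_dir(name, directory)` stands in for the usual
    directory search and returns a module or None.
    """
    for item in path:
        if isinstance(item, str):
            module = import_from_dir(name, item)
        else:
            module = item.get_module(name)
        if module is not None:
            return module
    raise ImportError("no module named %s" % name)
```

string entries keep their current meaning, so an existing sys.path
works unchanged; only non-string entries get the new behaviour.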

</F>


From jack at oratrix.nl  Thu Dec  2 14:23:31 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Thu, 02 Dec 1999 14:23:31 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order 
In-Reply-To: Message by "Fredrik Lundh" <fredrik@pythonware.com> ,
	     Thu, 2 Dec 1999 13:53:03 +0100 , <005e01bf3cc4$2e338c50$f29b12c2@secret.pythonware.com> 
Message-ID: <19991202132331.E3F8D370CF2@snelboot.oratrix.nl>

> > I see the magic cookies as a poor man's (but more compatible!) version
> > of a chain of importers as advocated by Greg Stein and other imputil
> > fans. [...]
> 
> I know this has been asked before, but cannot recall
> any of the arguments against it: how about replacing
> Jack's magic cookies with importer objects?

For the record: I definitely agree with both comments here. The only thing 
that would need solving (but maybe it already is? Greg?) is the external 
representation of an importer, as I'd definitely want to be able to name them 
in PYTHONPATH (or the mac equivalent).
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From jim at interet.com  Thu Dec  2 15:19:31 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 02 Dec 1999 09:19:31 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <009c01bf3c4b$8f119090$0501a8c0@bobcat>
Message-ID: <38467FF3.D938EE4@interet.com>

Mark Hammond wrote:

> Sure - the standard Python installation puts a "default" PYTHONPATH
> suitable for general purpose development - but any distributed
> application _can_ define their own PYTHONPATH that is independent of
> any other Python systems or applications.  People have been doing this
> for years, including MS :-)

How is this done?
 
> Sorry Jim, but count this as another vote against it - which isn't to
> argue that the current system is perfect, simply (IMO) better than the
> Windows path and DLL search order.

Sigh.....

JimA


From jim at interet.com  Thu Dec  2 16:49:10 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 02 Dec 1999 10:49:10 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org> <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com>  
	            <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us>
Message-ID: <384694F6.E5D74221@interet.com>

Guido van Rossum wrote:

> You must not have used the standard Python installer; if you had used
> it you wouldn't have had this problem (and perhaps we wouldn't have
> had this discussion).

Correct, I did not use the standard Python installer.  I compiled
Python from the source distribution.  There are good reasons for this
in my case.

First, my real issue is how to DISTRIBUTE Python programs, not to get
Python working on my own machine.  We have 12 machines on a network.
It is not acceptable to run a Python installation script on every one
of them just to run a simple Python program.  OK, I guess I could do 12,
but what about a larger company?  And we ship to hundreds of customers.
I can distribute simple C or C++ programs without a hassle, why not
Python?  It is not acceptable to ask our customers to run a separate
Python installer.  We have our own Wise installer to install our
software.  Every commercial vendor has Wise, Install Shield or other
installer in place.  No commercial vendor is going to abandon Wise
et al. and move to The Official Python Installer because it will not
have the features of Wise (such as binary patches across the network),
and because what it does won't be documented, and because it is Just
Different.

Second, I can not run ANY installer on my development machine, Python or
otherwise.  This is a general Windows problem not specific to Python.
Right now our help system is broken on every office machine except the
one where the help system installer was run (where we develop help).
If I run a Python installer, it may Just Work here.  So testing is
fine, but when I distribute the program to customers where the install
program has not been run it fails.  The installer made registry entries,
installed files, etc.  And what did it do??  No one knows.  And how do I
install at a customer site if I don't have documentation on what the
Help installer or Python installer did??  No one knows.  Who fixes it
if something goes wrong??  Hours on the phone to Help System customer
support.  Does it work on Windows 2000??  No one knows.

> f:\bin.  "Modern" Python versions execute some code at startup that
> comes up with a suitable value for sys.path; the Windows version of
> this code is in PC/getpathp.c -- I recommend that you study it.  This

> [ Highly useful discussion of startup...]

Thank you, I will study this.

> know about (e.g. PIL, Numeric, etc.).  With packagized versions of
> those modules, even that won't be necessary, because the packages will
> be dropped in the Python install directory (typically C:\Program
> Files\Python).

Yes, this is essential.  Packages must be easily installed.  I was
hoping for single-file package archive files.

> I believe that most of your desire to get rid of PYTHONPATH comes from
> your insistence to bypass the default installer.

Correct, I refuse to execute the default installer.  And I am
a patient person who loves Python, so I will read getpathp.c
to see what is happening.  But other commercial developers, students,
teachers, SysAdmins etc. are not so patient.  In the interest of
promoting Python, there should be documentation on the official
way to easily install Python programs.

> There's probably a
> way to install your app in such a way that the getpathp.c algorithm
> actually succeeds?  There's also a separate env variable, PYTHONHOME,

Perhaps, and if there is it should be prominently documented in the
How to Distribute Your App section of the manual.  I
am worried about supporting versioning, but I will think about it.

> I take blame for not documenting all this well enough.  However I wish
> you stopped criticizing the design -- I think the design is quite
> solid.

Thank you for the explanation.  I will study the design again.  I
always wondered what PYTHONHOME did.

JimA


From guido at CNRI.Reston.VA.US  Thu Dec  2 17:03:09 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 02 Dec 1999 11:03:09 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Your message of "Thu, 02 Dec 1999 10:49:10 EST."
             <384694F6.E5D74221@interet.com> 
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org> <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us>  
            <384694F6.E5D74221@interet.com> 
Message-ID: <199912021603.LAA14455@eric.cnri.reston.va.us>

> Perhaps, and if there is it should be prominently documented in the
> How to Distribute Your App section of the manual.  I
> am worried about supporting versioning, but I will think about it.

Join the distutil-SIG, they are discussing just this.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal at lemburg.com  Thu Dec  2 16:48:40 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 02 Dec 1999 16:48:40 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <19991202110045.96F33370CF2@snelboot.oratrix.nl>  <199912021159.GAA13732@eric.cnri.reston.va.us> <005e01bf3cc4$2e338c50$f29b12c2@secret.pythonware.com>
Message-ID: <384694D8.DCA3D75E@lemburg.com>

Fredrik Lundh wrote:
> 
> Guido van Rossum <guido at CNRI.Reston.VA.US> wrote:
> > > Actually I'm so happy with the magic cookie scheme that I've
> > > advocated at various times in the past that something similar also
> > > be used for determining where builtin modules and frozen modules
> > > appear in sys.path...
> >
> > I see the magic cookies as a poor man's (but more compatible!) version
> > of a chain of importers as advocated by Greg Stein and other imputil
> > fans.  I like the idea, except that I think that the chain should be
> > manipulatable more easily than the current imputil implementation.
> 
> I know this has been asked before, but cannot recall
> any of the arguments against it: how about replacing
> Jack's magic cookies with importer objects?
> 
> (in other words, if a path item is a string, import as
> usual.  otherwise, ask the importer for a code object
> or maybe better, a module object).

Plus, for backward compatibility, make sure that str(importerobj)
returns something which resembles a non-existing directory.

Note that the builtin importer skips non-string entries
in sys.path, so the above will only be needed for existing
import hooks.
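
A minimal sketch of the str(importerobj) idea above, in modern Python.
The class name DictImporter, its get_code() method, and the
legacy_scan() helper are hypothetical names made up for illustration;
they are not part of imputil or of any existing API.  The point is only
that code which stringifies each path entry and probes the filesystem
finds nothing for an importer object, and so moves on quietly.

```python
import os


class DictImporter:
    """Hypothetical importer object that could sit directly on sys.path."""

    def __init__(self, name, sources):
        self.name = name
        self.sources = sources  # maps module name -> source text

    def __str__(self):
        # Looks like a directory path, but one that should never exist,
        # so hooks that treat every entry as a directory find nothing.
        return os.path.join(os.sep, "nonexistent", "importer", self.name)

    def get_code(self, modname):
        source = self.sources.get(modname)
        if source is None:
            return None
        return compile(source, "<%s:%s>" % (self.name, modname), "exec")


def legacy_scan(path_entries, filename):
    """Stand-in for an existing hook that assumes entries are directories."""
    hits = []
    for entry in path_entries:
        candidate = os.path.join(str(entry), filename)
        if os.path.exists(candidate):
            hits.append(candidate)
    return hits
```

An old-style hook calling legacy_scan([importer], "spam.py") gets an
empty list back, while importer-aware code can call get_code("spam").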

Still, I would like to rephrase my 0.02EUR which I already
posted twice... why not start to think about what these
importers would do first ? If there are only a handful of
wishes we could just add them to the builtin machinery and
be done with it...

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    29 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido at CNRI.Reston.VA.US  Thu Dec  2 17:28:28 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 02 Dec 1999 11:28:28 -0500
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: Your message of "Fri, 19 Nov 1999 22:43:32 EST."
             <1269053086-27079185@hypernet.com> 
References: <1269053086-27079185@hypernet.com> 
Message-ID: <199912021628.LAA14506@eric.cnri.reston.va.us>

> No success whatsoever in either direction across Samba. In 
> fact the mtime of my Linux home directory as seen from NT is 
> Jan 1, 1980.

That's only the case for an NT mount point (something of the form
\\host\name; I notice that os.stat() only believes it exists if you
append a backslash: \\host\name\).  For interior directories, at least
with the Samba version that I'm using, os.stat() seems to give correct
results.

I think that this whole issue (that doing a stat on a directory to
find out whether files in it were modified doesn't give usable
results) is widely blown out of proportion.

The only useful bit of info is that mtimes may have an up to 2 second
granularity, and that anything as recent as 2 seconds should be
considered as newer than the cache even if the cache is also less than
2 seconds.
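
The granularity rule above could be sketched roughly as follows.  The
function name cache_is_fresh() and the idea of passing in timestamps
explicitly are assumptions made for illustration; a real caching
importer would take dir_mtime from os.stat() on the directory.

```python
import time

MTIME_GRANULARITY = 2.0  # seconds; the worst case mentioned above


def cache_is_fresh(dir_mtime, cache_time, now=None):
    """Return True only if a cached directory listing can be trusted.

    dir_mtime  -- modification time reported for the directory
    cache_time -- time at which the cache was built
    """
    if now is None:
        now = time.time()
    if now - dir_mtime <= MTIME_GRANULARITY:
        # Modified too recently to tell: treat the directory as newer
        # than the cache, even if the cache is just as recent.
        return False
    # Otherwise the cache is good if it was built after the last change.
    return dir_mtime < cache_time
```

Treating an equal timestamp as stale is a deliberately conservative
choice here; it costs one extra directory scan at worst.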

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim at interet.com  Thu Dec  2 17:28:50 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 02 Dec 1999 11:28:50 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9911231549120.10639-100000@nebula.lyra.org> <383BF9AD.E183FB98@interet.com> <199912011855.NAA10662@eric.cnri.reston.va.us>  
	            <38457B42.85552AC@interet.com> <199912012309.SAA10873@eric.cnri.reston.va.us>
Message-ID: <38469E42.AF0A0D55@interet.com>

Guido van Rossum wrote:

> Let's assume sys.path is [A, B].  Let's assume there's a foo.py and a
> foo.dll.  If foo.py lives in A and foo.dll lives in B, then import foo
> ...

Thank you for the detailed discussion showing that sys.path is
needed so a choice can be made whether to load foo.dll or
foo.py.  As you correctly point out, a separate search path
defeats this behavior.

But I don't think the usefulness of the feature compensates for
its resultant complexity.  Specifically, it will be hard to
create this behavior in archive files.

As I envision archive files (which of course is subject to change)
they contain *.pyc files and not DLL's.  The DLL's must be in a
./DLL directory since the OS can not load them from strings.  So
if every *.pyc is in an archive file, your only choice is whether
to load all DLL's first or last.  That is, archive.pyl is either
before or after ./DLL.

If a package (probably with lots of subdirectories) author depends on
having a search path within a package which discriminates between
pyc and DLL files with equal names, then that search path plus the
existence of the DLL's must be recorded in the archive.

This is much more complicated than just an archive with all *.pyc
files entered in a dotted name space:
  foo
  foo.sub1
  foo.sub2
  foo.sub2.pkx

I would question whether equally named foo.dll and foo.py is worth it.
The alternative (which is IMHO more common) is to code the choice in
Python in the module that cares about it.
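
As a rough sketch of "coding the choice in Python" (written with the
modern importlib module, which did not exist at the time), the module
that cares can try the compiled variant under a distinct name and fall
back to the pure Python one.  The helper name load_first_available()
and the module names in the usage note are hypothetical.

```python
import importlib


def load_first_available(candidates):
    """Import and return the first module in `candidates` that exists."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # this variant is not installed; try the next one
    raise ImportError("none of %r could be imported" % (candidates,))
```

A package could then write something like
impl = load_first_available(["_foo_accel", "_foo_pure"]), with both
names being placeholders, and no search path ordering is involved.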

> > And what if something
> > doesn't work?  Think of Python being used as a teaching language
> > for the 8th grade.  Think of the 8th grade teacher trying to get
> > all this right.  The only thing that works is simplicity.
> 
> We will provide an installer that Just Works [tm].

OK for this case.  Not enough for Python program distribution. 

JimA


From jim at interet.com  Thu Dec  2 17:30:49 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 02 Dec 1999 11:30:49 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org> <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us>  
	            <384694F6.E5D74221@interet.com> <199912021603.LAA14455@eric.cnri.reston.va.us>
Message-ID: <38469EB9.5EDB9617@interet.com>

Guido van Rossum wrote:
> 
> > Perhaps, and if there is it should be prominently documented in the
> > How to Distribute Your App section of the manual.  I
> > am worried about supporting versioning, but I will think about it.
> 
> Join the distutil-SIG, they are discussing just this.

I already belong to the distutil-SIG and have seen no such
discussion.

Jim


From guido at CNRI.Reston.VA.US  Thu Dec  2 18:17:52 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 02 Dec 1999 12:17:52 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Your message of "Thu, 02 Dec 1999 11:30:49 EST."
             <38469EB9.5EDB9617@interet.com> 
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org> <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us> <384694F6.E5D74221@interet.com> <199912021603.LAA14455@eric.cnri.reston.va.us>  
            <38469EB9.5EDB9617@interet.com> 
Message-ID: <199912021717.MAA14682@eric.cnri.reston.va.us>

[Jim]
> > > Perhaps, and if there is it should be prominently documented in the
> > > How to Distribute Your App section of the manual.  I
> > > am worried about supporting versioning, but I will think about it.

[me]
> > Join the distutil-SIG, they are discussing just this.

[Jim again]
> I already belong to the distutil-SIG and have seen no such
> discussion.

Sorry, you're right (except for a brief exchange between you and Paul
Dubois :-).  But I think they should, it falls under their charter.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake at acm.org  Thu Dec  2 18:30:02 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 2 Dec 1999 12:30:02 -0500 (EST)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <199912021717.MAA14682@eric.cnri.reston.va.us>
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org>
	<384586B4.48905B32@interet.com>
	<011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com>
	<3845928D.C0462322@interet.com>
	<199912012300.SAA10861@eric.cnri.reston.va.us>
	<384694F6.E5D74221@interet.com>
	<199912021603.LAA14455@eric.cnri.reston.va.us>
	<38469EB9.5EDB9617@interet.com>
	<199912021717.MAA14682@eric.cnri.reston.va.us>
Message-ID: <14406.44186.574647.651111@weyr.cnri.reston.va.us>

Guido van Rossum writes:
 > Sorry, you're right (except for a brief exchange between you and Paul
 > Dubois :-).  But I think they should, it falls under their charter.

  This was deliberately postponed until after extension packages are
supported and in place.  I know Greg is interested in application
installation as well as package installation.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From gmcm at hypernet.com  Thu Dec  2 18:53:03 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Thu, 2 Dec 1999 12:53:03 -0500
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912021628.LAA14506@eric.cnri.reston.va.us>
References: Your message of "Fri, 19 Nov 1999 22:43:32 EST."             <1269053086-27079185@hypernet.com> 
Message-ID: <1267965342-1446902@hypernet.com>

[Gordon]
> > No success whatsoever in either direction across Samba. In fact
> > the mtime of my Linux home directory as seen from NT is Jan 1,
> > 1980.
[Guido]
> That's only the case for an NT mount point (something of the form
> \\host\name; I notice that os.stat() only believes it exists if
> you append a backslash: \\host\name\).  For interior directories,
> at least with the Samba version that I'm using, os.stat() seems
> to give correct results.

Correct (as I discovered not long after I posted). (I find that 
from NT I have to stat some file _in_ the directory to get an 
updated mtime from the stat _of_ the directory).
 
> I think that this whole issue (that doing a stat on a directory
> to find out whether files in it were modified doesn't give usable
> results) is widely blown out of proportion.

This has come up twice: re caching importers and dircache.py 
(used only by dircmp). We've arrived at the fact that it _can_ 
be made to work on Windows boxes. NFS? Andrew (anyone 
still use that)?

IOW, do we want to trust it? Do we want to document that it 
might not be trustworthy in some situations? Make it optional-
for-wizards? Kill it?
 
IOOW, what's the proper proportion ;-)?

> The only useful bit of info is that mtimes may have an up to 2
> second granularity, and that anything as recent as 2 seconds
> should be considered as newer than the cache even if the cache is
> also less than 2 seconds.


From guido at CNRI.Reston.VA.US  Thu Dec  2 21:43:46 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 02 Dec 1999 15:43:46 -0500
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: Your message of "Fri, 19 Nov 1999 05:29:50 PST."
             <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> 
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> 
Message-ID: <199912022043.PAA15108@eric.cnri.reston.va.us>

Here's the promised response to Greg's response to my wishlist.

> On Thu, 18 Nov 1999, Guido van Rossum wrote:
> > Gordon McMillan wrote:
> >...
> > > I think imputil's emulation of the builtin importer is more of a 
> > > demonstration than a serious implementation. As for speed, it 
> > > depends on the test. 
> > 
> > Agreed.  I like some of imputil's features, but I think the API
> > needs to be redesigned.
> 
> In what ways? It sounds like you've applied some thought. Do you have any
> concrete ideas yet, or "just a feeling" :-)  I'm working through some
> changes from JimA right now, and would welcome other suggestions. I think
> there may be some outstanding stuff from MAL, but I'm not sure (Marc?)

I actually think that the way the PVM (Python VM) calls the importer
ought to be changed.  Assigning to __builtin__.__import__ is a crock.
The API for __import__ is a crock.

> >...
> > So here's a challenge: redesign the import API from scratch.
> 
> I would suggest starting with imputil and altering as necessary. I'll use
> that viewpoint below.
> 
> > Let me start with some requirements.
> > 
> > Compatibility issues:
> > ---------------------
> > 
> > - the core API may be incompatible, as long as compatibility layers
> > can be provided in pure Python
> 
> Which APIs are you referring to? The "imp" module? The C functions? The
> __import__ and reload builtins?

> I'm guessing some of imp, the two builtins, and only one or two C
> functions.

All of those.

> > - support for rexec functionality
> 
> No problem. I can think of a number of ways to do this.

Agreed, I think that imputil can do this.

> > - support for freeze functionality
> 
> No problem. A function in "imp" must be exposed to Python to support this
> within the imputil framework.

Agreed.  It currently exports init_frozen() which is about the right
functionality.

> > - load .py/.pyc/.pyo files and shared libraries from files
> 
> No problem. Again, a function is needed for platform-specific loading of
> shared libraries.

Is it useful to expose the platform differences?  The current
imp.load_dynamic() should suffice.

> > - support for packages
> 
> No problem. Demo's in current imputil.
> 
> > - sys.path and sys.modules should still exist; sys.path might
> > have a slightly different meaning
> 
> I would suggest that both retain their *exact* meaning. We introduce
> sys.importers -- a list of importers to check, in sequence. The first
> importer on that list uses sys.path to look for and load modules. The
> second importer loads builtins and frozen code (i.e. modules not on
> sys.path).

This is looking like the redesign I was looking for.  (Note that
imputil's current chaining is not good since it's impossible to remove
or reorder importers, which I think is a required feature; an explicit
list would solve this.)

Actually, the order is the other way around, but by now you should
know that.  It makes sense to have separate ones for builtin and
frozen modules -- these have nothing in common.
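
To make the explicit-list idea concrete, here is a minimal sketch of a
chain that can be reordered or trimmed like any other list.  All names
in it (TableImporter, import_via_chain, the __importer__ attribute) are
hypothetical, and the importers read from in-memory tables rather than
from builtins, frozen code, or sys.path.

```python
import types


class TableImporter:
    """Hypothetical importer backed by an in-memory table of source text."""

    def __init__(self, label, table):
        self.label = label
        self.table = table

    def get_code(self, fqname):
        source = self.table.get(fqname)
        if source is None:
            return None  # "not mine": let the next importer try
        return compile(source, "<%s %s>" % (self.label, fqname), "exec")


def import_via_chain(fqname, importers, modules):
    """Walk an ordered list of importers; record hits in `modules`."""
    if fqname in modules:
        return modules[fqname]
    for importer in importers:
        code = importer.get_code(fqname)
        if code is not None:
            module = types.ModuleType(fqname)
            module.__importer__ = importer  # hypothetical, for tracing only
            modules[fqname] = module
            exec(code, module.__dict__)
            return module
    raise ImportError("no importer could find %r" % fqname)
```

Here `importers` plays the role of the proposed sys.importers and
`modules` the role of sys.modules; reordering is just
importers.reverse() or importers.insert(0, ...).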

There's another issue, which isn't directly addressed by imputil,
although with clever use of inheritance it might be doable.  I'd like
more support for this however.  Quite orthogonally to the issue of
having separate importers, I might want to recognize new extensions.
Take the example of the ILU folks.  They want to be able to drop a
file "foo.isl" in any directory on sys.path and have the ILU stubber
automatically run if you try to import foo (the client stubs) or
foo__skel (the server skeleton).

This doesn't fit in the sys.importers strategy, because they want to
be able to drop their .isl files in any directory along sys.path.
(Or, more likely, they want to have control over where in sys.path
the directory/directories with .isl files are placed.)  This requires
an ugly modification to the _fs_import() function.  (Which should have
been a method, by the way, to make overriding it in a subclass of
PathImporter easier!)

I've been thinking here along the lines of a strategy where the
standard importer (the one that walks sys.path) has a set of hooks
that define various things it could look for, e.g. .py files, .pyc
files, .so or .dll files.  This list of hooks could be changed to
support looking for .isl files.
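
A sketch of what such a hook list might look like, assuming a simple
table of (suffix, loader) pairs that the path-walking importer consults
in each directory.  SUFFIX_HOOKS, find_code(), load_py() and load_isl()
are hypothetical names, and load_isl() only stands in for a real
stubber by generating a trivial module.

```python
import os


def load_py(path):
    """Compile an ordinary source file."""
    with open(path, "r") as f:
        return compile(f.read(), path, "exec")


def load_isl(path):
    """Stand-in for running a stubber on a .isl file."""
    source = "generated_from = %r\n" % os.path.basename(path)
    return compile(source, path, "exec")


# Default hook list: only plain source files are recognized.
SUFFIX_HOOKS = [(".py", load_py)]


def find_code(modname, search_path, hooks=SUFFIX_HOOKS):
    """Walk the search path, trying each registered suffix per directory."""
    for directory in search_path:
        for suffix, loader in hooks:
            candidate = os.path.join(directory, modname + suffix)
            if os.path.isfile(candidate):
                return loader(candidate)
    return None
```

Supporting .isl files is then a matter of appending
(".isl", load_isl) to the hook list; the directory walk itself is
untouched, and the .isl files can live anywhere along the path.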

There's an old, subtle issue that could be solved through this as
well: whether or not a .pyc file without a .py file should be accepted
or not.  Long ago (in Python 0.9.8) a .pyc file alone would never be
loaded.  This was changed at the request of a small but vocal minority
of Python developers who wanted to distribute .pyc files without .py
files.  It has occasionally caused frustration because sometimes
developers move .py files around but forget to remove the .pyc files,
and then the .pyc file is silently picked up if it occurs on sys.path
earlier than where the .py was moved to.

Having a set of hooks for various extensions would make it possible to
have a default where lone .pyc files are ignored, but where one can
insert a .pyc importer in the list of hooks that does the right thing
here.  (Of course, it may be possible that this whole feature of lone
.pyc files should be replaced since the same need is easily taken care
of by zip importers.)

I also want to support (Jim A notwithstanding :-) a feature whereby
different things besides directories can live on sys.path, as long as
they are strings -- these could be added from the PYTHONPATH env
variable.  Every piece of code that I've ever seen that uses sys.path
doesn't care if a directory named in sys.path doesn't exist -- it may
try to stat various files in it, which also don't exist, and as far as
it is concerned that is just an indication that the requested module
doesn't live there.

Again, we would have to dissect imputil to support various hooks that
deal with different kinds of entities in sys.path.  The default hook
list would consist of a single item that interprets the name as a
directory name; other hooks could support zip files or URLs.  Jack's
"magic cookies" could also be supported nicely through such a
mechanism.
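
A rough sketch of such a hook list for path entries, assuming each hook
either claims a string entry and returns a reader function, or returns
None.  The names directory_hook, zip_hook, ENTRY_HOOKS and
find_source() are hypothetical; the zip handling uses the standard
zipfile module and looks only for top-level .py members.

```python
import os
import zipfile


def directory_hook(entry):
    """Claim entries that name an existing directory."""
    if not os.path.isdir(entry):
        return None

    def read_source(modname):
        path = os.path.join(entry, modname + ".py")
        if os.path.isfile(path):
            with open(path) as f:
                return f.read()
        return None

    return read_source


def zip_hook(entry):
    """Claim entries that name a zip archive."""
    if not (os.path.isfile(entry) and zipfile.is_zipfile(entry)):
        return None

    def read_source(modname):
        with zipfile.ZipFile(entry) as archive:
            try:
                return archive.read(modname + ".py").decode("utf-8")
            except KeyError:
                return None  # archive has no such member

    return read_source


ENTRY_HOOKS = [directory_hook, zip_hook]


def find_source(modname, search_path, entry_hooks=ENTRY_HOOKS):
    """Return source text for modname, or None if no entry provides it."""
    for entry in search_path:
        for hook in entry_hooks:
            reader = hook(entry)
            if reader is None:
                continue  # this hook does not understand the entry
            source = reader(modname)
            if source is not None:
                return source
            break  # entry was recognized but lacks the module
    return None
```

An entry that no hook claims, such as a directory that does not exist,
is skipped silently, which matches how existing sys.path users already
behave.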

> Users can insert/append new importers or alter sys.path as before.
> 
> sys.modules continues to record name:module mappings.

Yes.

Note that the interpretation of __file__ could be problematic.  To
what value do you set __file__ for a module loaded from a zip archive?

> > - $PYTHONPATH and $PYTHONHOME should still be supported
> 
> No problem.
> 
> > (I wouldn't mind a splitting up of importdl.c into several
> > platform-specific files, one of which is chosen by the configure
> > script; but that's a bit of a separate issue.)
> 
> Easy enough. The standard importer can select the appropriate
> platform-specific module/function to perform the load. i.e. these can move
> to Modules/ and be split into a module-per-platform.

Again: what's the advantage of exposing the platform specificity?

> > New features:
> > -------------
> > 
> > - Integrated support for Greg Ward's distribution utilities (i.e. a
> >   module prepared by the distutil tools should install painlessly)
> 
> I don't know the specific requirements/functionality that would be
> required here (does Greg? :-), but I can't imagine any problem with this.

Probably more support is required from the other end: once it's common
for modules to be imported from zip files, the distutil code needs to
support the creation and installation of such zip files.  Also, there
is a need for the install phase of distutil to communicate the
location of the zip file to the Python installation.

> > - Good support for prospective authors of "all-in-one" packaging
> >   tools like Gordon McMillan's win32 installer or /F's squish.  (But
> >   I *don't* require backwards compatibility for existing tools.)
> 
> Um. *No* problem. :-)

:-)

> > - Standard import from zip or jar files, in two ways:
> > 
> >   (1) an entry on sys.path can be a zip/jar file instead of a directory;
> >       its contents will be searched for modules or packages

Note that this is what I mention above for distutil support.

> While this could easily be done, I might argue against it. Old
> apps/modules that process sys.path might get confused.

Above I argued that this shouldn't be a problem.

> If compatibility is not an issue, then "No problem."
> 
> An alternative would be an Importer instance added to sys.importers that
> is configured for a specific archive (in other words, don't add the zip
> file to sys.path, add ZipImporter(file) to sys.importers).

This would be harder for distutil: where does Python get the initial
list of importers?

> Another alternative is an Importer that looks at a "sys.py_archives" list.
> Or an Importer that has a py_archives instance attribute.

OK, but again distutil needs to be able to add to this list when it
installs a package.  (Note that package deinstallation should also be
supported!)

(Of course I don't require this to affect Python processes that are
already running; but it should be possible to easily change the
default search path for all newly started instances of a given Python
installation.)

> >   (2) a file in a directory that's on sys.path can be a zip/jar file;
> >       its contents will be considered as a package (note that this is
> >       different from (1)!)
> 
> No problem. This will slow things down, as a stat() for *.zip and/or *.jar
> must be done, in addition to *.py, *.pyc, and *.pyo.

Fine, this is where the caching comes in handy.

> >   I don't particularly care about supporting all zip compression
> >   schemes; if Java gets away with only supporting gzip compression
> >   in jar files, so can we.
> 
> I presume we would support whatever zlib gives us, and no more.

That's it. :-)

> > - Easy ways to subclass or augment the import mechanism along
> >   different dimensions.  For example, while none of the following
> >   features should be part of the core implementation, it should be
> >   easy to add any or all:
> > 
> >   - support for a new compression scheme to the zip importer
> 
> Presuming ZipImporter is a class (derived from Importer), then this
> ability is wholly dependent upon the author of ZipImporter providing the
> hook.

Agreed.  But since we're likely going to provide this as a standard
feature, we must ensure that it provides this hook.

> The Importer class is already designed for subclassing (and its interface 
> is very narrow, which means delegation is also *very* easy; see
> imputil.FuncImporter).

But maybe it's *too* narrow; some of the hooks I suggest above seem to
require extra interfaces -- at least in some of the subclasses of the
Importer base class.

Note: I looked at the doc string for get_code() and I don't understand
what the difference is between the modname and fqname arguments.  If I
write "import foo.bar", what are modname and fqname?  Why are both
present?  Also, while you claim that the API is narrow, the multiple
return values (also the different types for the second item) make it
complicated.

> >   - support for a new archive format, e.g. tar
> 
> A cakewalk. Gordon, JimA, and myself each have archive formats. :-)
> 
> >   - a hook to import from URLs or other data sources (e.g. a
> >     "module server" imported in CORBA) (this needn't be supported
> >     through $PYTHONPATH though)
> 
> No problem at all.
> 
> >   - a hook that imports from compressed .py or .pyc/.pyo files
> 
> No problem at all.
> 
> >   - a hook to auto-generate .py files from other filename
> >     extensions (as currently implemented by ILU)
> 
> No problem at all.

See above -- I think this should be more integrated with sys.path than
you are thinking of.  The more I think about it, the more I see that
the problem is that for you, the importer that uses sys.path is a
final subclass of Importer (i.e. it is itself not further subclassed).
Several of the hooks I want seem to require additional hooks in the
PathImporter rather than new importers.

> >   - a cache for file locations in directories/archives, to improve
> >     startup time
> 
> No problem at all.
> 
> >   - a completely different source of imported modules, e.g. for an
> >     embedded system or PalmOS (which has no traditional filesystem)
> 
> No problem at all.
> 
> In each of the above cases, the Importer.get_code() method just needs to
> grab the byte codes from the XYZ data source. That data source can be
> compressed, across a network, on-the-fly generated, or whatever. Each
> importer can certainly create a cache based on its concept of "location".
> In some cases, that would be a mapping from module name to filesystem
> path, or to a URL, or to a compiled-in, frozen module.

See above for sys.path integration remark.

> > - Note that different kinds of hooks should (ideally, and within
> >   reason) properly combine, as follows: if I write a hook to recognize
> >   .spam files and automatically translate them into .py files, and you
> >   write a hook to support a new archive format, then if both hooks are
> >   installed together, it should be possible to find a .spam file in an
> >   archive and do the right thing, without any extra action.  Right?
> 
> Ack. Very, very difficult.

Actually, I take most of this back.  Importers that deal with new
extension types often have to go through a file system to transform
their data to .py files, and this is just too complicated.  However it
would be still nice if there was code sharing between the code that
looks for .py and .pyc files in a zip archive and the code that does
the same in a filesystem.  Hm, maybe even that shouldn't be necessary,
the zip file probably should contain only .pyc files...

(Unrelated remark: I should really try to release the set of modules
we've written here at CNRI to deal with zip files.  Unfortunately zip
files are hairy and so is our code.)

> The imputil scheme combines the concept of locating/loading into one step.
> There is only one "hook" in the imputil system. Its semantic is "map this
> name to a code/module object and return it; if you don't have it, then
> return None."

That's fine.  I actually don't recall where the find-then-load API
came from, I think it may be an artefact of the original
implementation strategy.  It is currently used as follows: we try to
see if there's a .pyc and then we try to see if there's a .py; if both
exist we compare the timestamps etc. to choose which one.  But that's
still a red herring.
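
For reference, the .pyc/.py choice described above can be sketched as
a single function.  This is a simplification: it compares file
modification times directly, whereas the real interpreter compares the
source mtime against a timestamp stored inside the .pyc file.  The name
choose_module_file() and the allow_lone_pyc flag are hypothetical.

```python
import os


def choose_module_file(directory, modname, allow_lone_pyc=False):
    """Pick which file to load for modname in directory, or None."""
    src = os.path.join(directory, modname + ".py")
    compiled = os.path.join(directory, modname + ".pyc")
    have_src = os.path.isfile(src)
    have_pyc = os.path.isfile(compiled)
    if have_src and have_pyc:
        # Both exist: use the compiled file only if it is not older
        # than the source it was (presumably) generated from.
        if os.path.getmtime(compiled) >= os.path.getmtime(src):
            return compiled
        return src
    if have_src:
        return src
    if have_pyc and allow_lone_pyc:
        return compiled
    return None
```

With allow_lone_pyc left at its default, a stale .pyc whose .py was
moved elsewhere is ignored rather than silently picked up.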

> Your compositing example is based on the capabilities of the
> find-then-load paradigm of the existing "ihooks.py". One module finds
> something (foo.spam) and the other module loads it (by generating a .py).

I still don't understand why ihooks.py had to be so complicated.  I
guess I just had much less of an understanding of the issues.  (It was
also partly a compromise with an alternative design by Ken Manheimer,
who basically forced me to support packages, originally through ni.py.)

> All is not lost, however. I can easily envision the get_code() hook as
> allowing any kind of return type. If it isn't a code or module object,
> then another hook is called to transform it.
> [ actually, I'd design it similarly: a *series* of hooks would be called
>   until somebody transforms the foo.spam into a code/module object. ]

OK.  This could be a feature of a subclass of Importer.
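
The "series of hooks" idea quoted above might look roughly like this.
It is only a sketch in modern Python 3; resolve() and spam_to_code()
are hypothetical names, not part of imputil.

```python
import types

def resolve(result, transformers):
    """Pass `result` through hooks until it is a code or module object."""
    for transform in transformers:
        if isinstance(result, (types.CodeType, types.ModuleType)):
            break
        converted = transform(result)
        if converted is not None:
            result = converted
    if not isinstance(result, (types.CodeType, types.ModuleType)):
        raise ImportError("no hook could turn %r into code" % (result,))
    return result

def spam_to_code(value):
    # Hypothetical hook: treats a ("spam", text) pair as Python source.
    if isinstance(value, tuple) and value[0] == "spam":
        return compile(value[1], "<spam>", "exec")
    return None
```

A hook that does not recognize the value returns None, so the next
hook in the list gets a chance.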

> The compositing would be limited only by the (Python-based) Importer
> classes. For example, my ZipImporter might expect to zip up .pyc files
> *only*. Obviously, you would want to alter this to support zipping any
> file, then use the suffix to determine what to do at unzip time.
> 
> > - It should be possible to write hooks in C/C++ as well as Python
> 
> Use FuncImporter to delegate to an extension module.

Maybe not so great, since it sounds like the C code can't benefit from
any of the infrastructure that imputil offers.  I'm not sure about
this one though.

> This is one of the benefits of imputil's single/narrow interface.

Plus its vague specs? :-)

> > - Applications embedding Python may supply their own implementations,
> >   default search path, etc., but don't have to if they want to piggyback
> >   on an existing Python installation (even though the latter is
> >   fraught with risk, it's cheaper and easier to understand).
> 
> An application would have full control over the contents of sys.importers.
> 
> For a restricted execution app, it might install an Importer that loads
> files from *one* directory only which is configured from a specific
> Win32 Registry entry. That importer could also refuse to load shared
> modules. The BuiltinImporter would still be present (although the app
> would certainly omit all but the necessary builtins from the build).
> Frozen modules could be excluded.

Actually there's little reason to exclude frozen modules or any
.py/.pyc modules -- by definition, bytecode can't be dangerous.  It's
the builtins and extensions that need to be censored.

We currently do this by subclassing ihooks, where we mask the test for
builtins with a comparison to a predefined list of names.
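
In outline, that masking amounts to something like the following.
This is a sketch in modern Python 3, not the ihooks subclass itself;
the allow-list contents and the function name are made up for
illustration.

```python
import sys

# Hypothetical allow-list; a real application would choose its own names.
ALLOWED_BUILTINS = frozenset(["sys", "marshal"])

def is_importable_builtin(name, allowed=ALLOWED_BUILTINS):
    """True only for builtin modules that are also on the allow-list."""
    return name in sys.builtin_module_names and name in allowed
```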

> > Implementation:
> > ---------------
> > 
> > - There must clearly be some code in C that can import certain
> >   essential modules (to solve the chicken-or-egg problem), but I don't
> >   mind if the majority of the implementation is written in Python.
> >   Using Python makes it easy to subclass.
> 
> I posited once before that the cost of import is mostly I/O rather than
> CPU, so using Python should not be an issue. MAL demonstrated that a good
> design for the Importer classes is also required. Based on this, I'm a
> *strong* advocate of moving as much as possible into Python (to get
> Python's ease-of-coding with little relative cost).

Agreed.  How do you explain the slowdown (from 9 to 13 seconds, as I
recall), though?  Are you a lousy coder? :-)

> The (core) C code should be able to search a path for a module and import
> it. It does not require dynamic loading or packages. This will be used to
> import exceptions.py, then imputil.py, then site.py.

It does, however, need to import builtin modules.  imputil currently
imports imp, sys, strop and __builtin__, struct and marshal; note that
struct can easily be a dynamic loadable module, and so could strop in
theory.  (Note that strop will be unnecessary in 1.6 if you use string
methods.)

I don't think that this chicken-or-egg problem is particularly
problematic though.

> The platform-specific module that performs dynamic-loading must be a
> statically linked module (in Modules/ ... it doesn't have to be in the
> Python/ directory).

See earlier comments.

> site.py can complete the bootstrap by setting up sys.importers with the
> appropriate Importer instances (this is where an application can define
> its own policy). sys.path was initially set by the import.c bootstrap code
> (from the compiled-in path and environment variables).

I think that algorithm (currently in getpath.c / getpathp.c) might
also be moved to Python code -- imported frozen.  Sadly, rebuilding
with a new version of a frozen module might be more complicated than
rebuilding with a new version of a C module, but writing and
maintaining this code in Python would be *sooooooo* much easier that I
think it's worth it.

> Note that imputil.py would not install any hooks when it is loaded. That
> is up to site.py. This implies the core C code will import a total of
> three modules using its builtin system. After that, the imputil mechanism
> would be importing everything (site.py would .install() an Importer which
> then takes over the __import__ hook).

(Three not counting the builtin modules.)

> Further note that the "import" Python statement could be simplified to use
> only the hook. However, this would require the core importer to inject
> some module names into the imputil module's namespace (since it couldn't
> use an import statement until a hook was installed). While this
> simplification is "neat", it complicates the run-time system (the import
> statement is broken until a hook is installed).

Same chicken-or-egg.  We can be pragmatic.

For a developer, I'd like a bit of robustness (all this makes it
rather hard to debug a broken imputil, and that's a fair amount of
code!).

> Therefore, the core C code must also support importing builtins. "sys" and
> "imp" are needed by imputil to bootstrap.
> 
> The core importer should not need to deal with dynamic-load modules.

Same question.  Since that all has to be coded in C anyway, why not?

> To support frozen apps, the core importer would need to support loading
> the three modules as frozen modules.

I'd like to see a description of how someone like Jim A would build a
single-file application using the new mechanism.  This could
completely replace freeze.  (Freeze currently requires a C compiler;
that's bad.)

> The builtin/frozen importing would be exposed thru "imp" for use by
> imputil for future imports. imputil would load and use the (builtin)
> platform-specific module to do dynamic-load imports.

Sure.

> > - In order to support importing from zip/jar files using compression,
> >   we'd at least need the zlib extension module and hence libz itself,
> >   which may not be available everywhere.
> 
> Yes. I don't see this as a requirement, though. We wouldn't start to use
> these by default, would we? Or insist on zlib being present? I see this as
> more along the lines of "we have provided a standardized Importer to do
> this, *provided* you have zlib support."

Agreed.  Zlib support is easy to get, but there are probably platforms
where it's not.  (E.g. maybe the Mac?  I suppose that on the Mac,
there would be some importer classes to import from a resource fork.)

> > - I suppose that the bootstrap is solved using a mechanism very
> >   similar to what freeze currently used (other solutions seem to be
> >   platform dependent).
> 
> The bootstrap that I outlined above could be done in C code. The import
> code would be stripped down dramatically because you'll drop package
> support and dynamic loading.

Not the dynamic loading.  But yes the package support.

> Alternatively, you could probably do the path-scanning in Python and
> freeze that into the interpreter. Personally, I don't like this idea as it
> would not buy you much at all (it would still need to return to C for
> accessing a number of scanning functions and module importing funcs).
> 
> > - I also want to still support importing *everything* from the
> >   filesystem, if only for development.  (It's hard enough to deal with
> >   the fact that exceptions.py is needed during Py_Initialize();
> >   I want to be able to hack on the import code written in Python
> >   without having to rebuild the executable all the time.
> 
> My outline above does not freeze anything. Everything resides in the
> filesystem. The C code merely needs a path-scanning loop and functions to
> import .py*, builtin, and frozen types of modules.

Good.  Though I think there's also a need for freezing everything.
And when we go the route of the zip archive, the zip archive handling
code needs to be somewhere -- frozen seems to be a reasonable choice.

> If somebody nukes their imputil.py or site.py, then they return to Python
> 1.4 behavior where the core interpreter uses a path for importing (i.e. no
> packages). They lose dynamically-loaded module support.

But if the path guessing is also done by site.py (as I propose) the
path will probably be wrong.  A warning should be printed.

> > Let's first complete the requirements gathering.  Are these
> > requirements reasonable?  Will they make an implementation too
> > complex?  Am I missing anything?
> 
> I'm not a fan of the compositing due to it requiring a change to semantics
> that I believe are very useful and very clean. However, I outlined a
> possible, clean solution to do that (a secondary set of hooks for
> transforming get_code() return values).

As you may see from my responses, I'm a big fan of having several
different sets of hooks.  I do withdraw the composition requirement
though.

> The requirements are otherwise reasonable to me, as I see that they can
> all be readily solved (i.e. they aren't burdensome).
> 
> While this email may be long, I do not believe the resulting system would
> be complex. From the user-visible side of things, nothing would be
> changed. sys.path is still present and operates as before. They *do* have
> new functionality they can grow into, though (sys.importers). The
> underlying C code is simplified, and the platform-specific dynamic-load
> stuff can be distributed to distinct modules, as needed
> (e.g. BeOS/dynloadmodule.c and PC/dynloadmodule.c).
> 
> > Finally, to what extent does this impact the desire for dealing
> > differently with the Python bytecode compiler (e.g. supporting
> > optimizers written in Python)?  And does it affect the desire to
> > implement the read-eval-print loop (the >>> prompt) in Python?
> 
> If the three startup files require byte-compilation, then you could have
> some issues (i.e. the byte-compiler must be present).

Another chicken-or-egg.  No biggie.

> Once you hit site.py, you have a "full" environment and can easily detect
> and import a read-eval-print loop module (i.e. why return to Python? just 
> start things up right there).

You mean "why return to C?"  I agree.  It would be cool if somehow
IDLE and Pythonwin would also be bootstrapped using the same
mechanisms.  (This would also solve the question "which interactive
environment am I using?" that some modules and apps want to see
answered because they need to do things differently when run under
IDLE, for example.)

> site.py can also install new optimizers as desired, a new Python-based
> parser or compiler, or whatever...  If Python is built without a parser or
> compiler (I hope that's an option!), then the three startup modules would
> simply be frozen into the executable.

More power to hooks!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake at acm.org  Thu Dec  2 22:22:33 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 2 Dec 1999 16:22:33 -0500 (EST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912022043.PAA15108@eric.cnri.reston.va.us>
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org>
	<199912022043.PAA15108@eric.cnri.reston.va.us>
Message-ID: <14406.58137.359127.921135@weyr.cnri.reston.va.us>

Guido van Rossum writes:
 > variable.  Every piece of code that I've ever seen that uses sys.path
 > doesn't care if a directory named in sys.path doesn't exist -- it may
 > try to stat various files in it, which also don't exist, and as far as

  Not the case -- I know you've looked at some of my code in the KOE
that ensures only real directories are on the path, and each is only
there once (pathhack.py).  Given that sys.path is often too long and
includes duplicate entries in a large system (often one entry with and
one without a trailing / for a given directory), it is useful to be able
to distinguish between things that should be interpretable as paths
and things that aren't.  It should not be hard to declare that
"cookies" or whatever have some special form, like "<cookie>".

 > (Unrelated remark: I should really try to release the set of modules
 > we've written here at CNRI to deal with zip files.  Unfortunately zip
 > files are hairy and so is our code.)

  It doesn't help that that code just plain stinks.  I maintain that
no one here understands the whole of it.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From jcw at equi4.com  Thu Dec  2 22:41:46 1999
From: jcw at equi4.com (Jean-Claude Wippler)
Date: Thu, 02 Dec 1999 22:41:46 +0100
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> <199912022043.PAA15108@eric.cnri.reston.va.us>
Message-ID: <3846E79A.446EAFD5@equi4.com>

Guido van Rossum wrote:

[...]
> Note that the interpretation of __file__ could be problematic.  To
> what value do you set __file__ for a module loaded from a zip archive?

Makefiles use "archive(entry)" (this also supports nesting if needed).

[...] 
> I'd like to see a description of how someone like Jim A would build a
> single-file application using the new mechanism.  This could
> completely replace freeze.  (Freeze currently requires a C compiler;
> that's bad.)
[...]

This may be off-topic, but has anyone considered what it would take to
load shared libs out of an archive?  One way is to extract on-the-fly to
a temporary area.  A refinement is to leave extracted files there as
cache, and perhaps even to extract to a file with a name derived from
its MD5 digest (this way multiple users and even Python installations
can share the cache).  Would it be useful to define a "standard" area?

-- Jean-Claude


From gmcm at hypernet.com  Fri Dec  3 00:15:50 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Thu, 2 Dec 1999 18:15:50 -0500
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912022043.PAA15108@eric.cnri.reston.va.us>
References: Your message of "Fri, 19 Nov 1999 05:29:50 PST."             <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> 
Message-ID: <1267945992-2611810@hypernet.com>

[Guido]
 big snip
> Note that the interpretation of __file__ could be problematic. 
> To what value do you set __file__ for a module loaded from a zip
> archive?

I just left it alone (ie, as it was when I picked up the .pyc). 
Turns out OK, because then when the end user files a bug 
report, the developer can track it down.

> Note: I looked at the doc string for get_code() and I don't
> understand what the difference is between the modname and fqname
> arguments.  If I write "import foo.bar", what are modname and
> fqname?  

As I recall:
 import foo.bar
 -> get_code(None, 'foo', 'foo') # returns foo
 -> get_code(<self>, 'bar', 'foo.bar')

> Why are both present?  

I think so the importer can choose between being tree 
structured or flat.
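
The call sequence recalled above can be written out as a small
function.  This is only a sketch in modern Python 3: get_code_calls is
a hypothetical helper, and the parent argument is represented by the
parent's dotted name rather than by a module object.

```python
def get_code_calls(dotted_name):
    """List the (parent, modname, fqname) triples for one import."""
    calls = []
    parent = None
    parts = dotted_name.split(".")
    for i, modname in enumerate(parts):
        fqname = ".".join(parts[: i + 1])
        calls.append((parent, modname, fqname))
        parent = fqname  # stands in for the parent module object
    return calls
```

A tree-structured importer can work from (parent, modname); a flat one
can ignore the parent and look up fqname directly.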

> I'd like to see a description of how someone like Jim A would
> build a single-file application using the new mechanism.  This
> could completely replace freeze.  (Freeze currently requires a C
> compiler; that's bad.)

I have something working for Linux now. I froze exceptions.py. 
I hacked getpath.c so prefix = exec_prefix = executable's 
directory and the starting path is [prefix]. Although I did it 
differently, you could regard imputil.py and archive.py as 
frozen, too. (On Windows it's somewhat different, because the 
result uses the stock python15.dll.) This somewhat 
oversimplifies; and I haven't really thought out all the ways 
people might try to use sym links. I'm inclined to think the 
starting path should contain both the executable's real 
directory and the sym link's directory.
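
For illustration, computing such a starting path could look like the
sketch below (modern Python 3, not the getpath.c hack itself).  The
function name is hypothetical, and putting the real directory first is
an arbitrary choice.

```python
import os

def starting_path(executable):
    """Directories of the executable after and before resolving symlinks."""
    invoked_dir = os.path.dirname(os.path.abspath(executable))
    real_dir = os.path.dirname(os.path.realpath(executable))
    path = [real_dir]
    if invoked_dir != real_dir:
        # The executable was reached through a sym link: keep both places.
        path.append(invoked_dir)
    return path
```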

> ....  I do withdraw the composition
> requirement though.

Hooray!


- Gordon


From gstein at lyra.org  Fri Dec  3 01:19:14 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 2 Dec 1999 16:19:14 -0800 (PST)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <384694D8.DCA3D75E@lemburg.com>
Message-ID: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>

On Thu, 2 Dec 1999, M.-A. Lemburg wrote:
>...
> Still, I would like to rephrase my 0.02EUR which I already
> posted twice... why not start to think about what these
> importers would do first ? If there are only a handful of
> wishes we could just add them to the builtin machinery and
> be done with it...

I'd rather see the builtin machinery move to Python, regardless of what
system is used and/or what features are added.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein at lyra.org  Fri Dec  3 04:19:40 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 2 Dec 1999 19:19:40 -0800 (PST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912022043.PAA15108@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912021654100.18529-100000@nebula.lyra.org>

On Thu, 2 Dec 1999, Guido van Rossum wrote:
>...
> Sometime, Greg Stein wrote:
>...
> > On Thu, 18 Nov 1999, Guido van Rossum wrote:
>...
> > > Agreed.  I like some of imputil's features, but I think the API
> > > needs to be redesigned.
> > 
> > In what ways? It sounds like you've applied some thought. Do you have any
> > concrete ideas yet, or "just a feeling" :-)  I'm working through some
> > changes from JimA right now, and would welcome other suggestions. I think
> > there may be some outstanding stuff from MAL, but I'm not sure (Marc?)
> 
> I actually think that the way the PVM (Python VM) calls the importer
> ought to be changed.  Assigning to __builtin__.__import__ is a crock.
> The API for __import__ is a crock.

Something like sys.set_import_hook() ?

The other alternative that I see would be to have the C code scan
sys.importers, assuming each are callable objects, and call them with the
appropriate params (e.g. module name). Of course, to move this scanning
into Python would require something like sys.set_import_hook() unless
Python looks for a hard-coded module and entrypoint.
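
As a sketch of that scanning loop (modern Python 3; the function name
and the convention that an importer returns None on a miss are
assumptions for illustration):

```python
def import_via(importers, name):
    """Ask each importer in turn; the first non-None answer wins."""
    for importer in importers:
        module = importer(name)
        if module is not None:
            return module
    raise ImportError("no importer could find %r" % name)
```

Since the importers live in a plain list, removing or reordering them
is ordinary list manipulation.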

>...
> > Which APIs are you referring to? The "imp" module? The C functions? The
> > __import__ and reload builtins?
> 
> > I'm guessing some of imp, the two builtins, and only one or two C
> > functions.
> 
> All of those.

We can provide Python code to provide compatibility for "imp" and the two
hooks. Nothing we can do to the C code, though. I'm not sure what the
import API looks like from C, and whether they could all stay. A brief
glance looks like most could stay.
[ removing any would change Python's API version, which might be "okay" ]

>...
> > > - load .py/.pyc/.pyo files and shared libraries from files
> > 
> > No problem. Again, a function is needed for platform-specific loading of
> > shared libraries.
> 
> Is it useful to expose the platform differences?  The current
> imp.load_dynamic() should suffice.

This comes up several times throughout this message, and in some off-list
mail Guido and I have exchanged. Namely, "should dynamic loading be part
of the core, or performed via a module?"

I would rather see it become a module, rather than inside the core
(despite the fact that the module would have to be compiled into the
interpreter). I believe this provides more flexibility for people looking
to replace/augment/update/fix dynamic loading on various architectures.
Rather than changing the core, a person can just drop in another module.
The isolation between the core and modules is nicer, aesthetically, to me.

The modules would also be exposing Just Another Importer Function, rather
than a specialized API in the builtin imp module. Also note that it is
easier to keep a module *out* of a Python-based application, than it is to
yank functions out of the core of Python. Frozen apps, embedded apps, etc
could easily leave out dynamic loading.

Are there strict advantages? Not any that I can think of right now (beyond
a bit of ease-of-use mentioned above). It just feels better to me.

>...
> > > - sys.path and sys.modules should still exist; sys.path might
> > > have a slightly different meaning
> > 
> > I would suggest that both retain their *exact* meaning. We introduce
> > sys.importers -- a list of importers to check, in sequence. The first
> > importer on that list uses sys.path to look for and load modules. The
> > second importer loads builtins and frozen code (i.e. modules not on
> > sys.path).
> 
> This is looking like the redesign I was looking for.  (Note that
> imputil's current chaining is not good since it's impossible to remove
> or reorder importers, which I think is a required feature; an explicit
> list would solve this.)

The chaining is an aspect of the current, singular import hook that Python
uses. In the past, I've suggested the installation of a "manager" that
maintains a list. sys.importers is similar in practice.

Note that this Manager would be present with the sys.set_import_hook()
scheme, while the Manager is implied if the core scans sys.importers.

> Actually, the order is the other way around, but by now you should
> know that.  It makes sense to have separate ones for builtin and
> frozen modules -- these have nothing in common.

Yes, JimA pointed this out. The latest imputil has corrected this.

I combined the builtin and frozen Importers because they were just so
similar. I didn't want to iterate over two Importers when a single one
sufficed quite well.

*shrug* Could go either way, really.

> There's another issue, which isn't directly addressed by imputil,
> although with clever use of inheritance it might be doable.  I'd like
> more support for this however.  Quite orthogonally to the issue of
> having separate importers, I might want to recognize new extensions.

Correct: while imputil doesn't address this, the standard/default Importer
classes *definitely* can.

>...
> the directory/directories with .isl files are placed.)  This requires
> an ugly modification to the _fs_import() function.  (Which should have
> been a method, by the way, to make overriding it in a subclass of
> PathImporter easier!)

I yanked that code out of the DirectoryImporter so that the PathImporter
could use it. I could see a reorg that creates a FileSystemImporter that
defines the method, and the other two just subclass from that.

> I've been thinking here along the lines of a strategy where the
> standard importer (the one that walks sys.path) has a set of hooks
> that define various things it could look for, e.g. .py files, .pyc
> files, .so or .dll files.  This list of hooks could be changed to
> support looking for .isl files.

Agreed. It should be easy to have a mapping of extension to handler.

One issue: should there be an ordering to the extensions? Exercise for the
reader to alter the data structures...
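
One possible answer to the ordering question is a list of pairs instead
of a dictionary.  A sketch in modern Python 3, with a hypothetical
function name and handlers that simply receive the file name:

```python
import os

def find_by_suffix(directory, name, handlers):
    """Try each (suffix, handler) pair in order; None if nothing matches.

    `handlers` is a list rather than a dict so the search order is
    explicit and can be changed by the application.
    """
    for suffix, handler in handlers:
        filename = os.path.join(directory, name + suffix)
        if os.path.isfile(filename):
            return handler(filename)
    return None
```

Supporting .isl files would then mean adding one more pair to the list.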

> There's an old, subtle issue that could be solved through this as
> well: whether or not a .pyc file without a .py file should be accepted
> or not.  Long ago (in Python 0.9.8) a .pyc file alone would never be
> loaded.  This was changed at the request of a small but vocal minority
> of Python developers who wanted to distribute .pyc files without .py
> files.  It has occasionally caused frustration because sometimes
> developers move .py files around but forget to remove the .pyc files,
> and then the .pyc file is silently picked up if it occurs on sys.path
> earlier than where the .py was moved to.

I think, "too bad for them."  :-)

Having just a .pyc is a very nice feature. But how can you tell whether it
was meant to be a plain .pyc or a mis-ordered one? To truly resolve that,
you would need to scan the whole path, looking for a .py. However, maybe
somebody put the .pyc there on purpose, to override the .py!

--- begin slightly-off-topic ---

Here is a neat little Bash script that allows you to use a .pyc as a CGI
(to avoid parse overhead). Normally, you can't just drop a .pyc into the
cgi-bin directory because the OS doesn't know how to execute it. Not a
problem, I say... just append your .pyc to the following Bash script and
execute! :-)

#!/bin/bash
exec - 3< $0 ; exec python -c 'import os,marshal ; f = os.fdopen(3, "rb") ; f.readline() ; f.readline() ; f.seek(8, 1) ; _c = marshal.load(f) ; del os, marshal, f ; exec _c' $@

(the script should be two lines; and no... you can't use readlines(2))

The above script will preserve stdin, stdout, and stderr. If the caller
also uses 3< ... well, that got overridden :-)

The script doesn't work on Windows for two reasons, though: 1) Bash, 2)
the "rb" mode followed by readline()

Detailed info at the bottom of http://www.lyra.org/greg/python/

--- end of off-topic ---

> Having a set of hooks for various extensions would make it possible to
> have a default where lone .pyc files are ignored, but where one can
> insert a .pyc importer in the list of hooks that does the right thing
> here.  (Of course, it may be possible that this whole feature of lone
> .pyc files should be replaced since the same need is easily taken care
> of by zip importers.

Maybe. I'd still like to see plain .pyc files, but I know I can work
around any change you might make here :-)

(i.e. whatever you'd like to do... go for it)

> I also want to support (Jim A notwithstanding :-) a feature whereby
> different things besides directories can live on sys.path, as long as
> they are strings -- these could be added from the PYTHONPATH env
> variable.  Every piece of code that I've ever seen that uses sys.path
> doesn't care if a directory named in sys.path doesn't exist -- it may
> try to stat various files in it, which also don't exist, and as far as
> it is concerned that is just an indication that the requested module
> doesn't live there.

I'm not in favor of this, but it is more-than-doable. Again: your
discretion...

> Again, we would have to dissect imputil to support various hooks that
> deal with different kind of entities in sys.path.  The default hook
> list would consist of a single item that interprets the name as a
> directory name; other hooks could support zip files or URLs.  Jack's
> "magic cookies" could also be supported nicely through such a
> mechanism.

Specifically, the PathImporter would get "dissected" :-). No problem.

> > Users can insert/append new importers or alter sys.path as before.
> > 
> > sys.modules continues to record name:module mappings.
> 
> Yes.
> 
> Note that the interpretation of __file__ could be problematic.  To
> what value do you set __file__ for a module loaded from a zip archive?

You don't (certainly not in a way that is nice/compatible for modules that
refer to it). This is why I don't like __file__ and __path__. They just
don't make sense in archives or frozen code. Python code that relies on
them will create problems when that code is placed into different
packaging mechanisms.

>...
> > > (I wouldn't mind a splitting up of importdl.c into several
> > > platform-specific files, one of which is chosen by the configure
> > > script; but that's a bit of a separate issue.)
> > 
> > Easy enough. The standard importer can select the appropriate
> > platform-specific module/function to perform the load. i.e. these can move
> > to Modules/ and be split into a module-per-platform.
> 
> Again: what's the advantage of exposing the platform specificity?

See above.

>...
> Probably more support is required from the other end: once it's common
> for modules to be imported from zip files, the distutil code needs to
> support the creation and installation of such zip files.  Also, there
> is a need for the install phase of distutil to communicate the
> location of the zip file to the Python installation.

I'm quite confident that something can be designed that would satisfy the
needs here. Something akin to .pth files that a zip importer could read.

>...
> > > - Standard import from zip or jar files, in two ways:
> > > 
> > >   (1) an entry on sys.path can be a zip/jar file instead of a directory;
> > >       its contents will be searched for modules or packages
> 
> Note that this is what I mention above for distutil support.
> 
> > While this could easily be done, I might argue against it. Old
> > apps/modules that process sys.path might get confused.
> 
> Above I argued that this shouldn't be a problem.

For most code, no, but as Fred mentioned (and I surmise), there are things
out there assuming that sys.path contains strings which specify
directories.

Sure, we can do this (your discretion), but my feeling is to avoid it.

> > If compatibility is not an issue, then "No problem."
> > 
> > An alternative would be an Importer instance added to sys.importers that
> > is configured for a specific archive (in other words, don't add the zip
> > file to sys.path, add ZipImporter(file) to sys.importers).
> 
> This would be harder for distutil: where does Python get the initial
> list of importers?

Default is just the two: BuiltinImporter and PathImporter. Adding
ZipImporters (or anything else) at startup is TBD, but shouldn't pose a
problem.

>...
> > >   (2) a file in a directory that's on sys.path can be a zip/jar file;
> > >       its contents will be considered as a package (note that this is
> > >       different from (1)!)
> > 
> > No problem. This will slow things down, as a stat() for *.zip and/or *.jar
> > must be done, in addition to *.py, *.pyc, and *.pyo.
> 
> Fine, this is where the caching comes in handy.

IFF caching is enabled for the particular platform and installation.

>...
> > The Importer class is already designed for subclassing (and its interface 
> > is very narrow, which means delegation is also *very* easy; see
> > imputil.FuncImporter).
> 
> But maybe it's *too* narrow; some of the hooks I suggest above seem to
> require extra interfaces -- at least in some of the subclasses of the
> Importer base class.

Correct -- the *subclasses*. I still maintain the imputil design of a
single hook (get_code) is Right.

I'll make a swipe at PathImporter in the next few weeks to add the
capability for new extensions.

> Note: I looked at the doc string for get_code() and I don't understand
> what the difference is between the modname and fqname arguments.  If I
> write "import foo.bar", what are modname and fqname?  Why are both
> present?  Also, while you claim that the API is narrow, the multiple
> return values (also the different types for the second item) make it
> complicated.

Gordon detailed this in another note...

Yes, the multiple return values make it a bit more complicated, but I
can't think of any reasonable alternatives.

A bit more doc should do the trick, I'd guess.

>...
> > >   - a hook to auto-generate .py files from other filename
> > >     extensions (as currently implemented by ILU)
> > 
> > No problem at all.
> 
> See above -- I think this should be more integrated with sys.path than
> you are thinking of.  The more I think about it, the more I see that
> the problem is that for you, the importer that uses sys.path is a
> final subclass of Importer (i.e. it is itself not further subclassed).
> Several of the hooks I want seem to require additional hooks in the
> PathImporter rather than new importers.

Correct -- I've currently designed/implemented PathImporter as "final".

I don't foresee a problem turning it into something that can be hooked at
run-time, or subclassed at code-time. A detailing of the features needed 
would be handy:

* allow alternative file suffixes, with functions or subclasses to map the
  file into a code/module object.

>...
> > > - Note that different kinds of hooks should (ideally, and within
> > >   reason) properly combine, as follows: if I write a hook to recognize
> > >   .spam files and automatically translate them into .py files, and you
> > >   write a hook to support a new archive format, then if both hooks are
> > >   installed together, it should be possible to find a .spam file in an
> > >   archive and do the right thing, without any extra action.  Right?
> > 
> > Ack. Very, very difficult.
> 
> Actually, I take most of this back.  Importers that deal with new
> extension types often have to go through a file system to transform
> their data to .py files, and this is just too complicated.  However it
> would be still nice if there was code sharing between the code that
> looks for .py and .pyc files in a zip archive and the code that does
> the same in a filesystem.  Hm, maybe even that shouldn't be necessary,
> the zip file probably should contain only .pyc files...

Gordon replies to this... All of the archives that Gordon, JimA, and I
have been using store only .pyc files. I don't see much code sharing
between the filesystem and archive import code.

>...
> > All is not lost, however. I can easily envision the get_code() hook as
> > allowing any kind of return type. If it isn't a code or module object,
> > then another hook is called to transform it.
> > [ actually, I'd design it similarly: a *series* of hooks would be called
> >   until somebody transforms the foo.spam into a code/module object. ]
> 
> OK.  This could be a feature of a subclass of Importer.

That would be my preference, rather than loading more into the Importer
base class itself.
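
The "series of hooks" idea can be sketched roughly as follows. This is
present-day Python 3 and deliberately simplified: the real get_code()
takes more arguments and returns several values, whereas this version
collapses that to one name and one result. TransformingImporter,
get_raw, and the two compile_* hooks are hypothetical names:

```python
import types

# A result of one of these types needs no further transformation.
FINAL_TYPES = (types.CodeType, types.ModuleType)


class TransformingImporter:
    """Hypothetical Importer subclass that post-processes get_code() results."""

    def __init__(self):
        # Called in order; each is hook(name, raw) -> object or None.
        self.transformers = []

    def get_raw(self, name):
        """Return whatever was found for `name`; subclasses override this."""
        raise NotImplementedError

    def get_code(self, name):
        raw = self.get_raw(name)
        if isinstance(raw, FINAL_TYPES):
            return raw
        # Try each hook until one produces a code or module object.
        for hook in self.transformers:
            result = hook(name, raw)
            if isinstance(result, FINAL_TYPES):
                return result
        raise ImportError(f"nothing could transform {name!r}")


def compile_text(name, raw):
    # Only handles source text; returns None so the next hook can try.
    if isinstance(raw, str):
        return compile(raw, name, "exec")
    return None


def compile_bytes(name, raw):
    # Only handles UTF-8 encoded source bytes.
    if isinstance(raw, bytes):
        return compile(raw.decode("utf-8"), name, "exec")
    return None
```

Keeping the hook list on the subclass leaves the Importer base class
untouched, which matches the preference stated above.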

>...
> > > - It should be possible to write hooks in C/C++ as well as Python
> > 
> > Use FuncImporter to delegate to an extension module.
> 
> Maybe not so great, since it sounds like the C code can't benefit from
> any of the infrastructure that imputil offers.  I'm not sure about
> this one though.

There isn't any infrastructure that needs to be accessed. get_code() is
the call-point, and there is no mechanism provided to the callee to call
back into the imputil system.

> > This is one of the benefits of imputil's single/narrow interface.
> 
> Plus its vague specs? :-)

Ouch. I thought I was actually doing quite a bit better than normal with
that long doc-string on get_code :-(

>...
> > For a restricted execution app, it might install an Importer that loads
> > files from *one* directory only which is configured from a specific
> > Win32 Registry entry. That importer could also refuse to load shared
> > modules. The BuiltinImporter would still be present (although the app
> > would certainly omit all but the necessary builtins from the build).
> > Frozen modules could be excluded.
> 
> Actually there's little reason to exclude frozen modules or any
> .py/.pyc modules -- by definition, bytecode can't be dangerous.  It's
> the builtins and extensions that need to be censored.
> 
> We currently do this by subclassing ihooks, where we mask the test for
> builtins with a comparison to a predefined list of names.

True. My concern is an invader misusing one "type" of module for another.
For example, let's say you've provided a selection of modules each
exporting function FOO, and the user can configure which module to use.
Can they do damage if some unrelated, frozen module also exports FOO?

Minor issue, anyhow. All the functionality is there.
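
For illustration, the "predefined list of names" test for builtins might
look something like this in present-day Python 3. It does not use ihooks;
ALLOWED_BUILTINS and is_builtin_allowed are hypothetical names, and the
contents of the list are an arbitrary assumption.
sys.builtin_module_names is the interpreter's own tuple of compiled-in
module names:

```python
import sys

# Hypothetical policy: the only builtin modules restricted code may import.
ALLOWED_BUILTINS = frozenset({"sys", "marshal"})


def is_builtin_allowed(name, allowed=ALLOWED_BUILTINS):
    """True only if `name` is a builtin module *and* on the allowed list."""
    return name in sys.builtin_module_names and name in allowed
```

An importer for restricted code would consult a check like this before
handing a name to the builtin loader.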

>...
> > I posited once before that the cost of import is mostly I/O rather than
> > CPU, so using Python should not be an issue. MAL demonstrated that a good
> > design for the Importer classes is also required. Based on this, I'm a
> > *strong* advocate of moving as much as possible into Python (to get
> > Python's ease-of-coding with little relative cost).
> 
> Agreed.  However, how do you explain the slowdown (from 9 to 13
> seconds I recall) though?  Are you a lousy coder? :-)

Heh :-)

I have not spent *any* time working on optimization. Currently, each
Importer in the chain redoes some work of the prior Importer. A bit of
restructuring would split the common work out to a Manager, which then
calls a method in the Importer (and passes all the computed work). Of
course, a bit of profiling wouldn't hurt either. Some of the "imp"
interfaces could possibly be refined to better support the BuiltinImporter
or the dynamic load features.

The question is still valid, though -- at the moment, I can't explain it
because I haven't looked into it.

> > The (core) C code should be able to search a path for a module and import
> > it. It does not require dynamic loading or packages. This will be used to
> > import exceptions.py, then imputil.py, then site.py.

Note: after writing this, I realized there is really no need for the core
to do the imputil import. site.py can easily do that.

> It does, however, need to import builtin modules.  imputil currently

Correct.

> imports imp, sys, strop and __builtin__, struct and marshal; note that
> struct can easily be a dynamic loadable module, and so could strop in
> theory.  (Note that strop will be unnecessary in 1.6 if you use string
> methods.)

I knew about strop, but imputil would be harder to use today if it relied
on the string methods. So... I've delayed that change.

The struct module is used in a couple teeny cases, dealing with
constructing a network-order, 4-byte, binary integer value. It would be
easy enough to just do that with a bit of Python code instead.
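
As a rough illustration of "a bit of Python code instead", a
network-order (big-endian) 4-byte integer can be built with shifts and
masks. This is a present-day Python 3 sketch, not the code in imputil;
pack_u32_be and unpack_u32_be are hypothetical names:

```python
def pack_u32_be(value):
    """Encode an unsigned 32-bit integer as 4 bytes, most significant first."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value does not fit in 32 bits")
    return bytes((
        (value >> 24) & 0xFF,
        (value >> 16) & 0xFF,
        (value >> 8) & 0xFF,
        value & 0xFF,
    ))


def unpack_u32_be(data):
    """Decode 4 network-order bytes back into an integer."""
    if len(data) != 4:
        raise ValueError("expected exactly 4 bytes")
    b0, b1, b2, b3 = data
    return (b0 << 24) | (b1 << 16) | (b2 << 8) | b3
```

The result should match what struct produces with the ">I" format, so
dropping the struct dependency would not change the on-disk data.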

> I don't think that this chicken-or-egg problem is particularly
> problematic though.

Right.

In my ideal world, the core couldn't do a dynamic load, so that would need
to be considered within the bootstrap process.

>...
> > site.py can complete the bootstrap by setting up sys.importers with the
> > appropriate Importer instances (this is where an application can define
> > its own policy). sys.path was initially set by the import.c bootstrap code
> > (from the compiled-in path and environment variables).
> 
> I think that algorithm (currently in getpath.c / getpathp.c) might
> also be moved to Python code -- imported frozen.  Sadly, rebuilding
> with a new version of a frozen module might be more complicated than
> rebuilding with a new version of a C module, but writing and
> maintaining this code in Python would be *sooooooo* much easier that I
> think it's worth it.

I think we can find a better way to freeze modules and to use them.
Especially for the cases where we have specific "core" functions
implemented in Python. (e.g. freezing parsers, compilers, and/or the
read-eval loop)

I don't foresee an issue with the build process becoming more complicated.
If we nuke "makesetup" in favor of a Python script, then we could create a
stub Python executable which runs the build script which writes the Setup
file and the getpath*.c file(s).

> > Note that imputil.py would not install any hooks when it is loaded. That
> > is up to site.py. This implies the core C code will import a total of
> > three modules using its builtin system. After that, the imputil mechanism
> > would be importing everything (site.py would .install() an Importer which
> > then takes over the __import__ hook).
> 
> (Three not counting the builtin modules.)

Correct, although I'll modify my statement to "two plus the builtins".

> > Further note that the "import" Python statement could be simplified to use
> > only the hook. However, this would require the core importer to inject
> > some module names into the imputil module's namespace (since it couldn't
> > use an import statement until a hook was installed). While this
> > simplification is "neat", it complicates the run-time system (the import
> > statement is broken until a hook is installed).
> 
> Same chicken-or-egg.  We can be pragmatic.
> 
> For a developer, I'd like a bit of robustness (all this makes it
> rather hard to debug a broken imputil, and that's a fair amount of
> code!).

True. I threw that out as an alternative, and then presented the counter
argument :-)

>...
> > Therefore, the core C code must also support importing builtins. "sys" and
> > "imp" are needed by imputil to bootstrap.
> > 
> > The core importer should not need to deal with dynamic-load modules.
> 
> Same question.  Since that all has to be coded in C anyway, why not?

It simplifies the core's import code to not deal with that stuff at all.

> > To support frozen apps, the core importer would need to support loading
> > the three modules as frozen modules.
> 
> I'd like to see a description of how someone like Jim A would build a
> single-file application using the new mechanism.  This could
> completely replace freeze.  (Freeze currently requires a C compiler;
> that's bad.)

The portable mechanism for freezing will always need a compiler. Platform
specific mechanisms (e.g. append to the .EXE, or use the linker to create
a new ELF section) can optimize the freeze process in different ways.

I don't have a design in my head for the freeze issues -- I've been
considering that the mechanism would remain about the same. However, I can
easily see that different platforms may want to use different freeze
processes... hmm...

>...
> > Yes. I don't see this as a requirement, though. We wouldn't start to use
> > these by default, would we? Or insist on zlib being present? I see this as
> > more along the lines of "we have provided a standardized Importer to do
> > this, *provided* you have zlib support."
> 
> Agreed.  Zlib support is easy to get, but there are probably platforms
> where it's not.  (E.g. maybe the Mac?  I suppose that on the Mac,
> there would be some importer classes to import from a resource fork.)

Exactly. And importer classes to load from Win32 resources (modifying a
.EXE's resources post-link is cleaner than the append solution).

>...
> > My outline above does not freeze anything. Everything resides in the
> > filesystem. The C code merely needs a path-scanning loop and functions to
> > import .py*, builtin, and frozen types of modules.
> 
> Good.  Though I think there's also a need for freezing everything.
> And when we go the route of the zip archive, the zip archive handling
> code needs to be somewhere -- frozen seems to be a reasonable choice.

Sure.

> > If somebody nukes their imputil.py or site.py, then they return to Python
> > 1.4 behavior where the core interpreter uses a path for importing (i.e. no
> > packages). They lose dynamically-loaded module support.
> 
> But if the path guessing is also done by site.py (as I propose) the
> path will probably be wrong.  A warning should be printed.

All right. Doesn't Python already print a warning if it can't find
site.py?

> > > Let's first complete the requirements gathering.  Are these
> > > requirements reasonable?  Will they make an implementation too
> > > complex?  Am I missing anything?
> > 
> > I'm not a fan of the compositing due to it requiring a change to semantics
> > that I believe are very useful and very clean. However, I outlined a
> > possible, clean solution to do that (a secondary set of hooks for
> > transforming get_code() return values).
> 
> As you may see from my responses, I'm a big fan of having several
> different sets of hooks.

Yes. However, I've only recognized one so far. Propose more... I'm
confident we can update the PathImporter design to accommodate (and retain
the underlying imputil paradigm).

> I do withdraw the composition requirement
> though.

:-)

>...
> > Once you hit site.py, you have a "full" environment and can easily detect
> > and import a read-eval-print loop module (i.e. why return to Python? just 
> > start things up right there).
> 
> You mean "why return to C?"  I agree.  It would be cool if somehow

Heh. Yah, that's what I meant :-)

> IDLE and Pythonwin would also be bootstrapped using the same
> mechanisms.  (This would also solve the question "which interactive
> environment am I using?" that some modules and apps want to see
> answered because they need to do things differently when run under
> IDLE, for example.)

Haven't thought on this. Should be doable, I'd think.

> > site.py can also install new optimizers as desired, a new Python-based
> > parser or compiler, or whatever...  If Python is built without a parser or
> > compiler (I hope that's an option!), then the three startup modules would
> > simply be frozen into the executable.
> 
> More power to hooks!

:-) You betcha!

I believe my next order of business:

* update PathImporter with the file-extension hook
* dynload C code reorg, per the other email
* create new-model site.py and trash import.c
* review freeze mechanisms and process
* design mechanism for frozen core functionality (e.g. getpath*.c)
  (coding and building design)
* shift core functions to Python, using above design

I'll just plow ahead, but also recognize that any/all may change. i.e. I'll
build examples/finals/prototypes and Guido can pick/choose/reimplement/etc
as needed. I'm out next week, but should start on the above items by the
end of the month (will probably do another mod_dav release in there
somewhere).

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From fredrik at pythonware.com  Fri Dec  3 11:10:10 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 3 Dec 1999 11:10:10 +0100
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com>
Message-ID: <023601bf3d78$0ec3dc30$f29b12c2@secret.pythonware.com>

Jean-Claude Wippler <jcw at equi4.com> wrote:
> This may be off-topic, but has anyone considered what it would take to
> load shared libs out of an archive?

well, we do that in a number of applications.

(lazy installers are really cool... if you've installed works,
you've seen some weird stuff -- for example, when the
application starts the first time, it's loading everything
from inside the installer.  the rest of the installation is
done from within the application itself, using archives
in the installation executable)

I think things like this are better left for the application
designers, though...

</F>


From mal at lemburg.com  Fri Dec  3 11:03:31 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 03 Dec 1999 11:03:31 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>
Message-ID: <38479573.B2CFDD2B@lemburg.com>

Greg Stein wrote:
> 
> On Thu, 2 Dec 1999, M.-A. Lemburg wrote:
> >...
> > Still, I would like to rephrase my 0.02EUR which I already
> > posted twice... why not start to think about what these
> > importers would do first ? If there are only a handful of
> > wishes we could just add them to the builtin machinery and
> > be done with it...
> 
> I'd rather see the builtin machinery move to Python, regardless of what
> system is used and/or what features are added.

In the long run that's probably the right direction, but right now
we are only talking about a very small set of additional features,
which can easily be added to the existing code without too much
fuss.

Plus it won't slow things down, which is important since
Python startup time is already an issue all by itself. The
imputil.py approach of doing (a whole bunch of) recursive Python
function calls to all kinds of importers will not speed this up,
I'm afraid. An on-disk lookup table would speed this up, but
it would also break the current logic in imputil.py, which
puts importer independence above all.

--

IMHO, we should retreat to a more centralized interface,
one which more resembles a manager rather than the agent
interface implemented in imputil.py. Add-ons can then
register themselves to say "hey, I can handle pyz-archives"
or "I know how to import .so modules" or "I provide a
search function which you can call to have me scan
my module container (directory, web-site, archive)".

The manager would take care of what to call and in which
order, plus delegate requests to add-ons which implement
the needed logic, e.g. add-ons for signature checking, unzipping
archives, file system lookup tables, etc.

It could also trace its actions and then keep an on-disk
knowledge base for what it did in the past to find certain
modules under certain conditions.

Anyway, all this is extra magic for some future version of
Python.
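
A minimal sketch of the manager idea, in present-day Python 3. Nothing
here is an existing API: ImportManager, register, and find are
hypothetical names, and the containers are in-memory dictionaries
standing in for real archives and directories. Add-ons register a
predicate saying which containers they can handle, and the manager
decides what to call and in which order, keeping a trace of what it tried:

```python
class ImportManager:
    """Hypothetical central manager; add-ons register what they can handle."""

    def __init__(self):
        self._handlers = []  # (can_handle, fetch) pairs, in registration order
        self.trace = []      # (name, container) pairs the manager has tried

    def register(self, can_handle, fetch):
        self._handlers.append((can_handle, fetch))

    def find(self, name, containers):
        """Ask the first matching add-on for each container, in order."""
        for container in containers:
            for can_handle, fetch in self._handlers:
                if not can_handle(container):
                    continue
                self.trace.append((name, container))
                found = fetch(name, container)
                if found is not None:
                    return found
                break  # this container was handled, but lacks the module
        raise ImportError(f"no add-on could find {name!r}")


# In-memory stand-ins for real storage, so the sketch is self-contained.
ARCHIVES = {"lib.pyz": {"foo": "x = 1"}}
DIRECTORIES = {"site": {"bar": "y = 2"}}

manager = ImportManager()
manager.register(lambda c: c.endswith(".pyz"),
                 lambda name, c: ARCHIVES.get(c, {}).get(name))
manager.register(lambda c: c in DIRECTORIES,
                 lambda name, c: DIRECTORIES[c].get(name))
```

The trace list is where an on-disk knowledge base could hook in; how it
would be persisted is left open here.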

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    28 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido at CNRI.Reston.VA.US  Fri Dec  3 14:45:07 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 03 Dec 1999 08:45:07 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Your message of "Fri, 03 Dec 1999 11:03:31 +0100."
             <38479573.B2CFDD2B@lemburg.com> 
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>  
            <38479573.B2CFDD2B@lemburg.com> 
Message-ID: <199912031345.IAA16376@eric.cnri.reston.va.us>

[Greg]
> > I'd rather see the builtin machinery move to Python, regardless of what
> > system is used and/or what features are added.

[Marc]
> In the long run that's probably the right direction, but right now
> we are only talking about a very small set of additional features,
> which can easily be added to the existing code without too much
> fuss.

I disagree.  We should do the redesign right rather than tweaking the
existing code.

> Plus it won't slow things down, which is important since
> Python startup time is already an issue all by itself. The
> imputil.py approach of doing (a whole bunch of) recursive Python
> function calls to all kinds of importers will not speed this up,
> I'm afraid. An on-disk lookup table would speed this up, but
> it would also break the current logic in imputil.py, which
> puts importer independence above all.

I don't care about the current logic in imputil.  It's only a prototype!

> IMHO, we should retreat to a more centralized interface,
> one which more resembles a manager rather than the agent
> interface implemented in imputil.py. Add-ons can then
> register themselves to say "hey, I can handle pyz-archives"
> or "I know how to import .so modules" or "I provide a
> search function which you can call to have me scan
> my module container (directory, web-site, archive)".

This makes sense.

> The manager would take care of what to call and in which
> order, plus delegate requests to add-ons which implement
> the needed logic, e.g. add-ons for signature checking, unzipping
> archives, file system lookup tables, etc.
> 
> It could also trace its actions and then keep an on-disk
> knowledge base for what it did in the past to find certain
> modules under certain conditions.
> 
> Anyway, all this is extra magic for some future version of
> Python.

I would say the manager API design and a basic set of specific
handlers should go into 1.6.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik at pythonware.com  Fri Dec  3 15:14:00 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 3 Dec 1999 15:14:00 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>             <38479573.B2CFDD2B@lemburg.com>  <199912031345.IAA16376@eric.cnri.reston.va.us>
Message-ID: <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com>

MAL wrote:
> > IMHO, we should retreat to a more centralized interface,
> > one which more resembles a manager rather than the agent
> > interface implemented in imputil.py. Add-ons can then
> > register themselves to say "hey, I can handle pyz-archives"
> > or "I know how to import .so modules" or "I provide a
> > search function which you can call to have me scan
> > my module container (directory, web-site, archive)".

but why?  in my small-minded view of how python
works, an importer carries out a very simple task:

    given a name, check if you have a
    module with that name, and install
    it.  if you cannot, fail (in which case
    python asks the next importer along
    the path).

why do you have to complicate things beyond that?
why not just let Python provide a few base classes
and mixins for people who want to create custom
importers, and be done with it?
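
The protocol described above fits in a few lines. This is a sketch in
present-day Python 3 under stated assumptions: DictImporter and
import_via_chain are hypothetical names, and module source comes from an
in-memory dictionary so the example runs on its own:

```python
import sys
import types


class DictImporter:
    """Hypothetical importer backed by a {name: source} dictionary."""

    def __init__(self, sources):
        self.sources = sources

    def import_module(self, name):
        # Given a name, check whether we have it; if so, install it.
        source = self.sources.get(name)
        if source is None:
            return None  # fail, so the caller can ask the next importer
        module = types.ModuleType(name)
        exec(source, module.__dict__)
        sys.modules[name] = module
        return module


def import_via_chain(name, importers):
    """Ask each importer along the chain in turn; the first success wins."""
    if name in sys.modules:
        return sys.modules[name]
    for importer in importers:
        module = importer.import_module(name)
        if module is not None:
            return module
    raise ImportError(f"no importer could provide {name!r}")
```

Custom storage would then mean subclassing or replacing DictImporter,
without touching the chain itself.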

rationale, please.

</F>


From jim at interet.com  Fri Dec  3 15:34:40 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 03 Dec 1999 09:34:40 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org> <38479573.B2CFDD2B@lemburg.com>
Message-ID: <3847D500.53833D06@interet.com>

"M.-A. Lemburg" wrote:
> 
> Greg Stein wrote:

> > I'd rather see the builtin machinery move to Python, regardless of what
> > system is used and/or what features are added.
> 
> In the long run that's probably the right direction, but right now
> we are only talking about a very small set of additional features,
> which can easily be added to the existing code without too much
> fuss.

I volunteer to write a Python archive in either Python or C.  In
fact I currently have prototypes for both.  But I have to agree
with Greg here.  I think a Python importer is the way to go.  The
C code is 300 lines mostly in import.c and parallel to existing code.
The Python archive is about 100 lines and is prettier, easy to read,
alter and re-use (obviously).

> Plus it won't slow things down, which is important since
> Python startup time is already an issue all by itself. The

I think archive files should be able to be fast, and should
help, not hurt, startup time, provided that the use of sys.path
is curtailed, os.readdir() is not needed, and the
specifications are not complicated.

Although archive files are my special concern, I realize that
imputil is not just about archives.

JimA


From guido at CNRI.Reston.VA.US  Fri Dec  3 15:39:25 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 03 Dec 1999 09:39:25 -0500
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: Your message of "Thu, 02 Dec 1999 19:19:40 PST."
             <Pine.LNX.4.10.9912021654100.18529-100000@nebula.lyra.org> 
References: <Pine.LNX.4.10.9912021654100.18529-100000@nebula.lyra.org> 
Message-ID: <199912031439.JAA16524@eric.cnri.reston.va.us>

Greg,

Great response.  I think we know where we each stand.  Please go ahead
with a new design.  (That's trust, not carte blanche.)

Just one thought: the more I think about it, the less I like
sys.importers: functionality which is implemented through
sys.importers must necessarily be placed either in front of all of
sys.path or after it.  While this is helpful for "canned" apps that
want *everything* to be imported from a fixed archive, I think that
for regular Python installations sys.path should remain the point of
attack.  In particular, installing a new package (e.g. PIL) should
affect sys.path, regardless of the way of delivery of the modules
(shared libs, .py files, .pyc files, or a zip archive).
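
To make the "sys.path stays the point of attack" idea concrete, here is
a small sketch in present-day Python 3 in which each path entry may be
either a directory or a zip archive, and the lookup walks the entries in
order. find_source is a hypothetical name, only .py source is handled,
and the "archive(entry)" label returned for archive hits is an assumed
convention, not a defined one:

```python
import os
import zipfile


def find_source(name, path):
    """Return (location, source text) for module `name`, or None.

    Each entry of `path` is tried in order; it may be a directory
    or a zip archive holding "<name>.py" at its top level.
    """
    filename = name + ".py"
    for entry in path:
        if os.path.isdir(entry):
            candidate = os.path.join(entry, filename)
            if os.path.isfile(candidate):
                with open(candidate, encoding="utf-8") as f:
                    return candidate, f.read()
        elif os.path.isfile(entry) and zipfile.is_zipfile(entry):
            with zipfile.ZipFile(entry) as archive:
                if filename in archive.namelist():
                    text = archive.read(filename).decode("utf-8")
                    # Assumed labelling convention: "archive(entry)".
                    return f"{entry}({filename})", text
    return None
```

With this shape, installing a package only has to add one entry to the
path, whatever the delivery format of its modules.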

I'm not too worried about code that inspects sys.path and expects
certain invariants; that code is most likely interfering with the
import mechanism so should be revisited anyway.

On the lone .pyc issue: I'd like to see this disappear when using the
filesystem, I see no use for it there if we support .pyc files in zip
archives.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim at interet.com  Fri Dec  3 15:44:54 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 03 Dec 1999 09:44:54 -0500
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com>
Message-ID: <3847D766.1E5FFAF3@interet.com>

Jean-Claude Wippler wrote:
> 
> Guido van Rossum wrote:
> 
> [...]
> > Note that the interpretation of __file__ could be problematic.  To
> > what value do you set __file__ for a module loaded from a zip archive?
> 
> Makefiles use "archive(entry)" (this also supports nesting if needed).

I discovered the hard way this entry is not optional.  I just
used the archive file name for __file__.

> This may be off-topic, but has anyone considered what it would take to
> load shared libs out of an archive?  One way is to extract on-the-fly to
> a temporary area.  A refinement is to leave extracted files there as
> cache, and perhaps even to extract to a file with a name derived from
> its MD5 digest (this way multiple users and even Python installations
> can share the cache).  Would it be useful to define a "standard" area?

IMHO putting shared libs in an archive is a bad idea because the OS
cannot use them there.  They must be extracted as you say.  But then
storage is wasted by using space in the archive and the external file.
Deleting them after use wastes time.  Better to leave them out of the
archive and provide for them in the installer.  IMHO the
archive is a basic simple feature, and people make installers on top
of that.  Archives shouldn't try to do it all.
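
For reference, the extract-to-cache idea quoted above (a file named after
its MD5 digest, left in place for reuse) could be sketched like this in
present-day Python 3. extract_cached is a hypothetical name, the archive
is assumed to be a zip file, and the cache directory is whatever the
caller passes in, since no "standard" area has been defined:

```python
import hashlib
import os
import zipfile


def extract_cached(archive_path, member, cache_dir):
    """Extract `member` into `cache_dir` under a digest-derived name.

    Identical content always maps to the same file, so repeated calls
    (or other installations sharing the cache) reuse the existing copy.
    """
    with zipfile.ZipFile(archive_path) as archive:
        data = archive.read(member)
    digest = hashlib.md5(data).hexdigest()
    suffix = os.path.splitext(member)[1]  # keep ".so", ".dll", ...
    target = os.path.join(cache_dir, digest + suffix)
    if not os.path.exists(target):
        os.makedirs(cache_dir, exist_ok=True)
        # Write to a temporary name first, then rename into place, so a
        # concurrent reader never sees a partially written file.
        tmp = f"{target}.tmp{os.getpid()}"
        with open(tmp, "wb") as f:
            f.write(data)
        os.replace(tmp, target)
    return target
```

This does not answer the storage objection raised above: the library
still exists twice, once in the archive and once in the cache.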

JimA


From mal at lemburg.com  Fri Dec  3 15:14:09 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 03 Dec 1999 15:14:09 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>  
	            <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us>
Message-ID: <3847D030.2C936E24@lemburg.com>

Guido van Rossum wrote:
> 
> [Greg]
> > > I'd rather see the builtin machinery move to Python, regardless of what
> > > system is used and/or what features are added.
> 
> [Marc]
> > In the long run that's probably the right direction, but right now
> > we are only talking about a very small set of additional features,
> > which can easily be added to the existing code without too much
> > fuss.
> 
> I disagree.  We should do the redesign right rather than tweaking the
> existing code.

Ok, then...
 
> > IMHO, we should retreat to a more centralized interface,
> > one which more resembles a manager rather than the agent
> > interface implemented in imputil.py. Add-ons can then
> > register themselves to say "hey, I can handle pyz-archives"
> > or "I know how to import .so modules" or "I provide a
> > search function which you can call to have me scan
> > my module container (directory, web-site, archive)".
> 
> This makes sense.
> 
> > The manager would take care of what to call and in which
> > order, plus delegate requests to add-ons which implement
> > the needed logic, e.g. add-ons for signature checking, unzipping
> > archives, file system lookup tables, etc.
> >
> > It could also trace its actions and then keep an on-disk
> > knowledge base for what it did in the past to find certain
> > modules under certain conditions.
> >
> > Anyway, all this is extra magic for some future version of
> > Python.
> 
> I would say the manager API design and a basic set of specific
> handlers should go into 1.6.

BTW, is there a timeline for the 1.6 release ? I mean which
things will have to be in 1.6 ?

Some recent topics as hints:

1. Unicode
2. Import Manager API + default handlers
3. Python style coercion at C type level
4. Rich comparisons
5. __doc__ string extraction tool

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    28 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal at lemburg.com  Fri Dec  3 15:24:04 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 03 Dec 1999 15:24:04 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>             <38479573.B2CFDD2B@lemburg.com>  <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com>
Message-ID: <3847D284.8CBF2A9C@lemburg.com>

Fredrik Lundh wrote:
> 
> MAL wrote:
> > > IMHO, we should retreat to a more centralized interface,
> > > one which more resembles a manager rather than the agent
> > > interface implemented in imputil.py. Add-ons can then
> > > register themselves to say "hey, I can handle pyz-archives"
> > > or "I know how to import .so modules" or "I provide a
> > > search function which you can call to have me scan
> > > my module container (directory, web-site, archive)".
> 
> but why?  in my small-minded view of how python
> works, an importer carries out a very simple task:
> 
>     given a name, check if you have a
>     module with that name, and install
>     it.  if you cannot, fail (in which case
>     python asks the next importer along
>     the path).
> 
> why do you have to complicate things beyond that?
> why not just let Python provide a few base classes
> and mixins for people who want to create custom
> importers, and be done with it?

Because importing in Python has become *much* more
complicated over time. There are requests for new
features which touch subjects such as storage mechanisms,
lookups, signatures (for trusted code), lazy imports, etc.

A chain of simple minded importers won't work together
too well, duplicate work and downgrade performance
considerably due to the many recursive function calls.
Also, centralized caching strategies are hard to implement
across import handlers.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    28 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From jeremy at cnri.reston.va.us  Fri Dec  3 17:47:54 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Fri, 3 Dec 1999 11:47:54 -0500 (EST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <14406.58137.359127.921135@weyr.cnri.reston.va.us>
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org>
	<199912022043.PAA15108@eric.cnri.reston.va.us>
	<14406.58137.359127.921135@weyr.cnri.reston.va.us>
Message-ID: <14407.62522.360386.757519@goon.cnri.reston.va.us>

>>>>> "FLD" == Fred L Drake, <fdrake at acm.org> writes:

  >> (Unrelated remark: I should really try to release the set of
  >> modules we've written here at CNRI to deal with zip files.
  >> Unfortunately zip files are hairy and so is our code.)

  FLD>   It doesn't help that that code just plain stinks.  I maintain
  FLD> that no one here understands the whole of it.

I'm all for improving the code and getting it out.  The real problem
is that interfaces have been glommed on for every new use of a Zip
file.  (You want to read one off a socket and extract files before
you've got the whole thing?  No problem! Add a new class.)  We need to
figure out the common patterns for using the archives and write a new
set of interfaces to support that.

Jeremy


From guido at CNRI.Reston.VA.US  Fri Dec  3 18:12:07 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 03 Dec 1999 12:12:07 -0500
Subject: [Python-Dev] What to do with our Zip code?
In-Reply-To: Your message of "Fri, 03 Dec 1999 11:47:54 EST."
             <14407.62522.360386.757519@goon.cnri.reston.va.us> 
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> <199912022043.PAA15108@eric.cnri.reston.va.us> <14406.58137.359127.921135@weyr.cnri.reston.va.us>  
            <14407.62522.360386.757519@goon.cnri.reston.va.us> 
Message-ID: <199912031712.MAA17061@eric.cnri.reston.va.us>

[Jeremy, on our Zip code]
> I'm all for improving the code and getting it out.  The real problem
> is that interfaces have been glommed on for every new use of a Zip
> file.  (You want to read one off a socket and extract files before
> you've got the whole thing?  No problem! Add a new class.)  We need to
> figure out the common patterns for using the archives and write a new
> set of interfaces to support that.

If we gave you the code we currently have, would someone else in this
forum be willing to redesign it?  Eventually it would become part of
the Python distribution.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik at pythonware.com  Sat Dec  4 10:54:30 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat, 4 Dec 1999 10:54:30 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>             <38479573.B2CFDD2B@lemburg.com>  <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com>
Message-ID: <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com>

M.-A. Lemburg <mal at lemburg.com> wrote:
> >     given a name, check if you have a
> >     module with that name, and install
> >     it.  if you cannot, fail (in which case
> >     python asks the next importer along
> >     the path).
> > 
> > why do you have to complicate things beyond that?
> > why not just let Python provide a few base classes
> > and mixins for people who want to create custom
> > importers, and be done with it?
> 
> Because importing in Python has become *much* more
> complicated over time. There are requests for new
> features which touch subjects such as storage mechanisms,
> lookups, signatures (for trusted code), lazy imports, etc.

sorry, I still don't understand it.  our applications already
use different storage mechanisms, databases, signatures,
lazy importing, version handling, etc, etc.  now, if *we*
have managed to build all that on top of an old version
of imputil.py, how come it's not sufficient for the rest
of you?

> A chain of simple minded importers won't work together
> too well

why?  it sure works for us...

> duplicate work

avoiding duplicate work is what object oriented design
is all about.  and last time I checked, Python had excellent
support for that.

> and downgrade performance considerably due to the
> many recursive function calls

now that's what I call premature optimization.  and this
scares the hell out of me: if the rest of the python-dev
crowd don't seriously believe that Python is (or can be
made) fast enough to implement things like this, why
the heck are you using Python at all?  am I the only
one here who doesn't believe in ousterhout's talk about
"the great system vs. scripting language divide"?

</F>


From fredrik at pythonware.com  Sat Dec  4 10:54:42 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat, 4 Dec 1999 10:54:42 +0100
Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT]
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com> <3847D766.1E5FFAF3@interet.com>
Message-ID: <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com>

James C. Ahlstrom <jim at interet.com> wrote:
> IMHO putting shared libs in an archive is a bad idea because the OS
> can not use them there.  They must be extracted as you say.  But then
> storage is wasted by using space in the archive and the external file.
> Deleting them after use wastes time.  Better to leave them out of the
> archive and provide for them in the installer.  IMHO the
> archive is a basic simple feature, and people make installers on top
> of that.  Archives shouldn't try to do it all.

have you tried it?  if not, why do you think you should
be allowed to forbid others from doing it?

in "the inmates are running the asylum", alan cooper
points out that the *major* reason people all over the
world love web applications is that there are no
bloody installers.  and here you are advocating that
we all should be forced to use installers, when python
makes it trivial to write self-installing apps. double-argh!

(on the other hand, why do I complain? all pythonworks
customers are going to be able to do all this anyway...).

<rant size="major">

frankly, this "design by committee" (or is it "design by
people who've never even been close to implementing
something because they thought it was too hard, and
thus think they're qualified to argue against those of
us who didn't even realize that it was a hard problem"?)
trend I've been seeing in all kinds of python forums
makes me sooooo sad.  the more of this I see (dist-
utils-sig, doc-sig, here, c.l.python), the sadder I get,
and the more I sympathise with John Skaller who's
defining his own python-like universe...

if someone needs me, I'll be down in the pub having
a beer with the mad scientist, the shiny eff-bot, and
mr. nitpicker.  if we're not there, you'll find us in the
lab, working on new string matching facilities for 1.6,
SOAP [1], tkinter replacements for the masses, and
whatever else we can come up with...  see you!

</rant>

1) http://www.newsalert.com/bin/story?StoryId=Coenz0bWbu0znmdKXqq


From gstein at lyra.org  Sat Dec  4 11:42:27 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 02:42:27 -0800 (PST)
Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order)
In-Reply-To: <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com>
Message-ID: <Pine.LNX.4.10.9912040232240.18529-100000@nebula.lyra.org>

On Sat, 4 Dec 1999, Fredrik Lundh wrote:
> M.-A. Lemburg <mal at lemburg.com> wrote:
>...
> > Because importing in Python has become *much* more
> > complicated over time. There are requests for new
> > features which touch subjects such as storage mechanisms,
> > lookups, signatures (for trusted code), lazy imports, etc.
> 
> sorry, I still don't understand it.  our applications already
> use different storage mechanisms, databases, signatures,
> lazy importing, version handling, etc, etc.  now, if *we*
> have managed to build all that on top of an old version
> of imputil.py, how come it's not sufficient for the rest
> of you?

I agree. The imputil mechanism has been proven in combat to work for many
scenarios. I have not (yet) heard of a case where the model has proven
insufficient.

> > A chain of simple minded importers won't work together
> > too well
> 
> why?  it sure works for us...

Exactly. "Why?" Please provide an example.

>...
> > and downgrade performance considerably due to the
> > many recursive function calls
> 
> now that's what I call premature optimization.  and this
> scares the hell out of me: if the rest of the python-dev
> crowd don't seriously believe that Python is (or can be
> made) fast enough to implement things like this, why
> the heck are you using Python at all?  am I the only
> one here who doesn't believe in ousterhout's talk about
> "the great system vs. scripting language divide"?

Don't worry Fredrik... I'm with you on this one. I do not believe there is
a problem with the speed. Nobody has yet profiled imputil to find out
where/how the time is being spent. Nobody has tried to speed it up.
Therefore, any claims about its performance are simply FUD.

I claim that its interface is correct, and you (Fredrik) stated it well:
"given a name, please give me a module if you can (otherwise None)."

Underneath that semantic, there are a lot of things that can be done to
alter the performance and organization. Claims about speed are entirely
premature.
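
To make that one-line contract concrete, here is a minimal sketch in
present-day Python. It is not imputil's actual code: DictImporter,
get_module and import_from_chain are hypothetical names chosen for
illustration, and the "storage" is just an in-memory dictionary.

```python
import types


class DictImporter:
    """Hypothetical importer: serves modules from a {name: source} dict."""

    def __init__(self, sources):
        self.sources = sources

    def get_module(self, name):
        # The whole contract: return a module if we can, otherwise None.
        source = self.sources.get(name)
        if source is None:
            return None
        module = types.ModuleType(name)
        exec(compile(source, "<%s>" % name, "exec"), module.__dict__)
        return module


def import_from_chain(name, importers):
    """Ask each importer in turn; the first non-None answer wins."""
    for importer in importers:
        module = importer.get_module(name)
        if module is not None:
            return module
    raise ImportError("no importer could supply %r" % name)


chain = [DictImporter({"spam": "x = 1"}), DictImporter({"eggs": "y = 2"})]
eggs = import_from_chain("eggs", chain)  # first importer says None, second delivers
```

Everything about storage, caching or signatures stays hidden behind
get_module(); the chain itself only needs to understand "module or None".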

Yes, I'm biased. But, in truth, I haven't seen a better mechanism yet.
I've tossed out a few ideas on how imputil could be improved (which are
solely based on guess, rather than empirical evidence of profiling
output). When those changes are completed and there is still an issue,
then I'll admit defeat and wait for somebody else to provide a new design.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From marangoz at python.inrialpes.fr  Sat Dec  4 12:15:53 1999
From: marangoz at python.inrialpes.fr (Vladimir Marangozov)
Date: Sat, 4 Dec 1999 12:15:53 +0100 (CET)
Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT]
In-Reply-To: <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com> from "Fredrik Lundh" at Dec 04, 1999 10:54:42 AM
Message-ID: <199912041115.MAA00539@python.inrialpes.fr>

Fredrik Lundh wrote:
> 
[snip]
> 
> <rant size="major">
> 
> frankly, this "design by committee"...
[snip]
> ...  see you!
> 
> </rant>
> 

C'mon /F, it's a battle of ideas and that's the way it works before
filtering the good ones from the bad ones, then focusing on the
appropriate implementation.

I'm in sync with the discussion, although I haven't posted my partial
notes on it due to lack of time. But let me say that overall, this
discussion is a good thing and the more opinions we get, the better.

BTW, you just _can't_ leave like this and start playing solitaire at
the bar, first, because we need beer too and it's unlikely that you'll
find a bar we don't know already, and second, because it was you who
revived this discussion with 1 word, repeated 3 times:

> Subject: Re: [Python-Dev] Python 1.6 status
> Date: Wed, 17 Nov 1999 12:46:01 +0100
> 
> Guido van Rossum <guido at CNRI.Reston.VA.US> wrote:
> > - suggestions for new issues that maybe ought to be settled in 1.6
> 
> three things: imputil, imputil, imputil
> 
> </F>

Thus, with no visible argumentation (so don't shoot at others when they
argue instead of you), and with this one word, you pushed Guido to the
extreme of suggesting a complete redesign of the import machinery from
scratch, based on a "Grand Architecture" :-). Right? -- Right!

This is a fact and a fair amount of the credit goes entirely to you!

Since then, however, I haven't really seen your arguments, and I believe
that nobody here got exactly your point. I, for one, may well argue
against imputil as being just another brick on top of the grand mess.
But because I haven't made the time to write up my notes properly, I don't
dare to express a partial opinion, nor blame those who argue well or
badly in the meantime, while I'm silent.

So, why are you showing us your back when you have clearly something
to say, but like me, you haven't made the time to say it?  Please don't
waste my time with emotional rants ;-). Everybody here tries to contribute
according to their knowledge, experience and availability.

Later,
-- 
       Vladimir MARANGOZOV          | Vladimir.Marangozov at inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252


From mal at lemburg.com  Sat Dec  4 11:45:52 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 04 Dec 1999 11:45:52 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>             <38479573.B2CFDD2B@lemburg.com>  <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com>
Message-ID: <3848F0E0.B8132AD2@lemburg.com>

Fredrik Lundh wrote:
> 
> M.-A. Lemburg <mal at lemburg.com> wrote:
> > >     given a name, check if you have a
> > >     module with that name, and install
> > >     it.  if you cannot, fail (in which case
> > >     python asks the next importer along
> > >     the path).
> > >
> > > why do you have to complicate things beyond that?
> > > why not just let Python provide a few base classes
> > > and mixins for people who want to create custom
> > > importers, and be done with it?
> >
> > Because importing in Python has become *much* more
> > complicated over time. There are requests for new
> > features which touch subjects such as storage mechanisms,
> > lookups, signatures (for trusted code), lazy imports, etc.
> 
> sorry, I still don't understand it.  our applications already
> use different storage mechanisms, databases, signatures,
> lazy importing, version handling, etc, etc.  now, if *we*
> have managed to build all that on top of an old version
> of imputil.py, how come it's not sufficient for the rest
> of you?

I've tried to get (an older) imputil.py version up and running
too. It did work, but only after some considerable tweaking
and even with integrated cache mechanisms did not reach
the performance of the builtin importer (which doesn't
use the kinds of caching strategies I had built into
imputil.py). Getting the whole setup to work wasn't easy
at all, because of the way imputil importers delegate work
and things get even more confusing when it starts to "take
over" certain parts of packages by installing themselves
as importers for a particular package.
 
> > A chain of simple minded importers won't work together
> > too well
> 
> why?  it sure works for us...

An example: 

A path importer knows how to scan directories and how to use
a path to tell the correct order. It can maybe also import
.py/.pyc/.pyo files. Now what happens if it finds a shared
lib as module... the usual imputil way would be to delegate
the request to some other importer which can handle shared
libs... but wait: how does the shared lib importer know
where to look ? It will have to rescan the directories,
etc...
 
> > duplicate work
> 
> avoiding duplicate work is what object oriented design
> is all about.  and last time I checked, Python had excellent
> support for that.

See my example above.

The agent approach used by imputil does not support
OO design too well: even though you can avoid duplicate
programming work on the importers by using a few
base classes which implement dir scans, shared lib
imports, etc. the imputil design does not provide
means to avoid duplicate actions taken by the importers.
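
The duplicate-scan concern can be illustrated with a small sketch in
present-day Python. It shows one possible remedy, a directory-listing
cache shared by two otherwise independent lookups; it is not something
imputil provides, and DirCache and find are hypothetical names.

```python
import os
import tempfile


class DirCache:
    """Hypothetical shared cache: each directory is listed at most once."""

    def __init__(self):
        self.listings = {}
        self.scans = 0  # counts real os.listdir() calls

    def listdir(self, path):
        if path not in self.listings:
            self.scans += 1
            self.listings[path] = set(os.listdir(path))
        return self.listings[path]


def find(name, suffixes, path, cache):
    """Return the first directory/name+suffix present, in path order, else None."""
    for directory in path:
        entries = cache.listdir(directory)
        for suffix in suffixes:
            if name + suffix in entries:
                return os.path.join(directory, name + suffix)
    return None


with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "fast.so"), "w").close()
    cache = DirCache()
    # A "source" importer looks first and fails; a "shared lib" importer follows.
    source_hit = find("fast", (".py", ".pyc"), [d], cache)
    shlib_hit = find("fast", (".so",), [d], cache)
    scans = cache.scans  # the second lookup reused the first one's listing
```

The price of such a cache is staleness: a file added to the directory
after the first scan is invisible until the cache is dropped.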

> > and downgrade performance considerably due to the
> > many recursive function calls
> 
> now that's what I call premature optimization.  and this
> scares the hell out of me: if the rest of the python-dev
> crowd don't seriously believe that Python is (or can be
> made) fast enough to implement things like this, why
> the heck are you using Python at all?  am I the only
> one here who doesn't believe in ousterhout's talk about
> "the great system vs. scripting language divide"?

Looks like you are in ranting mode here ;-) Seriously,
I've checked my imputil.py version (with caches enabled)
against the builtin importer and noticed a performance
downgrade by a factor of >2. This was enough to convince me
to look for other techniques to handle the problems
I had at the time... you know, relative imports and things.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    27 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal at lemburg.com  Sat Dec  4 12:04:15 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 04 Dec 1999 12:04:15 +0100
Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT]
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com> <3847D766.1E5FFAF3@interet.com> <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com>
Message-ID: <3848F52F.5F5B748F@lemburg.com>

Fredrik Lundh wrote:
> 
> <rant size="major">
> 
> frankly, this "design by committee" (or is it "design by
> people who've never even been close to implementing
> something because they thought it was too hard, and
> thus think they're qualified to argue against those of
> us who didn't even realize that it was a hard problem"?)

Huh ? Two points:

1. How can you be sure that people haven't tried
   implementing their ideas and for various reasons
   have come to some conclusion about those ideas ?

2. Would you seriously disqualify people from joining a
   discussion by the simple argument that they
   have not implemented anything yet ?

Just take the Unicode discussion as example: it was
very lively and resulted in a decent proposal which
is now subject to further investigation by the
implementors ;-) Many people have joined in even though
they did not and/or will not implement anything. Still,
their arguments were very useful to show up weaknesses
in the proposal.

Now, let's rather have a beer in the pub around the corner
than go on ranting :-).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    27 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal at lemburg.com  Sat Dec  4 12:53:33 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 04 Dec 1999 12:53:33 +0100
Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order)
References: <Pine.LNX.4.10.9912040232240.18529-100000@nebula.lyra.org>
Message-ID: <384900BD.D16E72BC@lemburg.com>

Greg Stein wrote:
> > > [me:]
> > > A chain of simple minded importers won't work together
> > > too well
> >
> > why?  it sure works for us...
> 
> Exactly. "Why?" Please provide an example.

See my reply to Fredrik.
 
> >...
> > > and downgrade performance considerably due to the
> > > many recursive function calls
> >
> > now that's what I call premature optimization.  and this
> > scares the hell out of me: if the rest of the python-dev
> > crowd don't seriously believe that Python is (or can be
> > made) fast enough to implement things like this, why
> > the heck are you using Python at all?  am I the only
> > one here who doesn't believe in ousterhout's talk about
> > "the great system vs. scripting language divide"?
> 
> Don't worry Fredrik... I'm with you on this one. I do not believe there is
> a problem with the speed. Nobody has yet profiled imputil to find out
> where/how the time is being spent. Nobody has tried to speed it up.

Sorry, Greg, but that is simply not true. I've spent a few
days on trying to get more performance out of it and have
succeeded, but in the end it wasn't enough to convince me
of the approach.

> Therefore, any claims about its performance are simply FUD.

BTW, did anybody mention that an import manager wouldn't
be able to provide an API which is useable for imputil
style importers ? I'm not arguing against the possibility
to use imputil style importers, just against making it the
sole method of adding wisdom to Python imports.

The imputil importers could well benefit from a manager
providing logic to do basic things like importing
shared libs, checking signatures, downloading modules
from the web, etc.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    27 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From gstein at lyra.org  Sat Dec  4 13:15:13 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 04:15:13 -0800 (PST)
Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order)
In-Reply-To: <384900BD.D16E72BC@lemburg.com>
Message-ID: <Pine.LNX.4.10.9912040402120.18529-100000@nebula.lyra.org>

On Sat, 4 Dec 1999, M.-A. Lemburg wrote:
>...
> > Don't worry Fredrik... I'm with you on this one. I do not believe there is
> > a problem with the speed. Nobody has yet profiled imputil to find out
> > where/how the time is being spent. Nobody has tried to speed it up.
> 
> Sorry, Greg, but that is simply not true. I've spent a few
> days on trying to get more performance out of it and have
> succeeded, but in the end it wasn't enough to convince me
> of the approach.

You sent me your changes... I don't believe that you were aggressive
enough. As I've mentioned before, I think it is quite possible to retain
the general Importer style and get_code() interface, but to shift some
functionality out (to be computed once) to a higher-level mechanism. The
patches that you sent me did not do this, so I'm not surprised that you
hit a wall.

Ack. See? Now I'm getting into discussions about performance and
implementation without truly knowing where the timing is spent. Eyeballing
it, I have an idea, but it would be best to see a profile output. My
mantra is always "90% of the time you're wrong about where 90% of the time
is being spent."
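
For reference, getting such a profile for a single import can be
sketched as below. This assumes a present-day Python with cProfile and
importlib (neither existed in 1.5.2); profile_import is a hypothetical
helper, and colorsys is only a small stand-in module.

```python
import cProfile
import importlib
import io
import pstats
import sys


def profile_import(module_name):
    """Import module_name under cProfile and return the report as text."""
    sys.modules.pop(module_name, None)  # force a real import, not a cache hit
    profiler = cProfile.Profile()
    profiler.enable()
    importlib.import_module(module_name)
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(10)  # top 10 by cumulative time
    return out.getvalue()


report = profile_import("colorsys")
```

Sorting by cumulative time shows which layer of the import machinery
the time is spent under, which is the question eyeballing cannot answer.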

I am unconcerned about performance, but will work on it so that I don't
need to continue this conversation. That burden is on me.

> > Therefore, any claims about its performance are simply FUD.
> 
> BTW, did anybody mention that an import manager wouldn't
> be able to provide an API which is useable for imputil
> style importers ? I'm not arguing against the possibility
> to use imputil style importers, just against making it the
> sole method of adding wisdom to Python imports.

Since the core will delegate out to Python (note: current working theory),
then it certainly is not the "sole method" (since you can just replace the
Python code). But there must be a default mechanism.

The ihooks stuff was too complicated. imputil seems to be much easier. I'd
love to see a third mechanism.... so I can steal ideas :-)

> The imputil importers could well benefit from a manager
> providing logic to do basic things like importing
> shared libs, checking signatures, downloading modules
> from the web, etc.

For shared libs, yes. For the others: geez... I don't want to see that in
the core infrastructure. Shift that out to specialized Importers. The
infrastructure ought to be teeny and agnostic about how to map a module
name to a module.


Side note to python-dev people: I apologize... I realize that I'm
beginning to get a bit defensive here. I'm going to be at XML '99 until
Friday, so that should give me a breather. When I get back, I'll skip the
talk and do some code.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein at lyra.org  Sat Dec  4 13:32:04 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 04:32:04 -0800 (PST)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <3848F0E0.B8132AD2@lemburg.com>
Message-ID: <Pine.LNX.4.10.9912040416220.18529-100000@nebula.lyra.org>

On Sat, 4 Dec 1999, M.-A. Lemburg wrote:
> Fredrik Lundh wrote:
>...
> > sorry, I still don't understand it.  our applications already
> > use different storage mechanisms, databases, signatures,
> > lazy importing, version handling, etc, etc.  now, if *we*
> > have managed to build all that on top of an old version
> > of imputil.py, how come it's not sufficient for the rest
> > of you?
> 
> I've tried to get (an older) imputil.py version up and running
> too. It did work, but only after some considerable tweaking
> and even with integrated cache mechanisms did not reach
> the performance of the builtin importer (which doesn't
> use the kinds of caching strategies I had built into
> imputil.py).

1) yes, it was an older version and did not have the PathImporter class.
   As a by-product, the DirectoryImporters that it *did* have were much
   slower. It still did not support builtins, frozen modules, or dynamic
   loads. All of that is present now, so it works "out of the box" much
   better.

2) Performance: as I wrote in the other email, I don't believe that is an
   argument against the design. The imputil approach *will* be slower than
   the current Python mechanism, but there is some more coding to do to
   truly see how much. The side benefits (e.g. ZipImporter and caching)
   may outweigh the result. Time will tell.

> Getting the whole setup to work wasn't easy
> at all, because of the way imputil importers delegate work
> and things get even more confusing when it starts to "take
> over" certain parts of packages by installing themselves
> as importers for a particular package.

I don't understand this. If it is relevant, then please expand. Thx.

> > > A chain of simple minded importers won't work together
> > > too well
> > 
> > why?  it sure works for us...
> 
> An example: 
> 
> A path importer knows how to scan directories and how to use
> a path to tell the correct order. It can maybe also import
> .py/.pyc/.pyo files. Now what happens if it finds a shared
> lib as module... the usual imputil way would be to delegate
> the request to some other importer which can handle shared
> libs... but wait: how does the shared lib importer know
> where to look ? It will have to rescan the directories,
> etc...

No, the "usual imputil way" is that the PathImporter understands searching
a path and loading stuff from that path. An Importer is a combination of
locating and loading (since they are, typically, tightly bound). The next
rev will allow user-plugging of support for new file types.

> > > duplicate work
> > 
> > avoiding duplicate work is what object oriented design
> > is all about.  and last time I checked, Python had excellent
> > support for that.
> 
> See my example above.
> 
> The agent approach used by imputil does not support
> OO design too well: even though you can avoid duplicate
> programming work on the importers by using a few
> base classes which implement dir scans, shared lib
> imports, etc. the imputil design does not provide
> means to avoid duplicate actions taken by the importers.

There is always a balance to be struck between independence and coupling.
I chose to reduce coupling and increase independence. If you shift a bunch
of stuff out of the Importers, then you will increase the coupling between
the imputil framework and the Importers. That coupling will then close off
future possibilities.

Within the framework itself (e.g. between _import_hook and get_code),
there is a lot of opportunity for change. Since that is behind the covers,
it is no big deal to shift functionality around. I plan to do so.

>...
> Looks like you are in ranting mode here ;-) Seriously,
> I've checked my imputil.py version (with caches enabled)
> against the builtin importer and noticed a performance
> downgrade by a factor of >2. This was enough to convince me
> to look for other techniques to handle the problems
> I had at the time... you know, relative imports and things.

I have run a long series of tests. Without doing any performance work on
imputil, the ratio is 9 to 13. The 13 may have bumped up to about 15 or 16
when I added some dynamic loading code (I forget). Regardless, it is
definitely less than a 2X increase. And that is with zero optimization.
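
Spelled out as arithmetic, under the assumption that 9 is the builtin
importer's figure and 13 to 16 are imputil's (no units are given for
these numbers):

```python
# Figures quoted above; which side is which is an assumption.
builtin = 9
imputil_runs = (13, 15, 16)

# Slowdown factor of each imputil figure relative to the builtin importer.
ratios = {t: round(t / builtin, 2) for t in imputil_runs}
print(ratios)  # {13: 1.44, 15: 1.67, 16: 1.78}
```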

*shrug*

I'm done. I'll do some code in a couple weeks.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein at lyra.org  Sat Dec  4 14:12:32 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 05:12:32 -0800 (PST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912031439.JAA16524@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912040456180.18529-100000@nebula.lyra.org>

On Fri, 3 Dec 1999, Guido van Rossum wrote:
>...
> Great response.  I think we know where we each stand.  Please go ahead
> with a new design.  (That's trust, not carte blanche.)

Accepted gratefully. Thx.

> Just one thought: the more I think about it, the less I like
> sys.importers: functionality which is implemented through
> sys.importers must necessarily be placed either in front of all of
> sys.path or after it.  While this is helpful for "canned" apps that
> want *everything* to be imported from a fixed archive, I think that
> for regular Python installations sys.path should remain the point of
> attack.  In particular, installing a new package (e.g. PIL) should
> affect sys.path, regardless of the way of delivery of the modules
> (shared libs, .py files, .pyc files, or a zip archive).

Okay. I'll design with respect to this model.
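
One reading of that model, as a minimal sketch in present-day Python:
entries on the search path are either directory strings or Importer
instances, they are tried strictly in order, and within a directory an
ordered list of suffixes decides which file wins. MemoryImporter,
locate and SUFFIXES are hypothetical names, not part of any design
agreed here.

```python
import os
import tempfile

SUFFIXES = (".so", ".py")  # ordered: earlier suffixes win within one directory


class MemoryImporter:
    """Hypothetical Importer instance that can sit on the search path."""

    def __init__(self, names):
        self.names = set(names)

    def locate(self, name):
        return ("memory", name) if name in self.names else None


def locate(name, search_path, suffixes=SUFFIXES):
    """Describe the first occurrence of name along search_path, else None."""
    for entry in search_path:
        if isinstance(entry, str):
            # Plain string: treat it as a directory, try each suffix in order.
            for suffix in suffixes:
                candidate = os.path.join(entry, name + suffix)
                if os.path.isfile(candidate):
                    return ("file", candidate)
        else:
            # Anything else: delegate to the Importer instance.
            found = entry.locate(name)
            if found is not None:
                return found
    return None


with tempfile.TemporaryDirectory() as d:
    for filename in ("foo.py", "foo.so", "bar.py"):
        open(os.path.join(d, filename), "w").close()
    path = [MemoryImporter(["bar"]), d]
    foo = locate("foo", path)      # found in the directory; ".so" beats ".py"
    bar = locate("bar", path)      # the importer is earlier on the path, so it wins
    missing = locate("baz", path)  # nobody has it
```

Because position on the path is the only priority rule, installing a
package means adding one entry, whatever the delivery format.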

To be explicit/clear and to be sure I'm hearing you right: sys.path may
contain Importer instances. Given the name FOO, the system will step
through sys.path looking for the first occurrence of FOO (looking in a
directory or delegating). FOO may be found with any number of
(configurable) file extensions, which are ordered (e.g. ".so" before
".py" before ".isl").

> I'm not too worried about code that inspects sys.path and expects
> certain invariants; that code is most likely interfering with the
> import mechanism so should be revisited anyway.

The Benevolent Dictator has spoken. So be it.

:-)

> On the lone .pyc issue: I'd like to see this disappear when using the
> filesystem, I see no use for it there if we support .pyc files in zip
> archives.

No problem. This actually creates a simplification in the system, as I'm
seeing it now. I'm also seeing opportunities for a code reorg which may
work towards MAL's issues with performance.

I hope to have something in two or three weeks. I also hope people can be
patient :-), but I certainly wouldn't mind seeing some alternative code!

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gmcm at hypernet.com  Sat Dec  4 15:59:44 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Sat, 4 Dec 1999 09:59:44 -0500
Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order)
In-Reply-To: <384900BD.D16E72BC@lemburg.com>
Message-ID: <1267803104-11215142@hypernet.com>

M.-A. Lemburg wrote:
> Greg Stein wrote:

> > Don't worry Fredrik... I'm with you on this one. I do not
> > believe there is a problem with the speed. Nobody has yet
> > profiled imputil to find out where/how the time is being spent.
> > Nobody has tried to speed it up.
> 
> Sorry, Greg, but that is simply not true. I've spent a few
> days on trying to get more performance out of it and have
> succeeded, but in the end it wasn't enough to convince me
> of the approach.
 
Remember those comparisons of Perl and Python, to which 
you added cgipython? I've added to the list a version that uses 
an old version of imputil (probably the one you optimized) and 
a compressed std lib. Note that my Linux python (1.5.2) is 
built in the RedHat style - even struct and strop are .so's; so 
that accounts for the majority of the open calls. This is a full 
Python (runs code.py if you don't pass it a script name). For 
lack of a better name, I've called it "pykit".

 First, the size of log files (in lines), i.e. number of system 
calls:
 
                Solaris     Linux    IRIX[1]
   Perl              88        85      70
   Python           425       316     257
   cgipython                  182 
   pykit                      136

 Next, the number of "open" calls:

                Solaris     Linux    IRIX
   Perl             16         10       9
   Python          107         71      48
   cgipython                   33 
   pykit                        9

 And the number of unsuccessful "open" calls:
 
                Solaris     Linux    IRIX
   Perl              6          1       3
   Python           77         49      32
   cgipython                   28
   pykit                        2
 
 Number of "mmap" calls:
 
                Solaris     Linux    IRIX
   Perl              25        25       1
   Python            36        24       1
   cgipython                   13
   pykit                       21

This test would show off more if it went beyond startup. An 
import of a standard lib module in my stock Python involves 2 
failed stats and 6 failed opens, then 2 successful opens and 2 
fstats before the module is loaded. None of these occur in 
pykit.

The downside (asking my Importer for a .so or a module not in 
the importer) takes no system calls, and involves a dozen or 
so lines of Python and a check of a dictionary.
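
The "check of a dictionary" idea can be sketched roughly as follows in
present-day Python. This is not pykit's code: ArchiveImporter is a
hypothetical class, the archive is a plain zip built in memory, and the
only point is that the table of contents is read once so a miss costs
one dictionary lookup.

```python
import io
import types
import zipfile


class ArchiveImporter:
    """Hypothetical importer over a zip archive whose index is read once."""

    def __init__(self, archive_bytes):
        self.archive = zipfile.ZipFile(io.BytesIO(archive_bytes))
        # Table of contents: module name -> member name, built a single time.
        self.toc = {}
        for member in self.archive.namelist():
            if member.endswith(".py"):
                self.toc[member[:-3]] = member

    def get_module(self, name):
        member = self.toc.get(name)  # a miss is just this dictionary lookup
        if member is None:
            return None
        source = self.archive.read(member).decode("utf-8")
        module = types.ModuleType(name)
        exec(compile(source, member, "exec"), module.__dict__)
        return module


buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("greeting.py", "text = 'hello'\n")

importer = ArchiveImporter(buffer.getvalue())
hit = importer.get_module("greeting")  # served from the archive
miss = importer.get_module("struct")   # not in the archive: None, no file access
```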


- Gordon


From tismer at appliedbiometrics.com  Sat Dec  4 16:29:03 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sat, 04 Dec 1999 16:29:03 +0100
Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order)
References: <Pine.LNX.4.10.9912040402120.18529-100000@nebula.lyra.org>
Message-ID: <3849333F.1DF2A201@appliedbiometrics.com>


Greg Stein wrote:
...

> My mantra is always "90% of the time you're wrong about where 90% 
> of the time is being spent."

What a great sentence! We all know it, but many of us
(especially me) forget about it during 90% of our coding time.
Much better to spend this on design (as you did).

thanks - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From jim at interet.com  Sat Dec  4 18:27:44 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Sat, 04 Dec 1999 12:27:44 -0500
Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT]
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com> <3847D766.1E5FFAF3@interet.com> <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com>
Message-ID: <38494F10.C644BA7@interet.com>

Fredrik Lundh wrote:
> 
> James C. Ahlstrom <jim at interet.com> wrote:
> > IMHO putting shared libs in an archive is a bad idea because the OS

Dear Fredrik,

I thought the point of Python-Dev was to propose designs and get
feedback, right?  Well, I got feedback :-).

OK, I agree to alter my archive format so it provides the
ability to store shared libs and not just *.pyd.  I will
add the string length and if needed a flag indicating the
name is a shared lib.

Now the details:

> have you tried it?  if not, why do you think you should
> be allowed to forbid others from doing it?

Yes I have tried it, and I am currently on my fourth version
of an archive format which is based on formats by Greg Stein
and Gordon McMillan.  I hope it meets with the favor of the
Grand Inquisition, and becomes the standard format.  But
maybe it won't.  Oh well.

> bloody installers.  and here you are advocating that
> we all should be forced to use installers, when python
> makes it trivial to write self-installing apps. double-argh!

I am not forcing anyone to do anything, only proposing that
shared libs are best handled directly by imputil and not
the class within imputil which handles archive files.  It
is just a geeky design issue, nothing more.

JimA


From jim at interet.com  Sat Dec  4 19:31:48 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Sat, 04 Dec 1999 13:31:48 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>             <38479573.B2CFDD2B@lemburg.com>  <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> <3848F0E0.B8132AD2@lemburg.com>
Message-ID: <38495E14.9C2FB107@interet.com>

"M.-A. Lemburg" wrote:

> An example:
> 
> A path importer knows how to scan directories and how to use
> a path to tell the correct order. It can maybe also import
> .py/.pyc/.pyo files. Now what happens if it finds a shared
> lib as module... the usual imputil way would be to delegate
> the request to some other importer which can handle shared
> libs... but wait: how does the shared lib importer know
> where to look ? It will have to rescan the directories,
> etc...

The above refers to an earlier but still very recent version
of imputil.  On that basis it is perfectly accurate.  Here is
another example from my own experience almost identical to
the above:

One possible archive file format holds its list of archived
*.pyc file names as keys in a dictionary.  This is simple and
efficient, but fails to correctly address the problem of shared
libs (aka DLL's in Windows) with names identical to names of
*.pyc files in the archive.  For example, suppose foo.pyc is in the
archive, and foo.dll is in a directory.  Suppose sys.path is to be
used to decide whether to load foo.pyc or foo.dll.  Then an
"archive importer" will fail to do this.  Specifically you can't
see if foo.pyc is in the archive and then check sys.path, nor can
you do the reverse.  You must call the "archive importer" repeatedly
for each element of sys.path and search the directory at the same time.

JimA


From jim at interet.com  Sat Dec  4 20:51:47 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Sat, 04 Dec 1999 14:51:47 -0500
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9912040456180.18529-100000@nebula.lyra.org>
Message-ID: <384970D3.26A9ECDB@interet.com>

Greg Stein wrote:
> 
> On Fri, 3 Dec 1999, Guido van Rossum wrote:

> > attack.  In particular, installing a new package (e.g. PIL) should
> > affect sys.path, regardless of the way of delivery of the modules
> > (shared libs, .py files, .pyc files, or a zip archive).

> To be explicit/clear and to be sure I'm hearing you right: sys.path may
> contain Importer instances. Given the name FOO, the system will step
> through sys.path looking for the first occurrence of FOO (looking in a
> directory or delegating). FOO may be found with any number of
> (configurable) file extensions, which are ordered (e.g. ".so" before
> ".py" before ".isl").

This is basically a gripe about this design spec.  So if the answer
turns out to be "we need this functionality so shut up" then just
say that and don't flame me.

This spec is painful.  Suppose sys.path has 10 elements, and there
are six file extensions.  Then the simple algorithm is slow:
  for path in sys.path:         # Yikes, may not be a string!
    for ext in file_extensions:
      name = "%s.%s" % (module_name, ext)
      full_path = os.path.join(path, name)
      if os.path.isfile(full_path):
        pass  # Process file here

And sys.path can contain class instances
which only makes things slower.  You could do a readdir() and cache
the results, but maybe that would be slower.  A better
algorithm might be faster, but a lot more complicated.
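
A minimal sketch of the readdir()-and-cache idea, assuming every
string entry on the search path is a plain directory and that
directory contents do not change while the cache is alive.  The names
cached_listing and find_module_file are hypothetical, not part of
imputil, and whether this beats the stat-per-candidate loop would
still have to be measured:

```python
import os

# Maps a directory name to the frozenset of file names found in it.
_listing_cache = {}


def cached_listing(directory):
    """Return the file names in directory, reading it at most once."""
    try:
        return _listing_cache[directory]
    except KeyError:
        try:
            names = frozenset(os.listdir(directory))
        except OSError:
            # Missing or unreadable directories simply provide nothing.
            names = frozenset()
        _listing_cache[directory] = names
        return names


def find_module_file(module_name, search_path, extensions):
    """Return the first matching file, honoring path order, then extension order."""
    for directory in search_path:
        if not isinstance(directory, str):
            continue  # non-string entries (importer objects) are not handled here
        names = cached_listing(directory)
        for ext in extensions:
            candidate = module_name + ext
            if candidate in names:
                return os.path.join(directory, candidate)
    return None
```

After the first import touches a directory, later lookups in it cost
one set membership test per extension instead of one system call.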

In the context of archive files, it is also painful.  It prevents
you from saving a single dictionary of module names.  Instead you
must have len(sys.path) dictionaries.  You could try to
save in the archive information about whether (say) a foo.dll was
present in the file system, but the list of extensions is extensible.

The above problem only exists to support equally-named modules; that
is, to support a run-time choice of whether to load foo.pyc, foo.dll,
foo.isl, etc.  I claim (without having written it) that the fastest
algorithm to solve the unique-name case is much faster than the fastest
algorithm to solve the choose-among-equal-names case.

Do we really need to support the equal-name case [Jim runs for
cover...]?  If so, how about inventing a new way to support it.  Maybe
if equal names exist, these must be pre-loaded from a known location?

JimA


From gstein at lyra.org  Sat Dec  4 22:59:00 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 13:59:00 -0800 (PST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <384970D3.26A9ECDB@interet.com>
Message-ID: <Pine.LNX.4.10.9912041350200.18529-100000@nebula.lyra.org>

On Sat, 4 Dec 1999, James C. Ahlstrom wrote:
> Greg Stein wrote:
>...
> > To be explicit/clear and to be sure I'm hearing you right: sys.path may
> > contain Importer instances. Given the name FOO, the system will step
> > through sys.path looking for the first occurrence of FOO (looking in a
> > directory or delegating). FOO may be found with any number of
> > (configurable) file extensions, which are ordered (e.g. ".so" before
> > ".py" before ".isl").
> 
> This is basically a gripe about this design spec.  So if the answer
> turns out to be "we need this functionality so shut up" then just
> say that and don't flame me.
> 
> This spec is painful.  Suppose sys.path has 10 elements, and there
> are six file extensions.  Then the simple algorithm is slow:
>   for path in sys.path:		# Yikes, may not be a string!
>     for ext in file_extensions:
>       name = "%s.%s" % (module_name, ext)
>       full_path = os.path.join(path, name)
>       if os.path.isfile(full_path):
>         # Process file here

This is the algorithm that Python uses today, and my standard Importers
follow.

> And sys.path can contain class instances
> which only makes things slower.

IMO, we don't know this, or whether it is significant.

> You could do a readdir() and cache
> the results, but maybe that would be slower.  A better
> algorithm might be faster, but a lot more complicated.

Who knows. BUT: the import process is now in Python -- it makes it *much*
easier to run these experiments. We could not really do this when the
import process was "hard-coded" in C code.

> In the context of archive files, it is also painful.  It prevents
> you from saving a single dictionary of module names.  Instead you
> must have len(sys.path) dictionaries.  You could try to
> save in the archive information about whether (say) a foo.dll was
> present in the file system, but the list of extensions is extensible.

I am not following this. What/where is the "single dictionary of module
names" ? Are you referring to a cache? Or is this about building an
archive?

An archive would look just like we have now: map a name to a module. It
would not need multiple dictionaries.

> The above problem only exists to support equally-named modules; that
> is, to support a run-time choice of whether to load foo.pyc, foo.dll,
> foo.isl, etc.  I claim (without having written it) that the fastest
> algorithm to solve the unique-name case is much faster than the fastest
> algorithm to solve the choose-among-equal-names case.
> 
> Do we really need to support the equal-name case [Jim runs for
> cover...]?  If so, how about inventing a new way to support it.  Maybe
> if equal names exist, these must be pre-loaded from a known location?

I don't understand what the problem is. I don't see one. We are still
mapping a name to a module. sys.path defines a precedence.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein at lyra.org  Sun Dec  5 02:17:57 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 17:17:57 -0800 (PST)
Subject: [Python-Dev] pyc archives (was: .DLL vs .PYD search order)
In-Reply-To: <38495E14.9C2FB107@interet.com>
Message-ID: <Pine.LNX.4.10.9912041713580.18529-100000@nebula.lyra.org>

On Sat, 4 Dec 1999, James C. Ahlstrom wrote:
>...
> One possible archive file format holds its list of archived
> *.pyc file names as keys in a dictionary.  This is simple and
> efficient, but fails to correctly address the problem of shared
> libs (aka DLL's in Windows) with names identical to names of
> *.pyc files in the archive.  For example, suppose foo.pyc is in the
> archive, and foo.dll is in a directory.  Suppose sys.path is to be
> used to decide whether to load foo.pyc or foo.dll.  Then an
> "archive importer" will fail to do this.  Specifically you can't
> see if foo.pyc is in the archive and then check sys.path, nor can
> you do the reverse.  You must call the "archive importer" repeatedly
> for each element of sys.path and search the directory at the same time.

What? The archive is independent of each .pyc's original position in
sys.path. There is no reason/need to carry that information into an
archive.

If the archive contains "foo", then you're done. If it doesn't, then move
on to the next element of sys.path (directory or Importer instance) and
look there.

Basically: if you deploy an archive, then all of its files will take
precedence over any file found later on sys.path. This is exactly what
sys.path is about: establishing precedence.
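
The precedence rule described above can be sketched roughly as
follows.  This is an illustration only: DictImporter, get_source and
find_first are hypothetical names rather than the imputil API, and
the directory search is passed in as a function so the sketch stays
self-contained:

```python
class DictImporter:
    """Hypothetical archive-style importer: one dictionary of module names."""

    def __init__(self, modules):
        self._modules = dict(modules)

    def get_source(self, name):
        # None means "not mine, keep looking further along the path".
        return self._modules.get(name)


def find_first(name, search_path, directory_lookup):
    """Walk search_path in order; the first entry that provides name wins.

    String entries are treated as directories and handed to
    directory_lookup(directory, name); anything else is asked directly.
    Returns (position, result), or None if nothing provides the name.
    """
    for position, entry in enumerate(search_path):
        if isinstance(entry, str):
            found = directory_lookup(entry, name)
        else:
            found = entry.get_source(name)
        if found is not None:
            return position, found
    return None
```

Placing the archive object earlier or later in the list is the only
thing that decides whether its "foo" shadows a directory's "foo".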

If I understand you correctly, then you're trying to say there is some
sort of interleaving that must occur. If so, then I don't understand why.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From fredrik at pythonware.com  Mon Dec  6 13:20:34 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 6 Dec 1999 13:20:34 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>             <38479573.B2CFDD2B@lemburg.com>  <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> <3848F0E0.B8132AD2@lemburg.com> <02eb01bf3f4d$c200d790$f29b12c2@secret.pythonware.com> <384B7E32.F7B81D82@lemburg.com>
Message-ID: <004401bf3fe4$4cab6ea0$f29b12c2@secret.pythonware.com>

> > you obviously attempted to use imputil to implement
> > non-standard import behaviour on top of the standard
> > storage system -- while we've used it to implement
> > standard import behaviour on top of non-standard
> > storage systems.
> 
> No, I tried to make the imputil approach work as replacement
> for the standard builtin importer.

I'm confused.  earlier, you said (or rather, I think you
said) that you looked at imputil to see if it could "handle
the problems you had at the time"...  and now you say
that you tried to use it as a drop-in replacement for the
"standard path importer".  I must be missing something
here...

> After I got that to work, I added some caching
> to avoid duplicated stats. The resulting importer was
> around twice as slow as the builtin one for the following
> imports:
> 
> # the default one Python does at startup, plus:
> from mx import HTMLTools,DateTime,ODBC
> 
> This is a pretty common setup for my scripts, so its
> performance is relevant to me.

did you try stuffing all your PYC's into an archive file,
and running them from there?

</F>


From fredrik at pythonware.com  Sun Dec  5 19:22:57 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sun, 5 Dec 1999 19:22:57 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>             <38479573.B2CFDD2B@lemburg.com>  <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> <3848F0E0.B8132AD2@lemburg.com>
Message-ID: <02eb01bf3f4d$c200d790$f29b12c2@secret.pythonware.com>

> I've checked my imputil.py version (with caches enabled)
> against the builtin importer and noticed a performance
> downgrade by factor >2. This was enough to convince me
> of looking for other techniques to handle the problems
> I had at the time... you know, relative imports and things.

hmm.  I think I see the problem here...

you obviously attempted to use imputil to implement
non-standard import behaviour on top of the standard
storage system -- while we've used it to implement
standard import behaviour on top of non-standard
storage systems.

I don't know if imputil is good enough for the former,
and I don't think I care...  I've spent too many nights
debugging code that relied on clever, non-standard
hacks.

</F>

PS. on the performance side of things, did you know
that 're' can be up to ten times slower than 'regex'?
but people don't complain -- probably because it
allows them to do things they couldn't do before...


From jim at interet.com  Mon Dec  6 20:40:01 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 06 Dec 1999 14:40:01 -0500
Subject: [Python-Dev] Re: pyc archives (was: .DLL vs .PYD search order)
References: <Pine.LNX.4.10.9912041713580.18529-100000@nebula.lyra.org>
Message-ID: <384C1111.92984B5A@interet.com>

Greg Stein wrote:
> 
> On Sat, 4 Dec 1999, James C. Ahlstrom wrote:
> >...
> > One possible archive file format holds its list of archived
> > *.pyc file names as keys in a dictionary.  This is simple and
> > efficient, but fails to correctly address the problem of shared

> What? The archive is independent of each .pyc's original position in
> sys.path. There is no reason/need to carry that information into an
> archive.
> 
> If the archive contains "foo", then you're done. If it doesn't, then move
> on to the next element of sys.path (directory or Importer instance) and
> look there.
> 
> Basically: if you deploy an archive, then all of its files will take
> precedence over any file found later on sys.path. This is exactly what
> sys.path is about: establishing precedence.

Sorry, I am a little slow today.  My daughter got me up at 6 am to
work on her computer video editor.  No disk space, fragmentation,
2 gig limit on AVI files, ........

Are you saying this?  If foo is imported, the archive importer is
consulted first to see if it can provide foo.  If not, sys.path is
searched for foo.pyc, foo.pyl etc., and if foo.pyl is found, then
its contents are added to the single archive importer dictionary.
The order of addition to the archive dictionary is determined by
sys.path, and duplicate names are not entered because they lie later
on sys.path.  But once a file is recognized as in an archive, it
effectively precedes all of sys.path.

Or this?  If foo is imported, sys.path is searched for
foo.pyc, foo.pyl, etc., and also all archive files found
at each element of sys.path are searched for foo.  If "bar"
is imported, it may be found in foo.pyl.  That is,
there is an instance of an archive importer for each element
of sys.path.

What if the user names an archive file not on sys.path?  What
order does it have?

JimA


From jim at interet.com  Mon Dec  6 19:34:41 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 06 Dec 1999 13:34:41 -0500
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9912041350200.18529-100000@nebula.lyra.org>
Message-ID: <384C01C1.8D1AFFFF@interet.com>

Greg Stein wrote:
> 
> On Sat, 4 Dec 1999, James C. Ahlstrom wrote:
> >         # Process file here
> 
> This is the algorithm that Python uses today, and my standard Importers
> follow.

Agreed.
 
> > And sys.path can contain class instances
> > which only makes things slower.
> 
> IMO, we don't know this, or whether it is significant.

Agreed.
 
> > You could do a readdir() and cache
> > the results, but maybe that would be slower.  A better
> > algorithm might be faster, but a lot more complicated.
> 
> Who knows. BUT: the import process is now in Python -- it makes it *much*
> easier to run these experiments. We could not really do this when the
> import process is "hard-coded" in C code.

Agreed.
 
> > In the context of archive files, it is also painful.  It prevents
> > you from saving a single dictionary of module names.  Instead you
> > must have len(sys.path) dictionaries.  You could try to
> > save in the archive information about whether (say) a foo.dll was
> > present in the file system, but the list of extensions is extensible.
> 
> I am not following this. What/where is the "single dictionary of module
> names" ? Are you referring to a cache? Or is this about building an
> archive?
> 
> An archive would look just like we have now: map a name to a module. It
> would not need multiple dictionaries.

The "single dictionary of names" is in the single archive importer
instance and has nothing to do with creating the archive.  It
is currently programmed this way.

Suppose the user specifies by name 12 archive files to be searched.
That is, the user hacks site.py to add archive names to the importer.
The "single dictionary" means that the archive importer takes the 12
dictionaries in the 12 files and merges them together into one dictionary
in order to speed up the search for a name.  The good news is you can
always just call the archive importer to get a module.  The bad news is
you can't do that for each entry on sys.path because there is no
necessary identity between archive files and sys.path.  The user
specified the archive files by name, and they may or may not be on
sys.path, and the user may or may not have specified them in the
same order as sys.path even if they are.

Suppose archive files must lie on sys.path and are processed in order.
Then to find them you must know their name.  But IMHO you want to
avoid doing a readdir() on each element of sys.path and looking for
files *.pyl.

Suppose archive file names in general are the known name "lib.pyl"
for the Python library, plus the names "package.pyl" where "package"
can be the name of a Python package as a single archive file.  Then
if the user tries to import foo, imputil will search along sys.path
looking for foo.pyc, foo.pyl, etc.  If it finds foo.pyl, the archive
importer will add it to its list of known archive files.  But it must
not add it to its single dictionary, because that would destroy the
information about its position along sys.path.  Instead, it must keep
a separate dictionary for each element of sys.path and search the
separate dictionaries under control of imputil.  That is, get_code()
needs a new argument for the element of sys.path being searched.
Alternatively, you could create a new importer instance for each
archive file found, but then you still have multiple dictionaries.
They are in the multiple instances.

All this is needed only to support import of identically named
modules.  If there are none, there is no problem because sys.path
is being used only to find modules, not to disambiguate them.
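
A small sketch of why the merged dictionary loses information when
names collide.  The three tables below are made-up stand-ins for two
archive files and one ordinary directory, with the intended path order
being archive_a, then the directory, then archive_b:

```python
# Hypothetical contents; the strings only record where a module came from.
archive_a = {"spam": "spam from a.pyl"}
directory = {"foo": "foo from directory"}
archive_b = {"foo": "foo from b.pyl"}


def lookup_merged_first(name):
    """Merge both archives into one dictionary and consult it before the directory."""
    merged = {}
    for archive in (archive_a, archive_b):
        for key, value in archive.items():
            merged.setdefault(key, value)  # the earlier archive wins a clash
    if name in merged:
        return merged[name]
    return directory.get(name)


def lookup_in_path_order(name):
    """Keep one dictionary per path entry and search them in path order."""
    for table in (archive_a, directory, archive_b):
        if name in table:
            return table[name]
    return None
```

With unique names both functions agree.  For "foo" the merged lookup
returns the copy from b.pyl, while the per-entry lookup returns the
directory's copy, which is what the path order asks for.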

See also my separate reply to your other post which discusses
this same issue.

JimA


From gstein at lyra.org  Tue Dec  7 01:43:21 1999
From: gstein at lyra.org (Greg Stein)
Date: Mon, 6 Dec 1999 16:43:21 -0800 (PST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <384C01C1.8D1AFFFF@interet.com>
Message-ID: <Pine.LNX.4.10.9912061630260.18926-100000@nebula.lyra.org>

On Mon, 6 Dec 1999, James C. Ahlstrom wrote:
> Greg Stein wrote:
>...
> > I am not following this. What/where is the "single dictionary of module
> > names" ? Are you referring to a cache? Or is this about building an
> > archive?
> > 
> > An archive would look just like we have now: map a name to a module. It
> > would not need multiple dictionaries.
> 
> The "single dictionary of names" is in the single archive importer
> instance and has nothing to do with creating the archive.  It
> is currently programmed this way.

Ah. There is the problem. In Guido's suggestion for the "next path of
inquiry" :-), there is no "single dictionary of names". Instead, you have
Importer instances as items in sys.path. Each instance maintains its
dictionary, and they are not (necessarily) combined.

If we were to combine them, then we would need to maintain the ordering
requirements implied by sys.path. However, this would be problematic if
sys.path changed -- we would have to detect the situation and rebuild a
merged dict.

> Suppose the user specifies by name 12 archive files to be searched.
> That is, the user hacks site.py to add archive names to the importer.
> The "single dictionary" means that the archive importer takes the 12
> dictionaries in the 12 files and merges them together into one dictionary
> in order to speed up the search for a name.  The good news is you can
> always just call the archive importer to get a module.  The bad news is
> you can't do that for each entry on sys.path because there is no
> necessary identity between archive files and sys.path.  The user
> specified the archive files by name, and they may or may not be on
> sys.path, and the user may or may not have specified them in the
> same order as sys.path even if they are.

The importer must be inserted into sys.path to establish a precedence. If
the user wants to add 12 libraries... fine. But *all* of those modules
will fall under a precedence defined by the Importer's position on
sys.path.

> Suppose archive files must lie on sys.path and are processed in order.
> Then to find them you must know their name.  But IMHO you want to
> avoid doing a readdir() on each element of sys.path and looking for
> files *.pyl.

I do not believe that we will arbitrarily locate and open library files.
They must be specified explicitly.

> Suppose archive file names in general are the known name "lib.pyl"
> for the Python library, plus the names "package.pyl" where "package"
> can be the name of a Python package as a single archive file.  Then
> if the user tries to import foo, imputil will search along sys.path
> looking for foo.pyc, foo.pyl, etc.  If it finds foo.pyl, the archive
> importer will add it to its list of known archive files.  But it must
> not add it to its single dictionary, because that would destroy the
> information about its position along sys.path.  Instead, it must keep
> a separate dictionary for each element of sys.path and search the
> separate dictionaries under control of imputil.  That is, get_code()
> needs a new argument for the element of sys.path being searched.
> Alternatively, you could create a new importer instance for each
> archive file found, but then you still have multiple dictionaries.
> They are in the multiple instances.

If the user installs ".pyl" as a recognized extension (i.e. installs into
the PathImporter), then the above scenario is possible. In my
in-head-design, I had not imagined any state being retained for
extension-recognizer hooks. Of course, state can be retained simply by
using a bound-method for the hook function.
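
As a rough illustration of retaining state through a bound method:
the hook table, the PylHook class and the dispatch function below are
all hypothetical and only show the mechanism, not how PathImporter
actually registers extensions:

```python
class PylHook:
    """Hypothetical handler for ".pyl" files that remembers what it has seen."""

    def __init__(self):
        self.seen = []

    def handle(self, full_path):
        # State lives on the instance, so it survives between calls.
        self.seen.append(full_path)
        return "archive:" + full_path


# Hypothetical registry mapping a file extension to a plain callable.
extension_hooks = {}

hook = PylHook()
extension_hooks[".pyl"] = hook.handle  # a bound method carries its instance along


def dispatch(full_path):
    """Call the hook registered for the file's extension, if there is one."""
    for ext, function in extension_hooks.items():
        if full_path.endswith(ext):
            return function(full_path)
    return None
```

The registry only ever sees a callable taking a complete path, yet
each call updates hook.seen.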

get_code() would not need to change. The foo.pyl would be consulted at the
appropriate time based on where it is found in sys.path. Note that file-
extension hooks would definitely have a complete path to the target file.
Those are not Importers, however (although they will closely follow the
get_code() hook since the extension is called from get_code).


From tim_one at email.msn.com  Tue Dec  7 06:11:25 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 7 Dec 1999 00:11:25 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <02eb01bf3f4d$c200d790$f29b12c2@secret.pythonware.com>
Message-ID: <001601bf4071$8278cc20$88a0143f@tim>

[/F]
> PS. on the performance side of things, did you know
> that 're' can be up to ten times slower than 'regex'?
> but people don't complain -- probably because it
> allows them to do things they couldn't do before...

Bad example:  people do complain about this.  Those who care a lot continue
to use regex, temporarily pacified by the promise that re.py will get
recoded in C and thus regain a good chunk of regex's speed.  Those who care
a whale of a lot continue to use Perl <0.9 wink>.


From guido at CNRI.Reston.VA.US  Tue Dec  7 13:45:25 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 07 Dec 1999 07:45:25 -0500
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: Your message of "Mon, 06 Dec 1999 16:43:21 PST."
             <Pine.LNX.4.10.9912061630260.18926-100000@nebula.lyra.org> 
References: <Pine.LNX.4.10.9912061630260.18926-100000@nebula.lyra.org> 
Message-ID: <199912071245.HAA21596@eric.cnri.reston.va.us>

> If we were to combine them, then we would need to maintain the ordering
> requirements implied by sys.path. However, this would be problematic if
> sys.path changed -- we would have to detect the situation and rebuild a
> merged dict.

No need to worry about this: just don't merge the caches.  Compared to
the hundreds of failed open() calls that are done now, it's no big
deal to do 12 failed Python dictionary lookups instead of one.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik at pythonware.com  Tue Dec  7 14:25:54 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 7 Dec 1999 14:25:54 +0100
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9912061630260.18926-100000@nebula.lyra.org>
Message-ID: <020a01bf40b6$97dafd00$f29b12c2@secret.pythonware.com>

Greg Stein <gstein at lyra.org> wrote:
> > The "single dictionary of names" is in the single archive importer
> > instance and has nothing to do with creating the archive.  It
> > is currently programmed this way.
> 
> Ah. There is the problem. In Guido's suggestion for the "next path of
> inquiry" :-), there is no "single dictionary of names". Instead, you have
> Importer instances as items in sys.path. Each instance maintains its
> dictionary, and they are not (necessarily) combined.

so the "sys.path contains importers (or strings)" strategy
is now officially sanctioned?  cool!!!

(a quick look in our code base says that this will cause
some trouble, unless os.path.isdir() is modified to reject
non-strings...  after all, if it's not a string, it cannot be
a valid directory path, so this does make some sense ;-)

another aside: can we have a standard mechanism for
listing the contents of a given archive, please?  we have
a lot of "path scanning" stuff (PIL and PST, among others),
and it would be great if things didn't break down if you
stuff it all in an archive.

something like:

    for path in sys.path:
        if os.path.isdir(path):
            files = os.listdir(path)
        else:
            try:
                files = path.listdir()
            except AttributeError:
                files = None
        if files is None:
            pass # no idea what's in here
        else:
            pass # path provides (at least) these modules

would be really useful.

and yes, it shouldn't have to be mentioned, since squeeze
have done it since early 1997, but archive importers should
provide a standard way to include non-module resources in
the archive, and a standard way to access such resources
as ordinary python streams.

e.g:

    file = path.open(name, "rb")

or something...
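
One way such a path entry could look, sketched with the standard
zipfile module standing in for whatever archive format gets chosen.
ZipPathEntry and build_demo_archive are hypothetical names, and only
the listdir() and open(name, "rb") calls suggested above are covered:

```python
import io
import zipfile


class ZipPathEntry:
    """Hypothetical sys.path entry backed by an in-memory zip archive."""

    def __init__(self, zip_bytes):
        self._zip = zipfile.ZipFile(io.BytesIO(zip_bytes))

    def listdir(self):
        # Names of everything stored in the archive, modules or not.
        return self._zip.namelist()

    def open(self, name, mode="rb"):
        if mode != "rb":
            raise ValueError("only binary read access is sketched here")
        # Hand back an ordinary in-memory stream over the member's bytes.
        return io.BytesIO(self._zip.read(name))


def build_demo_archive():
    """Build a tiny archive holding one module and one non-module resource."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("spam.py", "x = 1\n")
        archive.writestr("logo.gif", b"GIF89a")
    return buffer.getvalue()
```

Path-scanning code can then treat the entry like a directory for
listing purposes and read resources from it as streams.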

</F>


From jim at interet.com  Tue Dec  7 16:20:15 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Tue, 07 Dec 1999 10:20:15 -0500
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9912061630260.18926-100000@nebula.lyra.org> <199912071245.HAA21596@eric.cnri.reston.va.us>
Message-ID: <384D25AF.4C4F5107@interet.com>

Guido van Rossum wrote:

> No need to worry about this: just don't merge the caches.  Compared to
> the hundreds of failed open() calls that are done now, it's no big
> deal to do 12 failed Python dictionary lookups instead of one.

Agreed.

JimA


From jim at interet.com  Tue Dec  7 16:31:30 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Tue, 07 Dec 1999 10:31:30 -0500
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9912061630260.18926-100000@nebula.lyra.org>
Message-ID: <384D2852.3C36C216@interet.com>

Greg Stein wrote:

> Ah. There is the problem. In Guido's suggestion for the "next path of
> inquiry" :-), there is no "single dictionary of names". Instead, you have
> Importer instances as items in sys.path. Each instance maintains its
> dictionary, and they are not (necessarily) combined.

> [A large number of other design issues]

OK, all design issues agreed.  I will make needed changes.

JimA


From jim at interet.com  Tue Dec  7 16:37:36 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Tue, 07 Dec 1999 10:37:36 -0500
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9912061630260.18926-100000@nebula.lyra.org> <020a01bf40b6$97dafd00$f29b12c2@secret.pythonware.com>
Message-ID: <384D29C0.3D3A2194@interet.com>

Fredrik Lundh wrote:

> another aside: can we have a standard mechanism for
> listing the contents of a given archive, please?

I will add this.

> and yes, it shouldn't have to be mentioned, since squeeze
> have done it since early 1997, but archive importers should
> provide a standard way to include non-module resources in
> the archive, and a standard way to access such resources
> as ordinary python streams.

I will add this.

JimA


From gstein at lyra.org  Tue Dec  7 17:53:49 1999
From: gstein at lyra.org (Greg Stein)
Date: Tue, 7 Dec 1999 08:53:49 -0800 (PST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912071245.HAA21596@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912070853230.21367-100000@nebula.lyra.org>

On Tue, 7 Dec 1999, Guido van Rossum wrote:
> > If we were to combine them, then we would need to maintain the ordering
> > requirements implied by sys.path. However, this would be problematic if
> > sys.path changed -- we would have to detect the situation and rebuild a
> > merged dict.
> 
> No need to worry about this: just don't merge the caches.  Compared to
> the hundreds of failed open() calls that are done now, it's no big
> deal to do 12 failed Python dictionary lookups instead of one.

Have no fear... I wasn't planning on this... complicates too much stuff
for too little gain.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From guido at CNRI.Reston.VA.US  Wed Dec  8 13:07:31 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 08 Dec 1999 07:07:31 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Wed, 08 Dec 1999 02:46:02 EST."
             <000201bf4150$46749da0$5aa2143f@tim> 
References: <000201bf4150$46749da0$5aa2143f@tim> 
Message-ID: <199912081207.HAA00040@eric.cnri.reston.va.us>

[Great analysis, Tim!]

> 4) The audience is Python end-users "in general", and the product is pure
> Python.  I think this is the most important one for Distutils to address,
> and compilation isn't a part of it.  So far, though, what Gordon is doing
> seems more appropriate than what Distutils has been up to.  I hope his work
> gets folded into this.

I'm not sure what stuff by which Gordon you're referring to.  I am
only familiar with his installer, which I thought is win32 only (but
I may be mistaken) and is an installer for a whole application, not
just a bunch of modules.  Please correct me if I'm wrong.

But this reminds me of a different issue, which Jim Ahlstrom has been
hammering about before: there's a completely separate set of cases
where what you are distributing is a stand-alone application, and the
target consists of end users who are entirely uninterested in whether
it's written in Python, C or Elvish.  (And then there's still the
distinction between Win32, Unix or both.)  The current distutil dools
don't deal with this at all.  I think it should though, and I think
its framework is powerful enough to be able to add this, e.g. as a new
"appdist" command.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm at hypernet.com  Wed Dec  8 15:16:07 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Wed, 8 Dec 1999 09:16:07 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <199912081207.HAA00040@eric.cnri.reston.va.us>
References: Your message of "Wed, 08 Dec 1999 02:46:02 EST."             <000201bf4150$46749da0$5aa2143f@tim> 
Message-ID: <1267460464-31845181@hypernet.com>

Guido wrote:

> [Great analysis, Tim!]
> 
> > 4) The audience is Python end-users "in general", and the
> > product is pure Python.  I think this is the most important one
> > for Distutils to address, and compilation isn't a part of it. 
> > So far, though, what Gordon is doing seems more appropriate
> > than what Distutils has been up to.  I hope his work gets
> > folded into this.
> 
> I'm not sure what stuff by which Gordon you're referring to.  I
> am only familiar with his installer, which I thought is win32
> only (but I may be mistaken) and is an installer for a whole
> application, not just a bunch of modules.  Please correct me if
> I'm wrong.

It needed a name. I hate the word "Installer", but it expresses 
in one word the most common use of my stuff.

I'll be releasing a beta for Linux real soon. Only some of the 
tricks are Windows only (such as self-extracting executables, 
which is only culturally appropriate on Windows, anyway).

But more importantly it's not just for installing. The Python I 
use (interactively) on my wife's machine is 1 directory with 
about 6 files in it. On my Linux box I've been using the std lib 
in a .pyz for about a month now. Someone distributing a pure 
Python package could instead ship 3 files (imputil.py, 
archive.py and <package>.pyz) with the "install" consisting of 
adding one line to site.py in the user's perfectly normal Python 
installation.

And yeah, I solved the "manifest" problem, too. Mine predates 
Distutils, so don't accuse me of duplicate effort, (I pointed 
them to it a couple times). It uses ConfigParser and a config 
file, so it allows finer control.

While .pyz's are completely cross-platform, I have yet to work 
out endianness issues in the other archive I use (which should 
probably be zip format - it can hold anything). And at the 
"Installer" end, I have yet to work out how things should work 
on non-ELF/COFF platforms (where I can't append the archive 
to the executable). But there aren't any technical issues 
involved; just lack of time.

So no, it's not just for Windows; and no, it's not just for 
creating standalones (though that's what almost everyone 
uses it for).

- Gordon


From guido at CNRI.Reston.VA.US  Wed Dec  8 15:56:42 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 08 Dec 1999 09:56:42 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Wed, 08 Dec 1999 09:16:07 EST."
             <1267460464-31845181@hypernet.com> 
References: Your message of "Wed, 08 Dec 1999 02:46:02 EST." <000201bf4150$46749da0$5aa2143f@tim>  
            <1267460464-31845181@hypernet.com> 
Message-ID: <199912081456.JAA00200@eric.cnri.reston.va.us>

> It needed a name. I hate the word "Installer", but it expresses 
> in one word the most common use of my stuff.
> 
> I'll be releasing a beta for Linux real soon. Only some of the 
> tricks are Windows only (such as self-extracting executables, 
> which is only culturally appropriate on Windows, anyway).
> 
> But more importantly it's not just for installing. The Python I 
> use (interactively) on my wife's machine is 1 directory with 
> about 6 files in it. On my Linux box I've been using the std lib 
> in a .pyz for about a month now. Someone distributing a pure 
> Python package could instead ship 3 files (imputil.py, 
> archive.py and <package>.pyz) with the "install" consisting of 
> adding one line to site.py in the user's perfectly normal Python 
> installation.
> 
> And yeah, I solved the "manifest" problem, too. Mine predates 
> Distutils, so don't accuse me of duplicate effort, (I pointed 
> them to it a couple times). It uses ConfigParser and a config 
> file, so it allows finer control.
> 
> While .pyz's are completely cross-platform, I have yet to work 
> out endianness issues in the other archive I use (which should 
> probably be zip format - it can hold anything). And at the 
> "Installer" end, I have yet to work out how things should work 
> on non-ELF/COFF platforms (where I can't append the archive 
> to the executable). But there aren't any technical issues 
> involved; just lack of time.
> 
> So no, it's not just for Windows; and no, it's not just for 
> creating standalones (though that's what almost everyone 
> uses it for).

Gordon, I'm sorry, but from this description I still have no idea what
your stuff is (and I forgot the URL so I can't look it up).  For
example, if it's not (just) for installing, what *is* it for?

What is the ``"manifest" problem'' and how did you solve it?

Also, note that editing site.py is a no-no!  You can create/edit
sitecustomize.py, but you should leave site.py alone!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm at hypernet.com  Wed Dec  8 17:17:03 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Wed, 8 Dec 1999 11:17:03 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <199912081456.JAA00200@eric.cnri.reston.va.us>
References: Your message of "Wed, 08 Dec 1999 09:16:07 EST."             <1267460464-31845181@hypernet.com> 
Message-ID: <1267453215-32281635@hypernet.com>

Guido,
 
> Gordon, I'm sorry, but from this description I still have no idea
> what your stuff is (and I forgot the URL so I can't look it up). 

http://starship.python.org/crew/gmcm/installer.html

The Linux stuff has a couple alpha testers and will probably 
get announced in a week or two.

> For example, if it's not (just) for installing, what *is* it for?
 
At the bottom level, it's a bunch of tools using freeze's 
modulefinder, imputil.py and 2 kinds of archives. There's at 
least 2 layers above that, with "Installer" being the top.  
There's a clean separation between the layers, so you can 
break in wherever you like.

> What is the ``"manifest" problem'' and how did you solve it?

The problem is specifying a set of resources, hopefully without 
having to list them explicitly. I solve this with a config file that 
lets you specify packages, directories, directory trees.. with 
filters that can work from paths, names, extensions, regular 
expressions...
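
A minimal sketch of that kind of manifest, assuming a made-up config
layout (the [resources] section and its include and exclude_regex keys
are hypothetical, not the Installer's actual format), written against
the current configparser module:

```python
import configparser
import fnmatch
import re

# Hypothetical manifest: glob patterns to include, one regex to exclude.
CONFIG = """
[resources]
include = *.py, *.txt
exclude_regex = (^|/)test_
"""


def select(paths, config_text=CONFIG):
    """Return the paths picked by the manifest, in their original order."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = parser["resources"]
    patterns = [p.strip() for p in section["include"].split(",")]
    exclude = re.compile(section["exclude_regex"])
    chosen = []
    for path in paths:
        name = path.rsplit("/", 1)[-1]  # match globs against the file name
        wanted = any(fnmatch.fnmatchcase(name, pat) for pat in patterns)
        if wanted and not exclude.search(path):
            chosen.append(path)
    return chosen
```

The caller supplies the candidate paths (for example from walking a
directory tree), so the resources never have to be listed one by one.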
 
> Also, note that editing site.py is a no-no!  You can create/edit
> sitecustomize.py, but you should leave site.py alone!

That would work fine. One of the standalone configurations will 
write a site.py, but that's for a completely self-contained 
installation (ie, one which will have no conflicts with another 
Python installation). 

I'd also note that, for Windows at least, the path-expanding 
mechanism created by site.py has not caught on. I've got lots 
installed, and no site-python, site-packages or sitecustomize.


- Gordon


From guido at CNRI.Reston.VA.US  Wed Dec  8 17:23:34 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 08 Dec 1999 11:23:34 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Wed, 08 Dec 1999 11:17:03 EST."
             <1267453215-32281635@hypernet.com> 
References: Your message of "Wed, 08 Dec 1999 09:16:07 EST." <1267460464-31845181@hypernet.com>  
            <1267453215-32281635@hypernet.com> 
Message-ID: <199912081623.LAA04119@eric.cnri.reston.va.us>

[me]
> > Also, note that editing site.py is a no-no!  You can create/edit
> > sitecustomize.py, but you should leave site.py alone!

[Gordon]
> That would work fine. One of the standalone configurations will 
> write a site.py, but that's for a completely self-contained 
> installation (ie, one which will have no conflicts with another 
> Python installation). 
> 
> I'd also note that, for Windows at least, the path-expanding 
> mechanism created by site.py has not caught on. I've got lots 
> installed, and no site-python, site-packages or sitecustomize.

You shouldn't see site-python or site-packages, they only exist on
Unix.  On Windows, everything is installed in the top Python
directory.  However you should see .pth files there, which is what
site.py looks for.  I believe NumPy and PIL use those.
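
A small sketch of the .pth mechanism, using temporary directories in
place of a real installation. site.addsitedir() is the standard library
call that processes .pth files; the helper names and the file name
extras.pth are hypothetical.

```python
import os
import site
import sys
import tempfile


def register_via_pth(site_dir, package_dir, pth_name="extras.pth"):
    """Write a one-line .pth file into site_dir, then let site process it."""
    with open(os.path.join(site_dir, pth_name), "w") as handle:
        handle.write(package_dir + "\n")
    # addsitedir() adds site_dir itself (if new) and every existing
    # directory named in the .pth files it finds there.
    site.addsitedir(site_dir)


def demo():
    site_dir = tempfile.mkdtemp()     # stands in for the top Python directory
    package_dir = tempfile.mkdtemp()  # stands in for an unzipped package
    register_via_pth(site_dir, package_dir)
    return package_dir in sys.path
```

Because the package lives outside the Python directory and only the
one-line .pth file points at it, the package itself need not be touched
when Python is upgraded.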

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm at hypernet.com  Wed Dec  8 17:55:51 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Wed, 8 Dec 1999 11:55:51 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <199912081623.LAA04119@eric.cnri.reston.va.us>
References: Your message of "Wed, 08 Dec 1999 11:17:03 EST."             <1267453215-32281635@hypernet.com> 
Message-ID: <1267450887-32421651@hypernet.com>

> [Gordon]
> > That would work fine. One of the standalone configurations will
> > write a site.py, but that's for a completely self-contained
> > installation (ie, one which will have no conflicts with another
> > Python installation). 
> > 
> > I'd also note that, for Windows at least, the path-expanding
> > mechanism created by site.py has not caught on. I've got lots
> > installed, and no site-python, site-packages or sitecustomize.
[Guido] 
> You shouldn't see site-python or site-packages, they only exist
> on Unix.  

You mean "they only exist _for_ Unix", (site.py looks for them 
on Windows). I don't like that. For one thing, modulo a few 
platform differences, the same mechanism should work for 
multi-user Unix and Windows LAN installations. And single-
user Windows (I know, redundant, even on NT) should be a 
degenerate case of the above.

> On Windows, everything is installed in the top Python
> directory.  However you should see .pth files there, which is
> what site.py looks for.  I believe NumPy and PIL use those.

No NumPy, no PIL, no .pth files. 99% of everything out there 
just says "unzip this somewhere on your Python path".

In this case, Jim Ahlstrom may be right - there are too many 
options, or at least an insufficiently emphasized "proper" 
method. Until I worked out my own way of installing stuff, I 
used to lose a large number of packages whenever I upgraded 
my Windows Python.

Much as I love Mark's stuff (and hesitate to criticize crazy 
Aussies), I wish there weren't so much special casing here for 
Windows.

And no, I don't have any solutions to this, I'm just griping...

- Gordon


From guido at CNRI.Reston.VA.US  Wed Dec  8 18:07:30 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 08 Dec 1999 12:07:30 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Wed, 08 Dec 1999 11:55:51 EST."
             <1267450887-32421651@hypernet.com> 
References: Your message of "Wed, 08 Dec 1999 11:17:03 EST." <1267453215-32281635@hypernet.com>  
            <1267450887-32421651@hypernet.com> 
Message-ID: <199912081707.MAA04242@eric.cnri.reston.va.us>

> [Guido] 
> > You shouldn't see site-python or site-packages, they only exist
> > on Unix.  

[Gordon]
> You mean "they only exist _for_ Unix", (site.py looks for them 
> on Windows).

No it doesn't.  The code in site.py only adds site-packages and
site-python when os.sep is '/'.  RTSL.
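
Paraphrased as a sketch (this is a restatement of the rule just given,
not a copy of site.py; the function name and the version default are
assumptions):

```python
import posixpath


def candidate_site_dirs(prefix, sep, version="1.5"):
    """Directories a site.py-style scan would consider for one prefix."""
    if sep == "/":
        # Unix layout: a versioned site-packages plus a shared site-python.
        return [
            posixpath.join(prefix, "lib", "python" + version, "site-packages"),
            posixpath.join(prefix, "lib", "site-python"),
        ]
    # Any other separator: only the prefix directory itself.
    return [prefix]
```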

> I don't like that. For one thing, modulo a few 
> platform differences, the same mechanism should work for 
> multi-user Unix and Windows LAN installations. And single-
> user Windows (I know, redundant, even on NT) should be a 
> degenerate case of the above.

What do you mean by "the same mechanism should work"?  The same
mechanism for what?  Are you talking about sharing the installed
files somehow?

> > On Windows, everything is installed in the top Python
> > directory.  However you should see .pth files there, which is
> > what site.py looks for.  I believe NumPy and PIL use those.
> 
> No NumPy, no PIL, no .pth files. 99% of everything out there 
> just says "unzip this somewhere on your Python path".

Fair enough.  Of course I know about .pth files so I unzipped them
elsewhere and added a .pth file pointing there...

> In this case, Jim Ahlstrom may be right - there are too many 
> options, or at least an insufficiently emphasized "proper" 
> method. Until I worked out my own way of installing stuff, I 
> used to lose a large number of packages whenever I upgraded 
> my Windows Python.

The .pth files are designed for this.  Maybe they haven't been
explained as well as they should.

> Much as I love Mark's stuff (and hesitate to criticize crazy 
> Aussies), I wish there weren't so much special casing here for 
> Windows.

It's not Mark's fault, it's Microsoft's fault.  If you don't do things
the way MS wants you to, experienced Windows users will gripe,
misunderstand what you do, etc.

> And no, I don't have any solutions to this, I'm just griping...

Ditto.  Understanding the problems is half of the solution though.
The problems seem pretty complex!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm at hypernet.com  Wed Dec  8 19:25:50 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Wed, 8 Dec 1999 13:25:50 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <199912081707.MAA04242@eric.cnri.reston.va.us>
References: Your message of "Wed, 08 Dec 1999 11:55:51 EST."             <1267450887-32421651@hypernet.com> 
Message-ID: <1267445488-32746429@hypernet.com>

[Guido] 
> No it doesn't.  The code in site.py only adds site-packages and
> site-python when os.sep is '/'.  RTSL.

Oops. Missed that.

> > I don't like that. For one thing, modulo a few 
> > platform differences, the same mechanism should work for 
> > multi-user Unix and Windows LAN installations. And single-user
> > Windows (I know, redundant, even on NT) should be a degenerate
> > case of the above.
> 
> What do you mean by "the same mechanism should work"?  The same
> mechanism for what?  Are you talking about sharing the installed
> files somehow?

In the above, "mechanism" basically meant that which creates 
sys.path. 

Basically, this came up for me because in standalone 
configurations (my Installer again), I have to take complete 
control of sys.path. After doing so differently on Windows and 
Linux, I finally realized that I can do it the same way on both.
 
Which makes me question why they are so different.

> The .pth files are designed for this.  Maybe they haven't been
> explained as well as they should.

I'd say "badgered" or "browbeaten" instead of "explained" ;-).
 
> > Much as I love Mark's stuff (and hesitate to criticize crazy
> > Aussies), I wish there weren't so much special casing here for
> > Windows.
> 
> It's not Mark's fault, it's Microsoft's fault.  If you don't do
> things the way MS wants you to, experienced Windows users will
> gripe, misunderstand what you do, etc.

Even MS doesn't do things the way MS says they want you to.

I find MS users equally divided between those who scream 
bloody murder if you touch the registry, and those who 
scream if you don't.

It's not like *nixen suffer from an excessive degree of 
conformity in preferred installation procedures, but somehow 
Python survives there...

> > And no, I don't have any solutions to this, I'm just griping...
> 
> Ditto.  Understanding the problems is half of the solution
> though. The problems seem pretty complex!

Grumpily agreed ;-).


- Gordon


From jim at interet.com  Wed Dec  8 19:33:51 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Wed, 08 Dec 1999 13:33:51 -0500
Subject: [Python-Dev] Linux Journal confirms evil rumor
References: Your message of "Wed, 08 Dec 1999 11:17:03 EST." <1267453215-32281635@hypernet.com>  
	            <1267450887-32421651@hypernet.com> <199912081707.MAA04242@eric.cnri.reston.va.us>
Message-ID: <384EA48F.F5190180@interet.com>

I finally got around to reading the current Linux
Journal (which just keeps getting better and better)
and lo! there was a picture of a familiar face I just
couldn't quite....

Oh no!  Could it be true?  I heard rumors but I refused to
believe them until now.  The glasses are gone!  Guido now
looks like an investment banker!  The sky is falling!

Next will probably be a Python 1.6 as a 27 Meg DLL, and
a Python IPO.  Well, maybe not.  Now that I look more
closely, he is wearing a black and white and mustard
(??MUSTARD) T-shirt which says "You Need Python".

At least we ought to make him wear a name tag at IPC8.

JimA


From fdrake at acm.org  Wed Dec  8 19:37:44 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 8 Dec 1999 13:37:44 -0500 (EST)
Subject: [Python-Dev] Linux Journal confirms evil rumor
In-Reply-To: <384EA48F.F5190180@interet.com>
References: <1267453215-32281635@hypernet.com>
	<1267450887-32421651@hypernet.com>
	<199912081707.MAA04242@eric.cnri.reston.va.us>
	<384EA48F.F5190180@interet.com>
Message-ID: <14414.42360.309237.967766@weyr.cnri.reston.va.us>

James C. Ahlstrom writes:
 > Oh no!  Could it be true?  I heard rumors but I refused to
 > believe them until now.  The glasses are gone!  Guido now
 > looks like an investment banker!  The sky is falling!

  I'm afraid this non-distinctive look was introduced at IPC7... it's
too bad we can't tell people Python was invented by the guy with the
glasses anymore.

 > Next will probably be a Python 1.6 as a 27 Meg DLL, and
 > a Python IPO.  Well, maybe not.  Now that I look more
 > closely, he is wearing a black and white and mustard
 > (??MUSTARD) T-shirt which says "You Need Python".

  It's really the blue & white & orange IPC7 shirt.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From bwarsaw at cnri.reston.va.us  Wed Dec  8 19:41:51 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Wed, 8 Dec 1999 13:41:51 -0500 (EST)
Subject: [Python-Dev] Linux Journal confirms evil rumor
References: <1267453215-32281635@hypernet.com>
	<1267450887-32421651@hypernet.com>
	<199912081707.MAA04242@eric.cnri.reston.va.us>
	<384EA48F.F5190180@interet.com>
Message-ID: <14414.42607.701538.783684@anthem.cnri.reston.va.us>

>>>>> "JCA" == James C Ahlstrom <jim at interet.com> writes:

    JCA> Oh no!  Could it be true?  I heard rumors but I refused to
    JCA> believe them until now.  The glasses are gone!  Guido now
    JCA> looks like an investment banker!  The sky is falling!

He's not the only one who's, like, "gone corporate", but I won't
mention any names, so as to protect the guilty.


From jim at digicool.com  Wed Dec  8 20:03:42 1999
From: jim at digicool.com (Jim Fulton)
Date: Wed, 08 Dec 1999 14:03:42 -0500
Subject: [Python-Dev] Linux Journal confirms evil rumor
References: <1267453215-32281635@hypernet.com>
		<1267450887-32421651@hypernet.com>
		<199912081707.MAA04242@eric.cnri.reston.va.us>
		<384EA48F.F5190180@interet.com> <14414.42607.701538.783684@anthem.cnri.reston.va.us>
Message-ID: <384EAB8E.EBA595B5@digicool.com>

"Barry A. Warsaw" wrote:
> 
> He's not the only one who's, like, "gone corporate", but I won't
> mention any names, so as to protect the guilty.

OK, Buzz.

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission.  Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.


From tim_one at email.msn.com  Thu Dec  9 06:31:52 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 9 Dec 1999 00:31:52 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <199912081207.HAA00040@eric.cnri.reston.va.us>
Message-ID: <000301bf4206$b39e5b80$36a2143f@tim>

[Guido]
> [Great analysis, Tim!]

I beg to differ:  it's internally inconsistent and should have identified at
least 3 axes and hence at least 8 cases.  Still, you got more than you paid
for <wink>.

>> 4) The audience is Python end-users "in general", and the
>> product is pure Python.  I think this is the most important one
>> for Distutils to address, and compilation isn't a part of it.
>> So far, though, what Gordon is doing seems more appropriate
>> than what Distutils has been up to.  I hope his work gets folded
>> into this.

> I'm not sure what stuff by which Gordon you're referring to.

You guessed right!

> I am only familiar with his installer, which I thought is win32
> only (but I may be mistaken) and is an installer for a whole
> application, not just a bunch of modules.  Please correct me if
> I'm wrong.

If it can install a whole app, what makes you suspect it couldn't install
just a bunch of modules <0.5 wink>?

It started life as Windows-only, and I believe it's been virtually ignored
by non-Windows folk because of that.  Bad blind spot.  It supplies
already-working approaches to many of the issues that are still being
*talked* about on Distutils (at least archive formats, code to manipulate
same, manifest files (how do you tell the tool which files to package?), and
transparently bundling a Python interpreter when needed).

> But this reminds me of a different issue, which Jim Ahlstrom has
> been hammering about before: there's a completely separate set of
> cases where what you are distributing is a stand-alone application,
> and the target consists of end users who are entirely uninterested
> in whether it's written in Python, C or Elvish.

I include part of that in my case #4 above, where the app happens to be
written in Pure Python -- but the user doesn't have to know that.  Gordon is
addressing at least that part of it.  AFAIK he can't deal with transparently
compiling C or exorcising Elvish on the target platform, but if you're just
distributing the binaries I expect his work is directly usable already.

> (And then there's still the distinction between Win32, Unix or
> both.)

I vote "both".  The world really doesn't need another Win32-only (or
Unix-only) installer, archive format, compression format, or distribution
model.

Jim seems mostly interested in Win32-only to me, and his concerns haven't
been about the mechanics of distribution but about how-- regardless of
tool --to create a bulletproof Python installation by hook or by crook.
Last time we went thru this, it was concluded that one couldn't without
patching the Python Windows binary with a resource editor (to point to its
own infernal <0.5 wink> registry entries).

Distutils hasn't talked about that at all (that I've seen, anyway); if there
were a less radical approach to that, I suspect Jim would be delighted to
use one of the commercial Win32 installation pkgs (and if that's what his
customers expect, delighted or not that's what he'll do).

> The current distutils tools don't deal with this at all.

That's why I said I thought what Gordon is doing seems more appropriate to
case #4 than what Distutils has been doing.

> I think it should though,

Ditto.

> and I think its framework is powerful enough to be able to
> add this, e.g. as a new "appdist" command.

I cordially invite (since Gordon will uncordially browbeat <wink>) people to
look seriously at what he's done.  Best I can tell, for apps that don't need
compilation "on the other end", it's mostly "there" already!

give-the-man-a-hand-ly y'rs  - tim


From tim_one at email.msn.com  Thu Dec  9 06:52:23 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 9 Dec 1999 00:52:23 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <1267453215-32281635@hypernet.com>
Message-ID: <000601bf4209$90a90c80$36a2143f@tim>

> http://starship.python.org/crew/gmcm/installer.html

Eh?  Doesn't work for me.  This does:

    http://starship.python.net/crew/gmcm/distribute.html


From tim_one at email.msn.com  Thu Dec  9 07:38:54 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 9 Dec 1999 01:38:54 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <199912081707.MAA04242@eric.cnri.reston.va.us>
Message-ID: <000701bf4210$10925a40$36a2143f@tim>

[Gordon]
>> Much as I love Mark's stuff (and hesitate to criticize crazy
>> Aussies), I wish there weren't so much special casing here for
>> Windows.

[Guido]
> It's not Mark's fault, it's Microsoft's fault.  If you don't do
> things the way MS wants you to, experienced Windows users will
> gripe, misunderstand what you do, etc.

Something just occurred to me:  MS's guidelines aren't arbitrary, they
actually have very good reasons.  In the case of putting all an app's
crucial info in the Registry, it's the only way to allow a site
administrator to set policy and site options remotely (an admin can fiddle
other machines' registries remotely).  This works very well indeed when
there's only "one copy" of an app on a machine (or at most one copy "per
user").

What just occurred to me is that JimA is concerned with *not* letting any
info from a previously-installed Python affect the app he's installing.
Similarly, Gordon's Win32 "standalone installer" modifies python.exe and
pythonw.exe to use a PYTHONPATH he forces, leaving the registry out of it.
Similarly, the woes I've had in trying to sell Python as a general Win32
scripting tool at work mostly boil down to that there's no effortless way to
do it that doesn't risk picking up info from-- or forcing info
onto --pre-existing or future distinct Python installations (in contrast,
Perl "just works" in this respect).

IOW, the three of us find getting path info out of the registry intolerable
because we are in fact trying to do the opposite of what the registry
mechanism was *designed* for:  we want perfect isolation, not perfect
sharing.

This has come up on Python-Help a few times too, in the guise of someone
installing a product that in turn installs an older version of Python, which
in turn confuses another product that relies on features in a newer version
of Python.

So while the traditional Windows .ini file (like Unix this-or-that.rc file)
model was replaced by the registry for excellent reasons, those reasons
don't apply to the way we're using Python!  The .ini file model was exactly
right for what most of us seem to want to do, and the registry model is
exactly wrong.
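
To illustrate the .ini model, here is a minimal sketch in which each
application keeps its own search path in a file beside it, so that no
other installation can read or disturb it. The file name app.ini, the
[python] section, the path key and the ";" separator are all
hypothetical choices made for this example.

```python
import configparser
import os


def path_from_ini(app_dir, ini_name="app.ini"):
    """Read a private module search path from an .ini file in app_dir."""
    parser = configparser.ConfigParser()
    parser.read(os.path.join(app_dir, ini_name))  # a missing file is ignored
    raw = parser.get("python", "path", fallback="")
    # Entries are relative to the application directory.
    return [
        os.path.normpath(os.path.join(app_dir, part.strip()))
        for part in raw.split(";")
        if part.strip()
    ]
```

Two applications with two such files get two unrelated paths, which is
the isolation the registry approach makes hard.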

just-thought-i'd-cheer-you-up<wink>-ly y'rs  - tim


From skip at mojam.com  Thu Dec  9 08:38:36 1999
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 9 Dec 1999 01:38:36 -0600 (CST)
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <000701bf4210$10925a40$36a2143f@tim>
References: <199912081707.MAA04242@eric.cnri.reston.va.us>
	<000701bf4210$10925a40$36a2143f@tim>
Message-ID: <14415.23676.775163.786028@dolphin.mojam.com>

    Tim> So while the traditional Windows .ini file (like Unix
    Tim> this-or-that.rc file) model was replaced by the registry for
    Tim> excellent reasons, those reasons don't apply to the way we're using
    Tim> Python!  The .ini file model was exactly right for what most of us
    Tim> seem to want to do, and the registry model is exactly wrong.

Alright!  Now I understand what all the hubbub is about!  My eyes have
mostly been glazing over trying to follow all this Windows registry/path/ini
stuff.  MS believes that Python is the application.  Those of us writing
Python programs view those programs as the applications, not the Python
interpreter per se.  Is there some way that people writing applications in
Python can set up registry entries that are specific to their application
(e.g. tabnanny.py) instead of only specific to the Python interpreter?

Skip Montanaro | http://www.mojam.com/
skip at mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...


From gmcm at hypernet.com  Thu Dec  9 15:17:27 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Thu, 9 Dec 1999 09:17:27 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <000701bf4210$10925a40$36a2143f@tim>
References: <199912081707.MAA04242@eric.cnri.reston.va.us>
Message-ID: <1267374045-37047016@hypernet.com>

[Guido]
> > It's not Mark's fault, it's Microsoft's fault.  If you don't do
> > things the way MS wants you to, experienced Windows users will
> > gripe, misunderstand what you do, etc.
[Tim] 
> Something just occurred to me:  MS's guidelines aren't arbitrary,
> they actually have very good reasons.  In the case of putting all
> an app's crucial info in the Registry, it's the only way to allow
> a site administrator to set policy and site options remotely (an
> admin can fiddle other machines' registries remotely).  This
> works very well indeed when there's only "one copy" of an app on
> a machine (or at most one copy "per user").

And actually, the business about separate subtrees for the 
machine's configuration and the user's configuration is pretty 
clever. MS doesn't explain it well, and it gets misused, but 
when done right, it's a lot simpler than the maze of .xxxrc files 
you sometimes find in other OSes.
 
> What just occurred to me is that JimA is concerned with *not*
> letting any info from a previously-installed Python affect the
> app he's installing. Similarly, Gordon's Win32 "standalone
> installer" modifies python.exe and pythonw.exe to use a
> PYTHONPATH he forces, leaving the registry out of it. Similarly,
> the woes I've had in trying to sell Python as a general Win32
> scripting tool at work mostly boil down to that there's no
> effortless way to do it that doesn't risk picking up info from--
> or forcing info onto --pre-existing or future distinct Python
> installations (in contrast, Perl "just works" in this respect).

In my Linux version, I went to the heart of the matter - 
getpath.c. It occurs to me that getpath.c might do better to 
follow a normal bootstrap process - ie,  create the absolute 
minimal sys.path required to go to the next step. Then the 
rest of what goes on in getpath.c could be written in Python. 
Maybe that Python code needs to get frozen in (to prevent 
bozos from destroying an installation by stepping on 
getpath.py), but it would make it a lot easier to create 
independent installations, and also reduce the variations 
between platforms at the C level. (Then again, I've never heard 
of anyone stepping on exceptions.py.)

If some registry manipulation primitives were exposed (say, 
through ntpath) that would mean that Windows developers 
could (if they wanted) play by the MS rules with at least the 
option of not stepping on each other.
 

- Gordon


From jim at interet.com  Thu Dec  9 16:02:18 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 09 Dec 1999 10:02:18 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim>
Message-ID: <384FC47A.BB4DA517@interet.com>

Tim Peters wrote:

> Jim seems mostly interested in Win32-only to me, and his concerns haven't
> been about the mechanics of distribution but about how-- regardless of
> tool --to create a bulletproof Python installation by hook or by crook.

Not exactly.  I am interested in how to create a bullet-proof
installation.  But I am equally interested in Unix (especially Linux)
and dislike the current dichotomy in the code base.

Lately I have been more active in distribution via archive files.
Part of the solution is an archive file format which is identical on
Unix and Windows, and which can hold the Python library and packages
as single files.  For my own efforts on this see:

    ftp://ftp.interet.com/pub/pylib.html

This is an archive file format similar to Gordon's format, although
Gordon's work goes well beyond just file formats.  I currently have
fifth generation code for this format, and am adding features as
suggested by Fredrik Lundh.  I hope it gets considered as a candidate
for a Python standard format.

> Distutils hasn't talked about that at all (that I've seen, anyway);

Gordon, Greg Stein and I have discussed file formats before.  I think
it was on distutils.  Anyway that was months ago.

JimA


From guido at CNRI.Reston.VA.US  Thu Dec  9 17:17:18 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 11:17:18 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Thu, 09 Dec 1999 09:17:27 EST."
             <1267374045-37047016@hypernet.com> 
References: <199912081707.MAA04242@eric.cnri.reston.va.us>  
            <1267374045-37047016@hypernet.com> 
Message-ID: <199912091617.LAA05742@eric.cnri.reston.va.us>

> [Guido]
> > > It's not Mark's fault, it's Microsoft's fault.  If you don't do
> > > things the way MS wants you to, experienced Windows users will
> > > gripe, misunderstand what you do, etc.
> [Tim] 
> > Something just occurred to me:  MS's guidelines aren't arbitrary,
> > they actually have very good reasons.  In the case of putting all
> > an app's crucial info in the Registry, it's the only way to allow
> > a site administrator to set policy and site options remotely (an
> > admin can fiddle other machines' registries remotely).  This
> > works very well indeed when there's only "one copy" of an app on
> > a machine (or at most one copy "per user").
[Gordon]
> And actually, the business about separate subtrees for the 
> machine's configuration and the user's configuration is pretty 
> clever. MS doesn't explain it well, and it gets misused, but 
> when done right, it's a lot simpler than the maze of .xxxrc files 
> you sometimes find in other OSes.

I agree.  And I am guilty of not even trying to find MS' explanation -- I
just looked in the registry at what other apps did and tried to mimic
that (plus what Mark had already done), without really knowing what I
was doing.  I now know a little better -- see the end of this message.

> In my Linux version, I went to the heart of the matter - 
> getpath.c. It occurs to me that getpath.c might do better to 
> follow a normal bootstrap process - ie,  create the absolute 
> minimal sys.path required to go to the next step. Then the 
> rest of what goes on in getpath.c could be written in Python. 
> Maybe that Python code needs to get frozen in (to prevent 
> bozos from destroying an installation by stepping on 
> getpath.py), but it would make it a lot easier to create 
> independent installations, and also reduce the variations 
> between platforms at the C level. (Then again, I've never heard 
> of anyone stepping on exceptions.py.)

Yes, this is exactly what was proposed in the thread on the Big Import
Rewrite.

> If some registry manipulation primitives were exposed (say, 
> through ntpath) that would mean that Windows developers 
> could (if they wanted) play by the MS rules with at least the 
> option of not stepping on each other.

That's a good idea.  These functions are already available through
Mark's win32api extension -- much of which will eventually (I hope
before 1.6 is out!) become part of the core distribution.

In the mean time, I've been thinking a bit more about how Python
should be using the Windows registry.  (It's clear to me that Python
should use the registry -- those who disagree can go build their own
Python distribution.)

The basic ideas of Python's current registry usage are sound: there's
a resource built into the DLL which is part of the key into the
registry used for all information.

The problem lies in which key is used.  All versions of Python 1.5.x
(1.5, 1.5.1, 1.5.2) use the same key!  This is a main cause of
trouble, because it means that different versions cannot peacefully
live together even if the user installs them into different
directories -- they will all use the registry keys of the last version
installed.  This, in turn, means that someone who writes a Python
application that has a dependency on a particular Python version (and
which application worth distributing doesn't :-) cannot trust that if
a Python installation is present, it is the right one.  But they also
cannot simply bundle the standard installer for the correct Python
version with their program, because its installation would overwrite
an existing Python installation, thus breaking some *other* Python apps
that the user might already have installed.

(There's a solution for app builders who are willing to do a lot of
work -- you can change the registry key resource in the DLL.  For
example, Alice comes with its own version of Python 1.5.1 and it uses
"1.5.1-alice" as its registry key.  The Alice installer installs
Python in a subdirectory of the Alice installation directory and
points the 1.5.1-alice registry entries there.  The problem is that
this is a lot of work for the average app builder.)
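
A minimal sketch of the keying idea discussed above, assuming the
interpreter's settings live under a Software\Python\PythonCore subkey
named after the version.  The function name registry_key and the vendor
argument are hypothetical, introduced only for illustration; the sketch
builds the key name and does not touch the registry.

```python
PYTHON_CORE = r"Software\Python\PythonCore"   # assumed common prefix


def registry_key(version, vendor=None):
    """Return the registry subkey name for one Python installation.

    `version` is the full version string ("1.5.2", not just "1.5"), so
    that patch releases installed side by side get distinct keys.
    `vendor` is an optional tag for an app that ships a private copy.
    """
    tag = version if vendor is None else "%s-%s" % (version, vendor)
    return PYTHON_CORE + "\\" + tag


print(registry_key("1.5.2"))
print(registry_key("1.5.1", "alice"))
```

With the full version in the key, 1.5.1 and 1.5.2 no longer collide, and
a vendor tag gives the isolation that Alice gets by editing the DLL
resource by hand.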

I thought a bit about how VB solves this.  I think that when you wrap
up a VB app, all the support code (mostly a big DLL) is wrapped
with it.  When the user runs the installer, the DLL is installed
(probably in the WINDOWS directory).  If a user installs several VB
apps built with the same VB version, they all attempt to install the
exact same DLL; of course the installers notice this and optimize it
away, keeping a reference count.  (Ignoring for now the fact that
those reference counts don't always work!)  If an app built with a
different VB version is installed, it has a DLL with a different name,
and that is installed separately.  Other support files, I presume, are
dealt with in much the same way.  Voila, there's the theory.

How can we do something similar for Python?

An app written in Python should need to install only three or four
files:

- a driver EXE to start the app
- a copy of the Python DLL
- the Python library in an archive
- the app code in an archive

The latter two could be combined into a single archive, but I propose
that we use two archives so that the DLL and the Python library
archive can be shared between installations of independent Python apps
as long as they use the exact same Python version and don't need
additional 3rd party packages.  (I believe that Jim A's proposal
combines the archives with the EXE and the DLL, reducing the number of
files to two.  That's fine too.)
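
The two-archive layout can be sketched in present-day Python, where a
zip file may appear directly on sys.path.  This is an illustration only:
the archive and module names (pylib_sketch.zip, app_sketch.zip,
sketch_shared, sketch_app) are made up, and the archives hold source
files rather than PYC files to keep the example short.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def build_archive(path, modules):
    """Write {module_name: source_text} into a zip archive at `path`."""
    with zipfile.ZipFile(path, "w") as zf:
        for name, source in modules.items():
            zf.writestr(name + ".py", source)


workdir = tempfile.mkdtemp()
lib_zip = os.path.join(workdir, "pylib_sketch.zip")  # shareable library archive
app_zip = os.path.join(workdir, "app_sketch.zip")    # per-app archive

build_archive(lib_zip, {
    "sketch_shared": "def greet(who):\n    return 'hello ' + who\n",
})
build_archive(app_zip, {
    "sketch_app": "import sketch_shared\n\n"
                  "def main():\n    return sketch_shared.greet('app')\n",
})

# What a driver EXE would do at startup: app archive first, then library.
sys.path[:0] = [app_zip, lib_zip]
importlib.invalidate_caches()

import sketch_app

result = sketch_app.main()
print(result)
```

Because the library archive is a separate file, a second app with the
same Python version could put the same archive on its own path.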

Is there a use for the registry here at all?  Maybe not.  (I notice
that VB seems to have a single registry entry, pointing to a DLL; all
other VB files also seem to live there.)

Complications:

- Some apps may need a custom extension module, which has to be
  installed as a PYD file.  So it seems that there needs to be a
  directory per app, and perhaps per version of the app (if the app
  distributor cares).

- Some apps need other, non-pyc files (e.g. data tables or help
  files); it would be handy if these could be stored in the archives as
  well.

- Some standard extension modules are in their own PYD files; these
  also need to be installed.  They aren't typically marked with a
  version, so perhaps a path directory per version of Python (if not per
  installed app) is wise.

- How to distribute an app that needs 3rd party stuff, e.g. Tcl/Tk, or
  PIL, or NumPy?  Their Python code can easily be wrapped up in another
  archive with a standard name incorporating a version number; but the
  required PYD and DLL files are a separate story.  (E.g. for Tkinter,
  you need _tkinter.pyd which links against tcl80.dll.)  Basically the
  same solution as for standard PYD files can work; the needed DLL files
  can be installed either systemwide (if they have a reliable version
  number in their name, like tcl80.dll) or in the per-app or per-package
  directory (like NumPy).

- Presumably, the archives will contain PYC files only.  This means
  that tracebacks will not show source code, only line numbers.  For Jim
  A, this is probably exactly what he wants (if the user gets a
  traceback, his "robust app" has miserably failed, and he takes it in
  pride that this doesn't happen).  But for some others, access to the
  sources could be essential.

  For example, I might want to distribute IDLE using this mechanism;
  users of IDLE who are curious about the standard library (or about
  IDLE itself) should be able to open the source for an arbitrary module
  (and maybe even edit it, although that's not a priority and perhaps
  should even be discouraged).  Library source access is an important
  feature of the IDLE debugger as well.

  A way out for IDLE is to install a classic distribution of the Python
  library sources, into the filesystem at an IDLE specific location.
  Other apps, with only the need for source code in tracebacks, might
  choose to have the PY files in the archives sitting next to the PYC
  files, and somehow the traceback mechanism should be accessing the
  archive to get a hold of the source.
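
As a sketch of the last point, assuming the archive is a zip file that
also carries the PY files: present-day Python's zipimport loader can
hand back the source text, which is what a traceback printer would need.
The archive and module names below are invented for the example.

```python
import os
import tempfile
import zipfile
import zipimport

workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "with_source.zip")

# Store the PY file in the archive (a real build could add the PYC too).
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("sketch_mod.py", "def fail():\n    raise ValueError('boom')\n")

importer = zipimport.zipimporter(archive)
source = importer.get_source("sketch_mod")   # source text read from the archive
second_line = source.splitlines()[1]         # the line a traceback would show
print(second_line)
```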

And yes, I realize that Jim A's latest offering solves most of these
problems to a large extent -- well done.  (Jim, would you care to
comment on the issues that you don't address?  Will you address them
in a future version?)

Final notes:

There are two different problems here.  One is how to distribute
Python apps robustly to end users who don't particularly care about
Python.  This is Jim A's problem (and he has a solution that works for
him).  In general the solutions here try to isolate the installed app
from other Python installations.  I'm proposing that at least the DLL
and the Python library archive can probably be shared between apps
without reducing robustness if we keep track more carefully of version
numbers.

The other problem is how to distribute packages of Python and
extension modules for use by Python users.  These typically need to
drop into some existing Python installation.  This is Paul Dubois'
problem with NumPy (amongst others) and is the current focus of the
distutils SIG.

However I believe that there could be a lot of common infrastructure
that would help us create better solutions for both problems.  For
package distribution, common infrastructure (a.k.a. standards) is
essential.  For app distribution, common infrastructure isn't so
important (since the solutions strive for total isolation, there's no
problem if different apps use different solutions).  However, this
changes when app creators want to distribute robust self-sufficient
apps that use
3rd party packages -- then the 3rd party packages must allow being
packaged up using the app distribution creator of choice.

Solving this compound problem (creating package distributions that can
be redistributed easily as part of robust Python app distributions)
should be an important goal for the infrastructure we're building
here.  The Big Import Rewrite ought to add this to its list of
objectives if it isn't already on it.  My guess is that the solution
for this compound problem will increase the dependency of app
distribution tools on the package distribution infrastructure; which
to me seems like a Good Thing because it would lead to more code
sharing.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim at interet.com  Thu Dec  9 17:24:40 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 09 Dec 1999 11:24:40 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000701bf4210$10925a40$36a2143f@tim>
Message-ID: <384FD7C8.12832BF1@interet.com>

Tim Peters wrote:

> Something just occurred to me:  MS's guidelines aren't arbitrary, they
> actually have very good reasons.  In the case of putting all an app's
> crucial info in the Registry, it's the only way to allow a site
> administrator to set policy and site options remotely (an admin can fiddle
> other machines' registries remotely).  This works very well indeed when
> there's only "one copy" of an app on a machine (or at most one copy "per
> user").

The registry is still a bad idea because it lumps critical and app data
into single files and brings up the ugly problem of protecting
individual registry entries instead of just files.  Microsoft
should have put all app config into the app directory and provided
for remote admin of that.  But that is not really your point (just
ranting about the registry again).

> IOW, the three of us find getting path info out of the registry intolerable
> because we are in fact trying to do the opposite of what the registry
> mechanism was *designed* for:  we want perfect isolation, not perfect
> sharing.
> 
> This has come up on Python-Help a few times too, in the guise of someone
> installing a product that in turn installs an older version of Python, which
> in turn confuses another product that relies on features in a newer version
> of Python.

Or, in other words, no isolation is possible if critical info
depends on global data like PYTHONPATH or a _common_ registry
entry.  We could have different registry entries, but this is
confusing and not documented.

I think we can solve this with archive files in a way compatible
with Unix without going off on a Windows-only wavelength.  If the
archive file contains everything, and it is in the dir of the app,
and the app looks there and finds it, then it Just Works.

See also my reply to Skip.

JimA


From akuchlin at mems-exchange.org  Thu Dec  9 17:32:08 1999
From: akuchlin at mems-exchange.org (Andrew M. Kuchling)
Date: Thu, 9 Dec 1999 11:32:08 -0500 (EST)
Subject: [Python-Dev] Missing POSIX functions: the list
Message-ID: <199912091632.LAA09236@amarok.cnri.reston.va.us>

After poking around in the O'Reilly POSIX book, here's a list of POSIX
functions that don't seem to be available in Python.  Not all of them
seem worth supporting.   Ironically, Greg Ward's daemonize() Perl
subroutine, which started me on this, doesn't actually seem to need
anything that Python doesn't have.

I'm looking for corrections to the list; are there other POSIX
functions I've missed, or are some of them actually in Python?

I think implementing most of these functions is straightforward, with
the exception of opendir/readdir/closedir.

Worth adding?
=============
opendir(), readdir(), closedir() -- 
	   most of their functionality is available through
	   os.listdir(), but it might be useful to have a direct
	   interface.  Downside is that this would require a new
	   extension type for the C DIR struct.  My (lazy) inclination
	   is to not bother.

Worth adding:
=============

abort() -- used in Py_FatalError(), but not accessible to Python code

ctermid(), ctermid_r() -- returns the terminal pathname 
	   -- probably just add ctermid(), but use ctermid_r() for
	   thread-safety
            
fpathconf(fd, name) -- Get configuration limit for a file
	    -- would need constants from unistd.h

getlogin() -- returns user's login name
	 -- could do something similar with pwd.getpwuid( os.getuid() )[0], but
	 getlogin() apparently looks in utmp

getgroups(gidsetsize, grouplist) -- Gets supplementary group IDs

pathconf(path, name) -- Gets config variables for a path
	    -- would need constants from unistd.h

sysconf(int name) -- Gets system configuration information
	    -- would need constants from unistd.h

Not worth adding:
=================
clearerr() -- looks like fileobjects call clearerr() before raising errors

cuserid() -- returns user's login name
	  -- ORA book says "Do not use this function" -- removed in 1990 POSIX

difftime
	  -- seems only required in C "because no addition properties
	  are defined for time_t" (Solaris man page)

tmpfile(), tmpnam() -- Create temp file, generate temp filename
		    -- Similar functionality available in tempfile.py

mblen(), mbstowcs(), mbtowc(), wcstombs(),  wctomb()
	 -- Multi-byte character functions: 
	 -- Don't bother; wait for the Unicode type.
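
For comparison, a sketch of how several of the "worth adding" functions
can be called through the os module of present-day Python on a Unix
system.  The helper name posix_limits is hypothetical.  Which
configuration names exist depends on the platform, so the sketch checks
os.sysconf_names and os.pathconf_names before using them.

```python
import os


def posix_limits(path="."):
    """Collect a few POSIX values, skipping any the platform lacks."""
    info = {}
    if "SC_PAGE_SIZE" in getattr(os, "sysconf_names", {}):
        info["page_size"] = os.sysconf("SC_PAGE_SIZE")
    if "PC_NAME_MAX" in getattr(os, "pathconf_names", {}):
        try:
            info["name_max"] = os.pathconf(path, "PC_NAME_MAX")
        except OSError:
            pass  # this filesystem does not report the limit
    if hasattr(os, "getgroups"):
        info["groups"] = os.getgroups()      # supplementary group IDs
    if hasattr(os, "ctermid"):
        info["terminal"] = os.ctermid()      # pathname of controlling terminal
    return info


limits = posix_limits()
print(limits)
```

The constants from unistd.h mentioned in the list show up here as string
names rather than integers, so callers do not need platform-specific
numbers.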

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
I'm sorry I became abusive just now ... calling you worms... I was just
speaking relatively, you understand.
    -- Dekko, in ZOT! #3


From jcw at equi4.com  Thu Dec  9 17:38:13 1999
From: jcw at equi4.com (Jean-Claude Wippler)
Date: Thu, 09 Dec 1999 17:38:13 +0100
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com>
Message-ID: <384FDAF5.C25C447C@equi4.com>

"James C. Ahlstrom" wrote:

[...]
>     ftp://ftp.interet.com/pub/pylib.html

Ouch - what's wrong with zip archives?

There are utilities to convert to/from zip, to re-pack, to mount zip
transparently so its entries look like regular files, FTP servers, etc.

Both Java (jar) and Tcl (Jan Nijtman's "Wrap") have adopted this format.

Zips would seem natural with JPython.  And suppose that scripting ever
starts to consolidate to a common scripting kernel (yah, well), do you
really want a system which is closing all doors to cross-fertilization?

Zip has an advantage over .tar.gz in that its table of contents is
available without having to decompress the whole kaboodle.

Your format has no checksum, which for deployment and long-term storage
can be important.
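
Both points can be sketched with the zipfile module in present-day
Python; the member names are stand-ins.  The table of contents is read
from the central directory without decompressing any member, and each
entry carries a CRC-32 that can be verified on demand.

```python
import io
import zipfile

# Build a small compressed archive in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("string.py", "# stand-in module\n")
    zf.writestr("test/testall.py", "# stand-in test\n")

with zipfile.ZipFile(buf) as zf:
    names = zf.namelist()                        # central directory only
    crcs = {info.filename: info.CRC for info in zf.infolist()}
    first_bad = zf.testzip()                     # reads members, checks CRCs

print(names)
print(first_bad)   # None means every checksum matched
```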

If you want a marshalled TOC, then why not add a manifest entry for it,
sort of like what ranlib does with ar?

You designed the format so archives can be concatenated without any tool
(other than "cat"), but this works just as well with zip files, as the
Tcl Wrap approach demonstrates.

Allow me to very, very loosely paraphrase Guido here: sure, everyone can
design an archive format, but they are likely to make the same mistakes
all over again - so why not adopt a format which is tried and tested?

With all due respect - I sincerely hope you will reconsider and alter
your code to work with zip files.  It's probably a small adjustment?

Unless your *intent* is to create a diverging standard, of course...

-- Jean-Claude


From jim at interet.com  Thu Dec  9 17:46:35 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 09 Dec 1999 11:46:35 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <199912081707.MAA04242@eric.cnri.reston.va.us>
		<000701bf4210$10925a40$36a2143f@tim> <14415.23676.775163.786028@dolphin.mojam.com>
Message-ID: <384FDCEB.2226C1C1@interet.com>

Skip Montanaro wrote:

> MS believes that Python is the application.  Those of us writing
> Python programs view those programs as the applications, not the Python
> interpreter per se.

I think this is a good point.  Windows app programmers (mostly)
view Python as part of their app and try to install it in their
app directory.  Unix installs Python as a system app in multiple
versions and users use PATH to pick a version.  Unix users view
the Python interpreter as a system service which is needed for
running their app.

I think this is because a Windows app is a visual program,
and the Python release compiles to a console app (not really
a visual program).  So all (most?) Windows Python apps are
custom mains with Python as a component, but the stock
python.exe is not the main.
This makes it difficult to document a way to install Python
in the Unix fashion, since all apps need their own binary main
and python15.dll is the only thing in common.

IMHO archive files can solve this a lot more simply.

JimA


From guido at CNRI.Reston.VA.US  Thu Dec  9 17:55:40 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 11:55:40 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Thu, 09 Dec 1999 17:38:13 +0100."
             <384FDAF5.C25C447C@equi4.com> 
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com>  
            <384FDAF5.C25C447C@equi4.com> 
Message-ID: <199912091655.LAA05928@eric.cnri.reston.va.us>

> "James C. Ahlstrom" wrote:
> 
> [...]
> >     ftp://ftp.interet.com/pub/pylib.html

Jean-Claude Wippler replied:

> Ouch - what's wrong with zip archives?
> 
> There are utilities to convert to/from zip, to re-pack, to mount zip
> transparently so its entries look like regular files, FTP servers, etc.
> 
> Both Java (jar) and Tcl (Jan Nijtman's "Wrap") have adopted this format.
> 
> Zips would seem natural with JPython.  And suppose that scripting ever
> starts to consolidate to a common scripting kernel (yah, well), do you
> really want a system which is closing all doors to cross-fertilization?
> 
> Zip has an advantage over .tar.gz in that its table of contents is
> available without having to decompress the whole kaboodle.
> 
> Your format has no checksum, which for deployment and long-term storage
> can be important.
> 
> If you want a marshalled TOC, then why not add a manifest entry for it,
> sort of like what ranlib does with ar?
> 
> You designed the format so archives can be concatenated without any tool
> (other than "cat"), but this works just as well with zip files, as the
> Tcl Wrap approach demonstrates.
> 
> Allow me to very, very loosely paraphrase Guido here: sure, everyone can
> design an archive format, but they are likely to make the same mistakes
> all over again - so why not adopt a format which is tried and tested?
> 
> With all due respect - I sincerely hope you will reconsider and alter
> your code to work with zip files.  It's probably a small adjustment?
> 
> Unless your *intent* is to create a diverging standard, of course...

Exactly my sentiments.  We have rough Python code to deal with zip
files; it's very rough because we got kind of carried away adding
features and ended up with spaghetti code :-(  But it's working code
nevertheless and we're offering it up for anyone in this group to
clean up (we could do that ourselves but it's not high on our current
priority list).

I don't know anything about Tcl Wrap.  I do know a great deal about
the ZIP format, but apparently I missed the concatenation feature.
How does this work?  Does that work for all zip tools, or just for the
ZIP reader in Wrap?  (I looked up how Jim A does it -- his central
directory at the end of the file contains the total size of the data
covered by that directory, so he seeks back to the beginning of it and
sees if another magic number precedes it; and so on.  Very simple.)
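
The backward walk described in the parenthesis can be sketched as
follows.  This is not Jim A's actual format: the magic value, the
eight-byte trailer layout and the function names are assumptions made
for illustration.  Each archive ends with a trailer that records the
size of the whole archive, so a reader can hop from the end of the file
to the end of the previous archive.

```python
import struct

MAGIC = b"PYL\x00"                 # hypothetical magic number
TRAILER = struct.Struct("<4sI")    # magic, total archive size in bytes


def make_archive(payload):
    """Append a trailer recording the size of payload plus trailer."""
    total = len(payload) + TRAILER.size
    return payload + TRAILER.pack(MAGIC, total)


def split_archives(data):
    """Walk backward from the end; return payloads in file order."""
    payloads = []
    end = len(data)
    while end >= TRAILER.size:
        magic, total = TRAILER.unpack_from(data, end - TRAILER.size)
        if magic != MAGIC or total < TRAILER.size or total > end:
            break                  # whatever precedes is not an archive
        start = end - total
        payloads.append(data[start:end - TRAILER.size])
        end = start                # look for another trailer before it
    payloads.reverse()
    return payloads


blob = b"fake executable bytes" + make_archive(b"first") + make_archive(b"second")
found = split_archives(blob)
print(found)
```

Nothing but concatenation is needed to combine archives, and arbitrary
leading bytes (such as an executable) are ignored because the walk stops
at the first position that does not end in a valid trailer.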

I quickly looked at the Wrap page; it shows how to access data files
stored in the archive.  Question: does the wrap::open code go out to
the regular filesystem if it finds there's no wrap archive?  That
would be handy so you can test the code in its unwrapped form without
change.  Python needs this too.
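
A minimal sketch of such a fallback in present-day Python, assuming the
archive is a zip file and that unwrapped data files sit in the directory
where the archive would be.  The function name open_resource and the
file names are hypothetical.

```python
import io
import os
import tempfile
import zipfile


def open_resource(name, archive_path):
    """Return a binary file object for `name`.

    Read it from the archive if the archive exists; otherwise fall back
    to a regular file in the same directory.
    """
    if os.path.isfile(archive_path):
        with zipfile.ZipFile(archive_path) as zf:
            return io.BytesIO(zf.read(name))
    base = os.path.dirname(archive_path)
    return open(os.path.join(base, name), "rb")


workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "app.zip")

# Development layout: plain file, no archive yet.
with open(os.path.join(workdir, "table.txt"), "wb") as f:
    f.write(b"unwrapped")
with open_resource("table.txt", archive) as f:
    unwrapped = f.read()

# Deployed layout: the same call now reads from the archive.
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("table.txt", b"wrapped")
with open_resource("table.txt", archive) as f:
    wrapped = f.read()

print(unwrapped, wrapped)
```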

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gward at cnri.reston.va.us  Thu Dec  9 18:12:00 1999
From: gward at cnri.reston.va.us (Greg Ward)
Date: Thu, 9 Dec 1999 12:12:00 -0500
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Thu, Dec 09, 1999 at 11:32:08AM -0500
References: <199912091632.LAA09236@amarok.cnri.reston.va.us>
Message-ID: <19991209121159.B20179@cnri.reston.va.us>

On 09 December 1999, Andrew M. Kuchling said:
> After poking around in the O'Reilly POSIX book, here's a list of POSIX
> functions that don't seem to be available in Python.  Not all of them
> seem worth supporting.   Ironically, Greg Ward's daemonize() Perl
> subroutine, which started me on this, doesn't actually seem to need
> anything that Python doesn't have.

I think I already pointed this your way, but don't forget the man page
for Perl's POSIX module: "perldoc POSIX".  I suspect POSIX functions
that don't make sense in Perl also don't make sense in Python.

I agree with all your assessments about what's worth adding and what's
not, and that {close,read,open}dir() are questionable and probably not
worth the bother.  Random thoughts:

> abort() -- used in Py_FatalError(), but not accessible to Python code

Would this do the same as in C, ie. terminate the process and dump core?

> getlogin() -- returns user's login name
> 	 -- could do something similar with pwd.getpwuid( os.getuid() )[0], but
> 	 getlogin() apparently looks in utmp

With a documentation proviso that utmp is very old-fashioned, and you
really should do the getuid() thing unless you definitely want to get
the login ID from utmp.  Perhaps an alternate "getlogin" (different
name?) that does the getuid() thing could be provided.

        Greg


From guido at CNRI.Reston.VA.US  Thu Dec  9 18:16:03 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 12:16:03 -0500
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: Your message of "Thu, 09 Dec 1999 12:12:00 EST."
             <19991209121159.B20179@cnri.reston.va.us> 
References: <199912091632.LAA09236@amarok.cnri.reston.va.us>  
            <19991209121159.B20179@cnri.reston.va.us> 
Message-ID: <199912091716.MAA06063@eric.cnri.reston.va.us>

> > getlogin() -- returns user's login name
> > 	 -- could do something similar with pwd.getpwuid( os.getuid() )[0], but
> > 	 getlogin() apparently looks in utmp
> 
> With a documentation proviso that utmp is very old-fashioned, and you
> really should do the getuid() thing unless you definitely want to get
> the login ID from utmp.  Perhaps an alternate "getlogin" (different
> name?) that does the getuid() thing could be provided.

There's the getpass module which has a getuser() function that looks
in various env vars and if all else fails uses getuid() and pwd.

If the goal is to get the user ID without being fooled, using
os.getuid() or os.geteuid() directly seems to be the right thing to
do; I don't see the need for a shorthand for
pwd.getpwuid(os.getuid())[0] (which is what getuser() uses).
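
The difference between the two lookups can be sketched like this, using
present-day Python on Unix (where os.getlogin() exists).  The helper
names are hypothetical.  The uid-based lookup reflects who owns the
process, while getlogin() reports the login session of the controlling
terminal and fails when there is none.

```python
import os
import pwd


def name_from_uid():
    """Account name for this process's real user ID (the passwd lookup)."""
    return pwd.getpwuid(os.getuid())[0]


def name_from_login():
    """Login name for the controlling terminal, or None if there is none."""
    try:
        return os.getlogin()
    except OSError:
        return None   # e.g. a cron job or daemon without a terminal


print(name_from_login())
```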

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido at CNRI.Reston.VA.US  Thu Dec  9 18:18:10 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 12:18:10 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Thu, 09 Dec 1999 10:02:18 EST."
             <384FC47A.BB4DA517@interet.com> 
References: <000301bf4206$b39e5b80$36a2143f@tim>  
            <384FC47A.BB4DA517@interet.com> 
Message-ID: <199912091718.MAA06087@eric.cnri.reston.va.us>

[Jim A]
> Lately I have been more active in distribution via archive files.
> Part of the solution is an archive file format which is identical on
> Unix and Windows, and which can hold the Python library and packages
> as single files.  For my own efforts on this see:
> 
>     ftp://ftp.interet.com/pub/pylib.html

Apart from agreeing with Jean-Claude's rant about inventing a new
archive format, I think this is a good proposal because it is very
clear about the problem it tries to solve and doesn't get distracted
by other issues.  I also commend Jim for building upon Greg Stein's
imputil (like Gordon did).  I wish I could present a solution this
simple as The Standard Way, but (as explained in my long post earlier
today) there just are so many wrinkles that I'd rather hold out for
the Right Solution...  But I've taken good notice of Jim's solution.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From beazley at cs.uchicago.edu  Thu Dec  9 18:16:57 1999
From: beazley at cs.uchicago.edu (David Beazley)
Date: Thu, 9 Dec 1999 11:16:57 -0600 (CST)
Subject: [Python-Dev] Missing POSIX functions: the list
References: <199912091632.LAA09236@amarok.cnri.reston.va.us>
	<19991209121159.B20179@cnri.reston.va.us>
Message-ID: <199912091716.LAA15624@gargoyle.cs.uchicago.edu>

Greg Ward writes:
> 
> I think I already pointed this your way, but don't forget the man page
> for Perl's POSIX module: "perldoc POSIX".  I suspect POSIX functions
> that don't make sense in Perl also don't make sense in Python.
> 
> I agree with all your assessments about what's worth adding and what's
> not, and that {close,read,open}dir() are questionable and probably not
> worth the bother.  Random thoughts:
> 

I disagree.  I think that the POSIX module should strive to be as
complete as possible--even if certain functions are closely related
to other functionality in the library (tmpfile for instance).  I suspect
that this sort of thing is probably the cause of the missing
functionality in the current library (as in, "why would anyone want to
do that?" when in fact there may be a perfectly good reason in certain
situations).  

> > abort() -- used in Py_FatalError(), but not accessible to Python code
> 
> Would this do the same as in C, ie. terminate the process and dump core?
> 

Sure, why not?  This might be a useful thing to do every so
often---when trying to figure out what's wrong with a C extension
module for instance.

Cheers,

Dave


From jim at interet.com  Thu Dec  9 18:43:57 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 09 Dec 1999 12:43:57 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com>
Message-ID: <384FEA5D.A07F23EC@interet.com>

Jean-Claude Wippler wrote:

> Ouch - what's wrong with zip archives?

Thanks very much for looking over the format.

In general Zip archives store whole branches of a file
system.  A Python ./Lib zip archive would contain:

  N:/python/Python-1.5.2/Lib/string.pyc
  N:/python/Python-1.5.2/Lib/os.pyc
  N:/python/Python-1.5.2/Lib/copy.pyc
  N:/python/Python-1.5.2/Lib/test/testall.pyc

Zip archives are isomorphic to branches of a file system.
That means there must be a sys.path for each zip archive file.
How would this be specified?

The archive format stores modules as dotted names, just as they
appear in the import statement.  The search path is "." in every
archive file by definition.  The import statement "import foo"
just results in a dictionary lookup for key "foo", not a search
through a zip directory along a local search path for "foo.something"
where "something" can be pyc, pyo, py, etc.
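
The dictionary-lookup idea can be sketched with the import hooks of
present-day Python.  This is an illustration, not the pylib code: the
class name DictImporter and the module name sketch_foo are invented, the
table holds source text instead of marshalled PYC data, and only
top-level modules are handled.

```python
import importlib.abc
import importlib.util
import sys


class DictImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Import modules from a dict mapping module names to source text."""

    def __init__(self, table):
        self.table = table

    def find_spec(self, fullname, path=None, target=None):
        if fullname in self.table:       # one dictionary lookup, no path search
            return importlib.util.spec_from_loader(fullname, self)
        return None                      # let the normal importers try

    def create_module(self, spec):
        return None                      # use the default module object

    def exec_module(self, module):
        source = self.table[module.__name__]
        code = compile(source, "<archive:%s>" % module.__name__, "exec")
        exec(code, module.__dict__)


sys.meta_path.insert(0, DictImporter({"sketch_foo": "answer = 42\n"}))

import sketch_foo

print(sketch_foo.answer)
```

The importer never probes for "foo.pyc", "foo.pyo" or "foo.py"; a name
is either a key in the table or it is left to the other importers.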

The intent was to link the archives to the import statement, not
re-create a directory tree.  It borrowed this feature from
the archive formats of Greg and Gordon.

> There are utilities to convert to/from zip, to re-pack, to mount zip
> transparently so its entries look like regular files, FTP servers, etc.

Basic operations (to, from, repack) are easy in Python.

> Both Java (jar) and Tcl (Jan Nijtman's "Wrap") have adopted this format.

Hmmm....
 
> Your format has no checksum, which for deployment and long-term storage
> can be important.

Actually the pylib.py "dir()" method reads all *.pyc with marshal,
and I am depending on marshal to object to bad data and also
out-of-date magic numbers.  But this is a good point.

> If you want a marshalled TOC, then why not add a manifest entry for it,
> sort of like what ranlib does with ar?

Sorry, I don't understand.  Please explain.

> You designed the format so archives can be concatenated without any tool
> (other than "cat"), but this works just as well with zip files, as the
> Tcl Wrap approach demonstrates.

Are you saying that cat zip1.zip zip2.zip > myzip.zip works?

An important feature is the ability to concatenate to a binary:
  cat python.exe zip1.zip > myapp.exe
Searching for this isn't fast unless magic numbers are at the
end.  Are zip files recognizable from the end (I don't know)?
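
For what it is worth, the zip format does end with a recognizable record:
the end-of-central-directory record starts with the bytes "PK\x05\x06" and
is followed only by an optional comment.  A minimal sketch, using today's
standard zipfile module and a fake stub standing in for python.exe
(find_eocd is a name invented here):

```python
import io
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"   # end-of-central-directory record
EOCD_MIN_SIZE = 22               # fixed part of the record, without comment
MAX_COMMENT = 0xFFFF             # the comment length field is 16 bits

def find_eocd(data):
    """Return the offset of the end-of-central-directory record, or -1."""
    # Only the tail can hold the record: fixed part plus longest comment.
    start = max(0, len(data) - (EOCD_MIN_SIZE + MAX_COMMENT))
    return data.rfind(EOCD_SIGNATURE, start)

# Build a small archive in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("hello.py", "print('hello')\n")
archive = buf.getvalue()

# Simulate "cat python.exe zip1.zip > myapp.exe" with a fake binary stub.
stub = b"\x7fFAKE-EXECUTABLE" * 100
combined = stub + archive

eocd_offset = find_eocd(combined)
# With no archive comment, the record is the last 22 bytes of the file.
found_at_end = eocd_offset == len(combined) - EOCD_MIN_SIZE

# The standard zipfile module also copes with the prepended stub.
with zipfile.ZipFile(io.BytesIO(combined)) as zf:
    names = zf.namelist()
```

A scan for the signature is a heuristic; a careful reader would also
validate the fields of the record it finds.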

> Allow me to very, very loosely paraphrase Guido here: sure, everyone can
> design an archive format, but they are likely to make the same mistakes
> all over again - so why not adopt a format which is tried and tested?
> 
> With all due respect - I sincerely hope you will reconsider and alter
> your code to work with zip files.  It's probably a small adjustment?
> 
> Unless your *intent* is to create a diverging standard, of course...

The intent is to create a standard but not a diverging standard.

Are there any zip experts out there?  Can zip files satisfy all the
design requirements I listed in pylib.html?  Is there zip code
available?  All my code is in Python.

JimA


From jcw at equi4.com  Thu Dec  9 18:57:33 1999
From: jcw at equi4.com (Jean-Claude Wippler)
Date: Thu, 09 Dec 1999 18:57:33 +0100
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com>  
	            <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us>
Message-ID: <384FED8D.3C535D38@equi4.com>

Guido van Rossum wrote:
> 
> [... my not-really-meant-as-rant about adopting zip as format ...]
>
[zip concatenation feature]

> How does this work?  Does that work for all zip tools, or just for the
> ZIP reader in Wrap?  (I looked up how Jim A does it -- his central
> directory at the end of the file contains the total size of the data
> covered by that directory, so he seeks back to the beginning of it and
> sees if another magic number precedes it; and so on.  Very simple.)

Same for Wrap.  Standard tools would not see the preceding ZIP groups.

In terms of maintenance, I'd avoid this trick.  I merely wanted to point
out that zip archives can be stacked, if the reader is set up for it.
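
The stacking idea (a size recorded at the end of each archive, so a reader
can hop backwards from one archive to the one before it) might look like
this.  The trailer layout and the "PYL1" marker are assumptions made up for
the sketch; they are neither the pylib nor the zip layout.

```python
import struct

MAGIC = b"PYL1"                      # hypothetical marker, not a real format
TRAILER = struct.Struct("<4sI")      # magic + total segment size in bytes

def make_segment(payload):
    """Append a trailer recording the size of payload plus trailer."""
    total = len(payload) + TRAILER.size
    return payload + TRAILER.pack(MAGIC, total)

def walk_segments(data):
    """Yield payloads from the last segment back to the first."""
    end = len(data)
    while end >= TRAILER.size:
        magic, total = TRAILER.unpack_from(data, end - TRAILER.size)
        if magic != MAGIC or total < TRAILER.size or total > end:
            break                     # whatever precedes is not an archive
        start = end - total
        yield data[start:end - TRAILER.size]
        end = start

stub = b"#!fake interpreter binary\n"
blob = stub + make_segment(b"first archive") + make_segment(b"second archive")
payloads = list(walk_segments(blob))
```

A reader that does not know the trick sees only the last segment, which is
the maintenance hazard mentioned above.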

> Question: does the wrap::open code go out to the regular filesystem
> if it finds there's no wrap archive?  That would be handy so you can
> test the code in its unwrapped form without change.

IIRC, Wrap overrides "open" for embedded entries as "file.zip/abc.py".
There's more being developed in this area: a "virtual file system" which
lets you mount archives and such (VFS by Matt Newman, mentioned with his
permission), so that the file-system model can be extended to navigate
into a lot more things than real file systems.

Andrew Kuchling's post hints at another tangent: opendir/readdir is of
course simply an enumeration.  There's a lot of "genericity" lurking in
scanning across file systems, trees, networks, and resources in general.

<minirant> The filesystem <-> OO dichotomy needs a review. </minirant>

> Python needs this too.

<voice location=in-the-desert level=timid>
Concepts like these have a lot to offer - and would make even more sense
if they were done in a way which benefits multiple scripting languages.
Feel free to reply by email if you ever want to further discuss this.
</voice>

-- Jean-Claude


From fdrake at acm.org  Thu Dec  9 19:10:44 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 9 Dec 1999 13:10:44 -0500 (EST)
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us>
References: <199912091632.LAA09236@amarok.cnri.reston.va.us>
Message-ID: <14415.61604.415084.520092@weyr.cnri.reston.va.us>

Andrew M. Kuchling writes:
 > After poking around in the O'Reilly POSIX book, here's a list of POSIX
 > functions that don't seem to be available in Python.  Not all of them
 > seem worth supporting.   Ironically, Greg Ward's daemonize() Perl

  I think your assessment is reasonable.  I looked at posixmodule.c
and note also that the functions use PyArg_Parse() and PyArg_NoArgs()
instead of using PyArg_ParseTuple().  The advantage of
PyArg_ParseTuple() is that the name of the function can be specified
for inclusion in TypeError messages when the arguments are not of the
right type.
  I'm doing some work to correct this now.  I've also added ctermid(), 
and will try to add at least a few more before I check in the changes.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From bwarsaw at cnri.reston.va.us  Thu Dec  9 19:17:35 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Thu, 9 Dec 1999 13:17:35 -0500 (EST)
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim>
	<384FC47A.BB4DA517@interet.com>
	<384FDAF5.C25C447C@equi4.com>
	<199912091655.LAA05928@eric.cnri.reston.va.us>
	<384FED8D.3C535D38@equi4.com>
Message-ID: <14415.62015.856931.750279@anthem.cnri.reston.va.us>

>>>>> "JW" == Jean-Claude Wippler <jcw at equi4.com> writes:

    JW> Same for Wrap.  Standard tools would not see the preceding ZIP
    JW> groups.

    JW> In terms of maintenance, I'd avoid this trick.  I merely
    JW> wanted to point out that zip archives can be stacked, if the
    JW> reader is set up to it.

I agree.  I can't recall the details now, but I had a lot of problems
with zip concatenation in JPython.  I think at least some of the older
Java tools for grokking zips don't work with concatenation.

-Barry


From guido at CNRI.Reston.VA.US  Thu Dec  9 19:21:42 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 13:21:42 -0500
Subject: [Python-Dev] Virtual filesystem APIs
In-Reply-To: Your message of "Thu, 09 Dec 1999 18:57:33 +0100."
             <384FED8D.3C535D38@equi4.com> 
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us>  
            <384FED8D.3C535D38@equi4.com> 
Message-ID: <199912091821.NAA06209@eric.cnri.reston.va.us>

Jean-Claude Wippler:
> There's more being developed in this area: a "virtual file system" which
> lets you mount archives and such (VFS by Matt Newman, mentioned with his
> permission), so that the file-system model can be extended to navigate
> into a lot more things than real file systems.

I agree.  We have experimented with this a bunch in the Knowbot
software, where we have some code that wants to look at a "filesystem"
but could be talking to some kind of filesystem emulation across an
RPC connection or alternatively could be accessing a zip file.  Our
conclusion is that a convenient interface is modeled after (a subset
of) the os and os.path functionality.  In fact, the only thing you
would need to add to the os module would be a function to open a file
object; I've proposed to add os.fopen() as an alias for the built-in
open().
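
A rough sketch of such an interface, assuming a read-only view of a zip
file.  The class name ZipFS and its methods are invented for illustration
and are not the Knowbot code; only fopen() echoes the name proposed above.

```python
import io
import posixpath
import zipfile

class ZipFS:
    """Hypothetical read-only filesystem view of a zip archive."""

    path = posixpath              # zip member names always use '/'

    def __init__(self, fileobj):
        self._zip = zipfile.ZipFile(fileobj)

    def listdir(self, dirname=""):
        prefix = dirname.strip("/")
        prefix = prefix + "/" if prefix else ""
        entries = set()
        for name in self._zip.namelist():
            if name.startswith(prefix) and name != prefix:
                # Keep only the first component below the directory.
                entries.add(name[len(prefix):].split("/", 1)[0])
        return sorted(entries)

    def exists(self, name):
        name = name.strip("/")
        return any(n == name or n.startswith(name + "/")
                   for n in self._zip.namelist())

    def fopen(self, name):
        """Counterpart of the proposed os.fopen(): return a file object."""
        return io.BytesIO(self._zip.read(name))

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("README", "top level\n")
    zf.writestr("pkg/mod.py", "x = 1\n")

fs = ZipFS(buf)
```

Code written against fs.listdir(), fs.path.join() and fs.fopen() would not
need to know whether it is talking to a zip file or to the real os module.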

The idea that you could mount one VFS inside another is nice, although
I'm not sure how practical it is.  For one thing, in our fs code,
os.path.sep and friends (e.g. os.path.normcase behavior) were set per
filesystem; what would happen if you mounted a Unix filesystem in an
NT tree?  Doing the translations is hard too; e.g. on a Mac fs, the
separator is ':' and a '/' can be part of a filename -- do you simply
swap them?  What if a Mac file has both '/' and '\'  and you mount it
on a Windows FS?  I'd rather stay away from this.
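
The per-filesystem behaviour is easy to see with the standard posixpath
and ntpath modules, which can both be imported on any platform.  The
translate() helper is a deliberately naive assumption, included to show
where the trouble starts.

```python
import ntpath
import posixpath

# Each path module carries its own separator and case rules.
unix_joined = posixpath.join("usr", "lib", "Python")
nt_joined = ntpath.join("usr", "lib", "Python")

unix_case = posixpath.normcase("Lib/String.py")   # unchanged on Unix
nt_case = ntpath.normcase("Lib/String.py")        # lowercased, '/' -> '\\'

def translate(path, source=posixpath, target=ntpath):
    """Naive translation: split on one separator, join with the other.

    This silently breaks when a component contains the target separator,
    which is the problem described above.
    """
    return target.sep.join(path.split(source.sep))
```

For instance, the Unix path "a/b\c/d" has three components, but its naive
translation reads as four on the Windows side.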

On the other hand the VFS concept could be used as a totally different
solution to the sys.importers vs. sys.path 

> Andrew Kuchling's post hints at another tangent: opendir/readdir is of
> course simply an enumeration.  There's a lot of "genericity" lurking in
> scanning across file systems, trees, networks, and resources in general.

I'd still rather see listdir() (which our sample virtual FS API
supported).  I don't think it necessarily makes sense to do this on a
more generic basis -- other trees and graphs have sufficiently
different semantics that using a FS like API doesn't necessarily cut
it.  Take for example the Windows registry -- looks a lot like a
filesystem, doesn't it?  Yet it has one fundamental property that a
typical FS doesn't: directory nodes can have data *and* children...

I've written a tree widget and found that it's remarkably hard to come
up with a workable API to talk to trees *in general*.  Trees are a
universal concept, but code sharing is still elusive...  Perhaps
because the concept is so simple?

> <minirant> The filesystem <-> OO dichotomy needs a review. </minirant>

I think that my proposal above should cover this.  (We looked briefly
at doing a similar thing for Java, and found that it's actually harder
there -- they have all these nice objects representing paths, but it's
not easily subclassable to represent paths in some virtual
filesystem.)

> Concepts like these have a lot to offer - and would make even more sense
> if they were done in a way which benefits multiple scripting languages.
> Feel free to reply by email if you ever want to further discuss this.

I see only very little hope for this point of view, but I will refrain
from commenting further.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm at hypernet.com  Thu Dec  9 19:23:14 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Thu, 9 Dec 1999 13:23:14 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <384FEA5D.A07F23EC@interet.com>
Message-ID: <1267359311-37934097@hypernet.com>

James C. Ahlstrom wrote:

> Jean-Claude Wippler wrote:
> 
> > Ouch - what's wrong with zip archives?

> In general Zip archives store whole branches of a file
> system.  

> The archive format stores modules as dotted names, just as they
> appear in the import statement.  The search path is "." in every
> archive file by definition.  The import statement "import foo"
> just results in a dictionary lookup for key "foo", not a search
> through a zip directory along a local search path for
> "foo.something" where "something" can be pyc, pyo, py, etc.
> 
> The intent was to link the archives to the import statement, not
> re-create a directory tree.  It borrowed this feature from the
> archive formats of Greg and Gordon.

As I've stated before, I have 2 archive formats. This may seem 
a needless complication, but my suspicion is that sooner or 
later, people will want 2 different kinds.

One is a .pyz format, which corresponds closely to Jim's .pyl 
format (with a number of minor differences: it's compressed, 
the archive as a whole has the Python magic number, instead 
of each entry, and it's not designed for concatenation).
 
The other is like a zip, and probably should be zip format.  It's
designed to hold _anything_, and can be manipulated from C
and from Python. It can be concatenated and / or embedded
(and the inner one opened without extraction). Its table of
contents is more file-system like. Importing from one is
slower, but that's not really what it's for. It's for packaging up 
arbitrary resources. Like .pyz's, or Tcl/Tk for Tkinter apps, or 
configuration files.

Jim is correct that a good importer (which can say "No, it's not 
mine" as quickly as possible) is better satisfied by a simple 
dictionary lookup than fooling with file extensions and 
directories (virtual or real).
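
In present-day Python the "No, it's not mine" answer is spelled as a
finder returning None.  A minimal sketch using the current importlib
machinery (which postdates this thread); DictImporter and dict_demo_mod
are names invented for the example.

```python
import importlib.abc
import importlib.util
import sys

class DictImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Hypothetical importer backed by a {dotted name: source} dictionary."""

    def __init__(self, sources):
        self._sources = sources

    def find_spec(self, fullname, path=None, target=None):
        if fullname not in self._sources:
            return None           # "not mine": one lookup, no file probing
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None               # use the default module object

    def exec_module(self, module):
        exec(self._sources[module.__name__], module.__dict__)

importer = DictImporter({"dict_demo_mod": "ANSWER = 6 * 7\n"})
sys.meta_path.append(importer)
try:
    import dict_demo_mod
finally:
    sys.meta_path.remove(importer)
```

The refusal path costs a single dictionary membership test, with no
extensions or directories involved.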

> > If you want a marshalled TOC, then why not add a manifest entry
> > for it, sort of like what ranlib does with ar?
> 
> Sorry, I don't understand.  Please explain.

The table of contents is just another entry.
 
> An important feature is the ability to concatenate to a binary:
>   cat python.exe zip1.zip > myapp.exe
> Searching for this isn't fast unless magic numbers are at the
> end.  Are zip files recognizable from the end (I don't know)?

Where do you think we got this idea?

> Are there any zip experts out there?  Can zip files satisfy all
> the design requirements I listed in pylib.html?  Is there zip
> code available?  All my code is in Python.

Hmm. My bookmark appears to be dead (I was there not long
ago):
http://www.cubic.org/source/archive/fileform/packers/appnote.txt

There have been several references on this list to Guido et al 
having some Python / zip code.


- Gordon


From guido at CNRI.Reston.VA.US  Thu Dec  9 19:23:27 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 13:23:27 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Thu, 09 Dec 1999 13:17:35 EST."
             <14415.62015.856931.750279@anthem.cnri.reston.va.us> 
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us> <384FED8D.3C535D38@equi4.com>  
            <14415.62015.856931.750279@anthem.cnri.reston.va.us> 
Message-ID: <199912091823.NAA06243@eric.cnri.reston.va.us>

> I agree.  I can't recall the details now, but I had a lot of problems
> with zip concatenation in JPython.  I think at least some of the older
> Java tools for grokking zips don't work with concatenation.

The Java "jar" tool mostly ignores the central directory when reading:
it seems to scan the archive from the front, using the local header
records (of course it still writes a central directory when it creates
an archive).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido at CNRI.Reston.VA.US  Thu Dec  9 19:32:15 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 13:32:15 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Thu, 09 Dec 1999 12:43:57 EST."
             <384FEA5D.A07F23EC@interet.com> 
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com>  
            <384FEA5D.A07F23EC@interet.com> 
Message-ID: <199912091832.NAA06287@eric.cnri.reston.va.us>

> In general Zip archives store whole branches of a file
> system.  A Python ./Lib zip archive would contain:
> 
>   N:/python/Python-1.5.2/Lib/string.pyc
>   N:/python/Python-1.5.2/Lib/os.pyc
>   N:/python/Python-1.5.2/Lib/copy.pyc
>   N:/python/Python-1.5.2/Lib/test/testall.pyc
> 
> Zip archives are isomorphic to branches of a file system.
> That means there must be a sys.path for each zip archive file.
> How would this be specified?

Not true.  It's easy (using the proper Zip tools) to create an archive
containing this instead:

  string.pyc
  os.pyc
  copy.pyc
  testall.pyc

Thus the entire archive is considered the directory.  The Java "jar"
tool uses this approach.  It's also easy to have packages in there
(again this is what Java does):

  test/
  test/__init__.pyc
  test/pystone.pyc
  test_support.pyc
  (etc.)

> The archive format stores modules as dotted names, just as they
> appear in the import statement.  The search path is "." in every
> archive file by definition.  The import statement "import foo"
> just results in a dictionary lookup for key "foo", not a search
> through a zip directory along a local search path for "foo.something"
> where "something" can be pyc, pyo, py, etc.
> 
> The intent was to link the archives to the import statement, not
> re-create a directory tree.  It borrowed this feature from
> the archive formats of Greg and Gordon.

Maybe you've gone overboard.  The time it takes to translate the dots
into slashes really isn't the big deal.
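
As an illustration with present-day tools: current Python releases can
import straight from a zip archive listed on sys.path, with members stored
relative to the archive root as in the listings above.  The member names
and the dotted_to_member() helper below are made up for the sketch.

```python
import os
import sys
import tempfile
import zipfile

def dotted_to_member(name, suffix=".py"):
    """Translate an import name into an archive member path."""
    return name.replace(".", "/") + suffix

tmpdir = tempfile.mkdtemp()
archive_path = os.path.join(tmpdir, "bundle.zip")

# Members are stored relative to the archive root, not as absolute paths.
with zipfile.ZipFile(archive_path, "w") as zf:
    zf.writestr("zipdemo_pkg/__init__.py", "")
    zf.writestr(dotted_to_member("zipdemo_pkg.pystone"), "LOOPS = 50000\n")
    zf.writestr("zipdemo_support.py", "verbose = 1\n")

# The archive itself acts as one sys.path entry.
sys.path.insert(0, archive_path)
try:
    import zipdemo_pkg.pystone
    import zipdemo_support
finally:
    sys.path.remove(archive_path)
```

The dots-to-slashes translation is the single str.replace() call in
dotted_to_member().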

> Are there any zip experts out there?  Can zip files satisfy all the
> design requirements I listed in pylib.html?  Is there zip code
> available?  All my code is in Python.

Yes (all of us here at CNRI), yes, yes (we have the spaghetti code).
While zip files support compression, they support uncompressed files
as well and we could go either way.  Their most popular compression
format is gzip compatible and can be read and written with the zlib
module, which is in the standard Python distribution (even on Windows)
-- though to build it you need the zlib C library which is of course
external (but solid open source).
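
A small sketch of the zlib side of this.  Zip's usual compression method
is the same deflate algorithm gzip uses, stored as a raw stream, which the
zlib module produces when given a negative wbits value.  The sample data
is arbitrary.

```python
import zlib

data = b"import string\n" * 200

# Zip's usual "deflate" method is a raw deflate stream: no zlib header or
# trailer. A negative wbits value asks zlib for exactly that.
compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
raw = compressor.compress(data) + compressor.flush()

restored = zlib.decompress(raw, -15)

# Zip stores a CRC-32 per member, which zlib can also compute.
checksum = zlib.crc32(data) & 0xFFFFFFFF
```

The per-member CRC-32 also speaks to the earlier concern about archives
that carry no checksum.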

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake at acm.org  Thu Dec  9 19:41:22 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 9 Dec 1999 13:41:22 -0500 (EST)
Subject: [Python-Dev] Virtual filesystem APIs
In-Reply-To: <199912091821.NAA06209@eric.cnri.reston.va.us>
References: <000301bf4206$b39e5b80$36a2143f@tim>
	<384FC47A.BB4DA517@interet.com>
	<384FDAF5.C25C447C@equi4.com>
	<199912091655.LAA05928@eric.cnri.reston.va.us>
	<384FED8D.3C535D38@equi4.com>
	<199912091821.NAA06209@eric.cnri.reston.va.us>
Message-ID: <14415.63442.92911.748132@weyr.cnri.reston.va.us>

Guido van Rossum writes:
 > os.path.sep and friends (e.g. os.path.normcase behavior) were set per

  Hah!  Caught you in public!  "sep" & friends are defined in the os
module; this is where the separation breaks down.
  I think these should be located in os.path, and os can just pick
them up from there to be backward compatible.
  os.pathsep is a problem, somewhat; it is related to os.sep, but is
very different in many ways.  I don't think there's a good way to deal 
with it.
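
For reference, in current Python releases both spellings exist and agree.
The snippet also shows how pathsep differs from sep; the directory names
are placeholders.

```python
import os

# In current Python releases both spellings exist and agree.
same_sep = os.sep == os.path.sep

# os.pathsep separates entries in a search path such as $PATH; it is a
# different thing from the separator inside a single path.
search_path = os.pathsep.join(["first_dir", "second_dir"])
entries = search_path.split(os.pathsep)
```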

 > filesystem; what would happen if you mounted a Unix filesystem in an
 > NT tree?  Doing the translations is hard too; e.g. on a Mac fs, the
 > separator is ':' and a '/' can be part of a filename -- do you simply
 > swap them?  What if a Mac file has both '/' and '\'  and you mount it
 > on a Windows FS?  I'd rather stay away from this.

  And this is tightly related to the sep/pathsep problem as well.  I
agree, we should stay away from it.

 > I think that my proposal above should cover this.  (We looked briefly
 > at doing a similar thing for Java, and found that it's actually harder
 > there -- they have all these nice objects representing paths, but it's
 > not easily subclassable to represent paths in some virtual

  But it was easy to create a set of interfaces with a reasonable API; 
getting back to the "typical" Java classes was what really changed the 
most.
  For those of us not working on the KOE:  I set up Filesystem and
FSFile interfaces; the Filesystem represented the entire filesystem
and the FSFile was very similar to the java.io.File class, but had
additional methods to get input and output stream objects (of the
standard Java flavor); all the buffering and such could be wrapped on
top of that just like any other Java I/O.
  The specific application was to provide access to an isolated
directory structure which untrusted code "owned", but ensured that
parent directories were unreachable.  Additional security checks can
be worked into such a structure as applicable.
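
The same idea can be sketched in Python.  JailedFilesystem is a
hypothetical name and this is not the KOE code, only an illustration of
keeping parent directories unreachable; it assumes a POSIX-style
filesystem.

```python
import os
import tempfile

class JailedFilesystem:
    """Hypothetical sketch: expose one directory tree, nothing above it."""

    def __init__(self, root):
        self._root = os.path.realpath(root)

    def _resolve(self, relpath):
        # Resolve '..' and symlinks first, then check containment.
        full = os.path.realpath(os.path.join(self._root, relpath))
        if os.path.commonpath([self._root, full]) != self._root:
            raise PermissionError("path escapes the sandbox: %r" % relpath)
        return full

    def open(self, relpath, mode="r"):
        return open(self._resolve(relpath), mode)

    def listdir(self, relpath="."):
        return sorted(os.listdir(self._resolve(relpath)))

root = tempfile.mkdtemp()
fs = JailedFilesystem(root)
with fs.open("notes.txt", "w") as f:
    f.write("inside\n")

try:
    fs.listdir("..")
    escaped = True
except PermissionError:
    escaped = False
```

A path check like this is only one layer; untrusted code would also need
to be kept away from the built-in open() and the os module.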


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From fdrake at acm.org  Thu Dec  9 20:06:32 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 9 Dec 1999 14:06:32 -0500 (EST)
Subject: [Python-Dev] posix module test suite
Message-ID: <14415.64952.780974.8124@weyr.cnri.reston.va.us>

  There's not a test for the posix or os modules; if anyone would like 
to contribute one, this would be a good time!  ;-)


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From jcw at equi4.com  Thu Dec  9 21:51:11 1999
From: jcw at equi4.com (Jean-Claude Wippler)
Date: Thu, 09 Dec 1999 21:51:11 +0100
Subject: [Python-Dev] Virtual filesystem APIs
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us>  
	            <384FED8D.3C535D38@equi4.com> <199912091821.NAA06209@eric.cnri.reston.va.us>
Message-ID: <3850163F.80BDCB75@equi4.com>

Guido van Rossum wrote:
>
[... horrors of cross-OS mounts and ":\/" separators ...]

I agree, this has some very hairy sides to it.  But VFS is really more
about mounting non-FS things in a "root" FS (presumably the real one).

> On the other hand the VFS concept could be used as a totally different
> solution to the sys.importers vs. sys.path

Heck, I'll be the "enfant terrible" once more: yes, and this stuff could
well be implemented generically across scripting languages.  Of course
the act of "importing" is a very Pythonic issue - but FS/VFS traversal
and the actual shared library load need not be.  Anyway, enough of that.

> Take for example the Windows registry -- looks a lot like a 
> filesystem, doesn't it?  Yet it has one fundamental property that a
> typical FS doesn't: directory nodes can have data *and* children...

What you're saying is that dir = set-of-subdirs + set-of-files, and that
this is a more general requirement than plain FS's.  Doesn't that simply
mean that the more general model is needed as basis to handle both?

> Trees are a universal concept, but code sharing is still elusive...

Ah, but think of the implications: archives, networks, XML, the world!

-- Jean-Claude


From fdrake at acm.org  Thu Dec  9 22:16:00 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 9 Dec 1999 16:16:00 -0500 (EST)
Subject: [Python-Dev] forwarded message from Fred L. Drake
Message-ID: <14416.7184.255000.342231@weyr.cnri.reston.va.us>


  OK, I've checked in some changes to the posix module to add support
for a few of the POSIX interfaces Andrew expressed interest in seeing
(and some he said weren't such a good idea, or at least not necessary,
but about which I decided I disagreed after all).
  For those of you who aren't on the checkins list (??), I've attached 
the message so you'll know what functions were added.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


-------------- next part --------------
An embedded message was scrubbed...
From: "Fred L. Drake" <fdrake at weyr.cnri.reston.va.us>
Subject: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.115,2.116
Date: Thu, 9 Dec 1999 16:13:10 -0500 (EST)
Size: 3800
URL: <http://mail.python.org/pipermail/python-dev/attachments/19991209/ed5f3b37/attachment.eml>

From guido at CNRI.Reston.VA.US  Thu Dec  9 22:19:57 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 16:19:57 -0500
Subject: [Python-Dev] forwarded message from Fred L. Drake
In-Reply-To: Your message of "Thu, 09 Dec 1999 16:16:00 EST."
             <14416.7184.255000.342231@weyr.cnri.reston.va.us> 
References: <14416.7184.255000.342231@weyr.cnri.reston.va.us> 
Message-ID: <199912092119.QAA06731@eric.cnri.reston.va.us>

>   OK, I've checked in some changes to the posix module to add support
> for a few of the POSIX interfaces Andrew expressed interest in seeing
> (and some he said weren't such a good idea, or at least not necessary,
> but about which I decided I disagreed after all).

I wish you'd made your disagreement public before checking it in...
But it's not too late...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin at mems-exchange.org  Thu Dec  9 22:32:26 1999
From: akuchlin at mems-exchange.org (Andrew M. Kuchling)
Date: Thu, 9 Dec 1999 16:32:26 -0500 (EST)
Subject: [Python-Dev] forwarded message from Fred L. Drake
In-Reply-To: <14416.7184.255000.342231@weyr.cnri.reston.va.us>
References: <14416.7184.255000.342231@weyr.cnri.reston.va.us>
Message-ID: <14416.8170.18298.33796@amarok.cnri.reston.va.us>

Fred L. Drake, Jr. writes (in a CVS checkin):
>Added support for abort(), ctermid(), tmpfile(), tempnam(), tmpnam(),
>and TMP_MAX.

For those of you following along, the tmpfile(), tempnam(), tmpnam()
functions were ones I listed as probably not worth adding.  On the
other hand, David Beazley wrote:

>  I think that the POSIX module should strive to be as
>complete as possible--even if certain functions are closely related to
>other functionality in the library (tmpfile for instance).  I suspect

... and that's a good point, too.  The POSIX functions may provide
adaptability that a Python analog doesn't; for example, you could read
/etc/passwd in pure Python, but that wouldn't handle NIS or shadow
passwords.  So I guess I'll vote for completeness over lack of
overlap; leave tmpfile() & friends in.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
This supports reflection, which is the 90s way of writing self-modifying code.
    -- John Aycock at IPC7, during his parsing talk


From guido at CNRI.Reston.VA.US  Thu Dec  9 22:38:42 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 16:38:42 -0500
Subject: [Python-Dev] forwarded message from Fred L. Drake
In-Reply-To: Your message of "Thu, 09 Dec 1999 16:32:26 EST."
             <14416.8170.18298.33796@amarok.cnri.reston.va.us> 
References: <14416.7184.255000.342231@weyr.cnri.reston.va.us>  
            <14416.8170.18298.33796@amarok.cnri.reston.va.us> 
Message-ID: <199912092138.QAA06790@eric.cnri.reston.va.us>

> ... and that's a good point, too.  The POSIX functions may provide
> adaptability that a Python analog doesn't; for example, you could read
> /etc/passwd in pure Python, but that wouldn't handle NIS or shadow
> passwords.  So I guess I'll vote for completeness over lack of
> overlap; leave tmpfile() & friends in.

OK, I agree now.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake at acm.org  Thu Dec  9 23:30:52 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 9 Dec 1999 17:30:52 -0500 (EST)
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us>
References: <199912091632.LAA09236@amarok.cnri.reston.va.us>
Message-ID: <14416.11676.888918.511932@weyr.cnri.reston.va.us>

Andrew M. Kuchling writes:
 > After poking around in the O'Reilly POSIX book, here's a list of POSIX

  Ok, here are my comments on the remainder of these.

 > Worth adding?
 > =============
 > opendir(), readdir(), closedir() -- 
 > 	   most of their functionality is available through
 > 	   os.listdir(), but it might be useful to have a direct
 > 	   interface.  Downside is that this would require a new
 > 	   extension type for the C DIR struct.  My (lazy) inclination
 > 	   is to not bother.

  [rewinddir() and seekdir() should be considered as well, where
supported.]

  There's more tedium than anything in implementing a new C type.  I'm 
a little concerned that there might not be any real value here, but
it's hard to be sure about that.  Is there any real reason not to use
os.listdir()?
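
For comparison, present-day Python offers both styles.  os.scandir() was
added long after this thread and is shown here only to illustrate the
"all at once" versus "one entry at a time" distinction.

```python
import os
import tempfile

root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "subdir"))
with open(os.path.join(root, "file.txt"), "w") as f:
    f.write("data\n")

# listdir() returns every name at once, as a list of strings.
names = sorted(os.listdir(root))

# scandir() yields entries one at a time, closer in spirit to
# opendir()/readdir(), and closes the directory handle on exit.
with os.scandir(root) as entries:
    kinds = {entry.name: entry.is_dir() for entry in entries}
```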

 > Worth adding:
 > =============
...
 > fpathconf(fd, name) -- Get configuration limit for a file
 > 	    -- would need constants from unistd.h

  This is mostly a matter of setting up the constants; not hard, just
more distracting than I want to deal with right now.

 > getlogin() -- returns user's login name
 > 	 -- could do something similar with pwd.getpwuid( os.getuid() )[0], but
 > 	 getlogin() apparently looks in utmp

  Per Guido's comments, I'm not sure how valuable it is.  It may make
sense strictly for completeness, but I've never heard of utmp being
considered reliable in any way.  Maybe I'm too new at all this.

 > getgroups(gidsetsize, grouplist) -- Gets supplementary group IDs

  This should be easy enough.

 > pathconf(path, name) -- Gets config variables for a path
 > 	    -- would need constants from unistd.h

  (Same as for fpathconf().)

 > sysconf(int name) -- Gets system configuration information
 > 	    -- would need constants from unistd.h
 > 
 > Not worth adding:
 > =================

  Aside from the ones I've already added, I agree.  ;)
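
As a sketch of how the pathconf(), sysconf() and getgroups() calls
discussed above look from Python today: current releases expose them in
the os module on Unix-like platforms, with the constants looked up by
name.  The two names queried here are common but not guaranteed on every
system.

```python
import os

# These calls exist only on Unix-like platforms, so probe before use.
if hasattr(os, "pathconf"):
    # Names are looked up in os.pathconf_names / os.sysconf_names, which
    # hold whichever unistd.h constants the platform defines.
    name_max = os.pathconf("/", "PC_NAME_MAX")
    page_size = os.sysconf("SC_PAGE_SIZE")
    groups = os.getgroups()
else:
    name_max = page_size = groups = None
```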


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From jim at digicool.com  Fri Dec 10 00:31:40 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 09 Dec 1999 18:31:40 -0500
Subject: [Python-Dev] Thankyou for fsync :)
Message-ID: <38503BDC.CB91FB29@digicool.com>

I found recently that I needed fsync and was pleasantly surprised
to find that it is provided in the posix module, where available.

Can I count on it staying in the posix module, when available,
for the foreseeable future?

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list without my
permission.  Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.


From gstein at lyra.org  Fri Dec 10 01:32:33 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 9 Dec 1999 16:32:33 -0800 (PST)
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: <14416.11676.888918.511932@weyr.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912091630540.10472-100000@nebula.lyra.org>

On Thu, 9 Dec 1999, Fred L. Drake, Jr. wrote:
> Andrew M. Kuchling writes:
>...
>  > opendir(), readdir(), closedir() -- 
>  > 	   most of their functionality is available through
>  > 	   os.listdir(), but it might be useful to have a direct
>  > 	   interface.  Downside is that this would require a new
>  > 	   extension type for the C DIR struct.  My (lazy) inclination
>  > 	   is to not bother.
> 
>   [rewinddir() and seekdir() should be considered as well, where
> supported.]
> 
>   There's more tedium than anything in implementing a new C type.  I'm 
> a little concerned that there might not be any real value here, but
> it's hard to be sure about that.  Is there any real reason not to use
> os.listdir()?

No need to do a new type. Just wrap the DIR* into a PyCObject. Add a magic
number if you're worried about mixing CObjects.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From guido at CNRI.Reston.VA.US  Fri Dec 10 03:03:04 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 21:03:04 -0500
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: Your message of "Thu, 09 Dec 1999 18:31:40 EST."
             <38503BDC.CB91FB29@digicool.com> 
References: <38503BDC.CB91FB29@digicool.com> 
Message-ID: <199912100203.VAA07410@eric.cnri.reston.va.us>

> I found recently that I needed fsync and was pleasantly surprised
> to find that it is provided in the posix module, where available.
>
> Can I count on it staying in the posix module, when available,
> for the foreseeable future?

Since we seem to be on an adding spree, I don't see why not -- as long
as POSIX keeps it available :)
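
A minimal sketch of typical fsync use, assuming a temporary file as the
target: flush Python's own buffer first, then ask the operating system to
write its buffers to the device.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w") as f:
    f.write("committed record\n")
    f.flush()              # push Python's buffer to the operating system
    os.fsync(f.fileno())   # ask the OS to push its buffers to the device

with open(path) as f:
    contents = f.read()
os.remove(path)
```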

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip at mojam.com  Fri Dec 10 07:28:56 1999
From: skip at mojam.com (Skip Montanaro)
Date: Fri, 10 Dec 1999 00:28:56 -0600 (CST)
Subject: [Python-Dev] posix module test suite
In-Reply-To: <14415.64952.780974.8124@weyr.cnri.reston.va.us>
References: <14415.64952.780974.8124@weyr.cnri.reston.va.us>
Message-ID: <14416.40360.611743.143624@dolphin.mojam.com>

    Fred> There's not a test for the posix or os modules; if anyone would
    Fred> like to contribute one, this would be a good time!  ;-)

Not having ever written any tests for the core Python modules, it seems
natural to ask if there are any guidelines for the construction of such
tests or the test equivalent of the Modules/xxmodule.c file.  Are there
standard behaviors expected for passing and failing a test?

Skip Montanaro | http://www.mojam.com/
skip at mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...


From tim_one at email.msn.com  Fri Dec 10 09:48:59 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Fri, 10 Dec 1999 03:48:59 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <14415.23676.775163.786028@dolphin.mojam.com>
Message-ID: <000501bf42eb$66529860$412d153f@tim>

[Skip Montanaro]
> Alright!  Now I understand what all the hubbub is about!  My eyes have
> mostly been glazing over trying to follow all this Windows
> registry/path/ini stuff.  MS believes that Python is the application.
> Those of us writing Python programs view those programs as the
> applications, not the Python interpreter per se.

Eww -- that's a helpful and insightful way to put it, Skip!  Now maybe *I*
can understand what the hubbub is about <wink>.

> Is there some way that people writing applications in Python can set
> up registry entries that are specific to their application (e.g.
> tabnanny.py) instead of only specific to the Python interpreter?

Yes, but they can't get Python to look at those before it's too late.  I
spent a whole evening a month or two ago just trying to figure out where all
the cruft in my Windows sys.path *came* from.  This is out-of-the-box; I
haven't added anything myself:

['',
 'D:\\Python\\win32',
 'D:\\Python\\win32\\lib',
 'D:\\Python',
 'D:\\Python\\Pythonwin',
 'D:\\Python\\Lib\\plat-win',
 'D:\\Python\\Lib',
 'D:\\Python\\DLLs',
 'D:\\Python\\Lib\\lib-tk',
 'D:\\PYTHON\\DLLs',
 'D:\\PYTHON\\lib',
 'D:\\PYTHON\\lib\\plat-win',
 'D:\\PYTHON\\lib\\lib-tk',
 'D:\\PYTHON']

That's bizarre on the face of it, and tracking it all down was draining.
I've forgotten the details.  I do remember concluding that it was impossible
to do what I wanted to do without changing the implementation, though, and
nobody on Python-Dev disputed that at the time.
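
The redundancy in that list is easier to see once entries are compared
the way Windows compares them.  A small sketch, assuming a modern
Python 3; find_duplicate_entries is a hypothetical helper and the
sample is a shortened copy of the list above.

```python
import ntpath


def find_duplicate_entries(paths):
    """Group path entries that differ only by case or separator style."""
    seen = {}
    for entry in paths:
        # ntpath applies Windows rules on any platform: it lowercases
        # the string and turns forward slashes into backslashes.
        key = ntpath.normcase(entry)
        seen.setdefault(key, []).append(entry)
    return {key: entries for key, entries in seen.items() if len(entries) > 1}


sample = [
    '',
    'D:\\Python\\win32',
    'D:\\Python\\Lib',
    'D:\\Python\\DLLs',
    'D:\\PYTHON\\DLLs',
    'D:\\PYTHON\\lib',
]

for key, entries in find_duplicate_entries(sample).items():
    print(key, entries)
```

This only reports the duplicates; it says nothing about which source
(registry, PYTHONPATH, or the directory of python.exe) each came from.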

In a pragmatic crunch, I wrote the little app I needed to distribute at the
time in Perl instead, meaning to come back to this.  I haven't had time.

IIRC, the ultimate problem wasn't really that Python looked at the registry
to get *some* path info, it was a combination of

A) It looked at the registry so early that it was impossible to stop it from
executing whatever site.py the registry pointed at (well, I could with
the -S option -- but then there was no way to get it to do the site.py that
was *wanted* instead).

B) No way to override what was in the registry; e.g., I was greatly
surprised to discover that setting a PYTHONPATH envar didn't override
anything, it simply plunked the PYTHONPATH entries into sys.path along with
everything else -- and too late to stop anything anyway.

In a long msg I haven't yet read all the way thru, Guido at least suggested
associating different registry path info with different Python versions.
That would address a number of otherwise currently intractable problems.

I suspect it still wouldn't help with the problem I was facing, though.
That is, I wanted to be able to tell people to run

\\dragres01\mrec\reduce\python \\dragres01\mrec\reduce\reduce.py

which is just a Windows way of saying "run a Python executable from a shared
network location".  When they tried that, though, the network Python looked
in *their* individual registries for its Python path info, and some of the
hackers with mondo customized Python setups on their own machines watched
things go down in flames.

This certainly can't be a common problem, but it speaks to an unforgiving
rigidity in the current approach.  There seemed to be nothing I could do to
guarantee this would work, short of telling users to edit their registries
before running this tool (that's a non-starter on Windows -- editing the
registry is dangerous) or putting a customized Python on the network
pointing to a bogus registry key (it was faster to write the app in Perl!
Perl doesn't *try* to be so infernally helpful <wink>, so doesn't get in the
way either).

I'm left wondering what purpose putting Python library path info into the
Windows registry serves.  Is there anyone on Windows who *doesn't* have
their Python Lib/ etc as direct subdirectories of the directory containing
python.exe?  Not that I've seen.  Python puts *those* in sys.path too -- but
only after it (in the normal case; see my sys.path above) pulls identically
redundant paths out of the registry first, or (in the cases we're griping
about) pulls irrelevant or downright harmful paths out of the registry first
(paths appropriate to the last Python you *installed*, not to the Python
that's *running*!).

Perhaps all this cruft is needed to support embedded Python, though
(something I've never done).

Regardless, I expect it would have been enough for me if PYTHONPATH simply
worked the way I mistakenly assumed it would (that is, this is sys.path, and
that's *it*; feel free to prepend the current directory when initialization
is complete, but before then looking at any file not reached from PYTHONPATH
is verboten).

the-cleverer-the-code-the-more-vital-that-there-be-a-way-to-
    short-circuit-it-ly y'rs  - tim


From jim at interet.com  Fri Dec 10 13:16:31 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 10 Dec 1999 07:16:31 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000501bf42eb$66529860$412d153f@tim>
Message-ID: <3850EF1F.158445B6@interet.com>

Tim Peters wrote:
> 
> [Skip Montanaro]
> > Is there some way that people writing applications in Python can set
> 
> Yes, but they can't get Python to look at those before it's too late.  I
> spent a whole evening a month or two ago just trying to figure out where all
> the cruft in my Windows sys.path *came* from.  This is out-of-the-box; I
> .....

Excellent discussion Tim!

> I suspect it still wouldn't help with the problem I was facing, though.
> That is, I wanted to be able to tell people to run
> 
> \\dragres01\mrec\reduce\python \\dragres01\mrec\reduce\reduce.py
> 
> which is just a Windows way of saying "run a Python executable from a shared
> network location".  When they tried that, though, the network Python looked
> in *their* individual registries for its Python path info, and some of the
> hackers with mondo customized Python setups on their own machines watched
> things go down in flames.

I think a sensible way to run little apps is to put everything
in an archive file including the main.py.  On Windows you
concatenate that to python.exe, and it Just Works.

> Windows registry serves.  Is there anyone on Windows who *doesn't* have
> their Python Lib/ etc as direct subdirectories of the directory containing
> python.exe?  Not that I've seen.

Point on the curve.  We don't.  We freeze everything except the main.py.

JimA


From jim at interet.com  Fri Dec 10 14:38:28 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 10 Dec 1999 08:38:28 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com>
Message-ID: <38510254.ED15D32B@interet.com>

Jean-Claude Wippler wrote:

> Ouch - what's wrong with zip archives?

> With all due respect - I sincerely hope you will reconsider and alter
> your code to work with zip files.  It's probably a small adjustment?

OK, you talked me into it.  Ya, small adjustment, no problem ;-)

JimA


From jack at oratrix.nl  Fri Dec 10 14:51:10 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Fri, 10 Dec 1999 14:51:10 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: Message by "James C. Ahlstrom" <jim@interet.com> ,
	     Thu, 09 Dec 1999 12:43:57 -0500 , <384FEA5D.A07F23EC@interet.com> 
Message-ID: <19991210135111.2F83C370CF2@snelboot.oratrix.nl>

Is it possible nowadays to have two files with the same name but different 
paths (i.e. foo/bar.py and foo/spam/bar.py) in the same archive?

That's the one thing that always struck me as very very silly about zipfiles.
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From gmcm at hypernet.com  Fri Dec 10 15:28:51 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Fri, 10 Dec 1999 09:28:51 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: <19991210135111.2F83C370CF2@snelboot.oratrix.nl>
References: Message by "James C. Ahlstrom" <jim@interet.com> ,	     Thu, 09 Dec 1999 12:43:57 -0500 , <384FEA5D.A07F23EC@interet.com> 
Message-ID: <1267287023-386248@hypernet.com>

Jack Jansen asks:

> Is it possible nowadays to have two files with the same name but
> different paths (i.e. foo/bar.py and foo/spam/bar.py) in the same
> archive?

Depends on how you do it.

If the user imports foo.spam.bar, an importer will be asked for:
  foo (return foo.__init__)
  foo.spam (return foo.spam.__init__)
  foo.spam.bar (return foo.spam.bar)

But the API allows lots of variations. This is another possible 
interaction:
  foo (return None)
  foo.__init__ (return foo.__init__)
  foo.spam (return None)
  foo.spam.__init__ (return foo.spam.__init__)
  foo.spam.bar (return foo.spam.bar)

Or, by looking at different args to get_code, you could look at 
the requests as:
  foo in context of None
  spam in context of foo
  bar in context of foo.spam
 
With another variation where the request for __init__ becomes 
explicit.

The first way seems the natural way for archives, and makes it 
easy to keep foo.spam.bar distinct from foo.bar.
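
A rough sketch of the first interaction, assuming an archive whose
member names are full slash-separated paths.  The function resolve()
and the member set are invented for illustration; this is not the
imputil API, just the name-to-path mapping it would need.

```python
def resolve(fullname, archive_names):
    """Map a dotted module name to an archive member path, or None."""
    base = fullname.replace('.', '/')
    package_init = base + '/__init__.py'
    if package_init in archive_names:
        return package_init      # a package: hand back its __init__
    module = base + '.py'
    if module in archive_names:
        return module            # a plain module
    return None


members = {
    'foo/__init__.py',
    'foo/bar.py',
    'foo/spam/__init__.py',
    'foo/spam/bar.py',
}

# Importing foo.spam.bar asks for each dotted prefix in turn.
parts = 'foo.spam.bar'.split('.')
for i in range(1, len(parts) + 1):
    name = '.'.join(parts[:i])
    print(name, '->', resolve(name, members))
```

Because the lookup key is the whole dotted name, two modules that share
a final component never collide.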

> That's the one thing that always struck me as very very silly
> about zipfiles.

Huh?

- Gordon


From guido at CNRI.Reston.VA.US  Fri Dec 10 15:51:39 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 10 Dec 1999 09:51:39 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: Your message of "Fri, 10 Dec 1999 14:51:10 +0100."
             <19991210135111.2F83C370CF2@snelboot.oratrix.nl> 
References: <19991210135111.2F83C370CF2@snelboot.oratrix.nl> 
Message-ID: <199912101451.JAA07786@eric.cnri.reston.va.us>

> Is it possible nowadays to have two files with the same name but different 
> paths (i.e. foo/bar.py and foo/spam/bar.py) in the same archive?
> 
> That's the one thing that always struck me as very very silly about zipfiles.

Zip files contain the full path, there's no problem with that.  Was
there ever?
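
For what it's worth, this is easy to check.  A sketch assuming a modern
Python 3 with the standard zipfile module; the file contents are made
up.

```python
import io
import zipfile

# Build an archive in memory with two members that share a base name.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as archive:
    archive.writestr('foo/bar.py', 'print("foo.bar")\n')
    archive.writestr('foo/spam/bar.py', 'print("foo.spam.bar")\n')

# Read it back: each member is stored and looked up by its full path.
with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()
    first = archive.read('foo/bar.py')
    second = archive.read('foo/spam/bar.py')

print(names)
```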

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jack at oratrix.nl  Fri Dec 10 15:52:26 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Fri, 10 Dec 1999 15:52:26 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy 
 )
In-Reply-To: Message by "Gordon McMillan" <gmcm@hypernet.com> ,
	     Fri, 10 Dec 1999 09:28:51 -0500 , <1267287023-386248@hypernet.com> 
Message-ID: <19991210145227.01F99370CF2@snelboot.oratrix.nl>

> Jack Jansen asks:
> 
> > Is it possible nowadays to have two files with the same name but
> > different paths (i.e. foo/bar.py and foo/spam/bar.py) in the same
> > archive?
> 
> Depends on how you do it.

Apparently I mis-phrased my question, I'll try again.

When people suggested to use zip format as the standard Python archive format 
I was a bit worried, because I've had it happen to me various times that I was 
unable to create a ZIP archive with two files with the same name but different 
paths (i.e. create an archive of a directory that contains both a foo/bar.py 
and a foo/spam/bar.py).

So, my question was: has this happened to me because the winzip I used was 
braindead, or is there possibly a problem with the ZIP file format that 
disallows two files with the same name in one archive? Most zip programs I've 
seen also seem to present filenames as the primary metaphor, with full 
pathnames somewhat "tacked on".

If the latter is the case I wonder whether zip is the right format to use...
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From guido at CNRI.Reston.VA.US  Fri Dec 10 16:00:51 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 10 Dec 1999 10:00:51 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: Your message of "Fri, 10 Dec 1999 15:52:26 +0100."
             <19991210145227.01F99370CF2@snelboot.oratrix.nl> 
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> 
Message-ID: <199912101500.KAA07863@eric.cnri.reston.va.us>

Again, the zip format does not have this problem.  Some zip tools may
-- then we don't use those.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake at acm.org  Fri Dec 10 16:40:21 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 10 Dec 1999 10:40:21 -0500 (EST)
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: <Pine.LNX.4.10.9912091630540.10472-100000@nebula.lyra.org>
References: <14416.11676.888918.511932@weyr.cnri.reston.va.us>
	<Pine.LNX.4.10.9912091630540.10472-100000@nebula.lyra.org>
Message-ID: <14417.7909.511437.230915@weyr.cnri.reston.va.us>

Greg Stein writes:
 > No need to do a new type. Just wrap the DIR* into a PyCObject. Add a magic
 > number if you're worried about mixing CObjects.

  That's certainly one option, but I would have made readdir(),
seekdir(), rewinddir() and closedir() into the methods read(), seek(), 
rewind() and close().  So it's a question of what interface you
prefer; functions with magically interpreted token parameters (kind of 
like file descriptors, hey!), or something that is more recognizably
object-oriented.
  I know my preference.  ;-)
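
  To make the object-oriented shape concrete, here is a pure-Python
stand-in, assuming a modern Python 3.  DirStream is a made-up name, and
it snapshots os.listdir() instead of wrapping a real DIR*, so it only
illustrates the method names, not the C-level behavior.

```python
import os
import tempfile


class DirStream:
    """Pure-Python stand-in for an object wrapping a DIR* handle."""

    def __init__(self, path):
        # Snapshot of the directory; a real readdir() reads lazily.
        self._entries = sorted(os.listdir(path))
        self._position = 0
        self._closed = False

    def _check_open(self):
        if self._closed:
            raise ValueError('operation on closed directory stream')

    def read(self):
        """Return the next entry name, or None at the end."""
        self._check_open()
        if self._position >= len(self._entries):
            return None
        name = self._entries[self._position]
        self._position += 1
        return name

    def seek(self, position):
        self._check_open()
        self._position = position

    def rewind(self):
        self.seek(0)

    def close(self):
        self._closed = True


with tempfile.TemporaryDirectory() as directory:
    for name in ('a.txt', 'b.txt'):
        open(os.path.join(directory, name), 'w').close()
    stream = DirStream(directory)
    first = stream.read()
    stream.rewind()
    again = stream.read()
    stream.close()
```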


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From mal at lemburg.com  Fri Dec 10 16:55:02 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 10 Dec 1999 16:55:02 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl>
Message-ID: <38512256.F9287E24@lemburg.com>

Jack Jansen wrote:
> 
> > Jack Jansen asks:
> >
> > > Is it possible nowadays to have two files with the same name but
> > > different paths (i.e. foo/bar.py and foo/spam/bar.py) in the same
> > > archive?
> >
> > Depends on how you do it.
> 
> Apparently I mis-phrased my question, I'll try again.
> 
> When people suggested to use zip format as the standard Python archive format
> I was a bit worried, because I've had it happen to me various times that I was
> unable to create a ZIP archive with two files with the same name but different
> paths (i.e. create an archive of a directory that contains both a foo/bar.py
> and a foo/spam/bar.py).
> 
> So, my question was: has this happened to me because the winzip I used was
> braindead, or is there possibly a problem with the ZIP file format that
> disallows two files with the same name in one archive? Most zip programs I've
> seen also seem to present filenames as the primary metaphor, with full
> pathnames somewhat "tacked on".
> 
> If the latter is the case I wonder whether zip is the right format to use...

Hmm, I've been doing the above for years now... never had a problem
with it (I use Info-ZIPs tools, BTW), e.g.

/home/lemburg> unzip -l projects/distribution/mxODBC-1.1.1.zip 
Archive:  projects/distribution/mxODBC-1.1.1.zip
  Length     Date   Time    Name
 --------    ----   ----    ----
   131316  06-09-99 14:10   ODBC/EasySoft/mxODBC.c
   131316  06-09-99 14:10   ODBC/Informix/mxODBC.c
   ...

Would be cool if I could use my packages as ZIP files :-) So
here's another vote for using the ZIP format.

BTW, wouldn't it make sense to include the zlib code
in the core distribution much like the pcre stuff is now ?
AFAIK, it is public domain and including it would remedy many of the
compatibility issues with the different zlib versions around.

Ok, that was my wish for Xmas :-) The rest is up to Mr. Clause...

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    21 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido at CNRI.Reston.VA.US  Fri Dec 10 17:04:24 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 10 Dec 1999 11:04:24 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: Your message of "Fri, 10 Dec 1999 16:55:02 +0100."
             <38512256.F9287E24@lemburg.com> 
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl>  
            <38512256.F9287E24@lemburg.com> 
Message-ID: <199912101604.LAA14100@eric.cnri.reston.va.us>

> BTW, wouldn't it make sense to include the zlib code
> in the core distribution much like the pcre stuff is now ?
> AFAIK, it is public domain and including it would remedy many of the
> compatibility issues with the different zlib versions around.

What compatibility issues?  Note that the Win32 distri already comes
with zlib statically linked into zlib.pyd.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal at lemburg.com  Fri Dec 10 17:15:48 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 10 Dec 1999 17:15:48 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl>  
	            <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us>
Message-ID: <38512734.CF6E4489@lemburg.com>

Guido van Rossum wrote:
> 
> > BTW, wouldn't it make sense to include the zlib code
> > in the core distribution much like the pcre stuff is now ?
> > AFAIK, it is public domain and including it would remedy many of the
> > compatibility issues with the different zlib versions around.
> 
> What compatibility issues?  Note that the Win32 distri already comes
> with zlib statically linked into zlib.pyd.

There were issues with zlib 1.0.4 and later ones. Also, many
Linux distributions don't have the zlib header files installed.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    21 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido at CNRI.Reston.VA.US  Fri Dec 10 17:19:47 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 10 Dec 1999 11:19:47 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: Your message of "Fri, 10 Dec 1999 17:15:48 +0100."
             <38512734.CF6E4489@lemburg.com> 
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us>  
            <38512734.CF6E4489@lemburg.com> 
Message-ID: <199912101619.LAA14174@eric.cnri.reston.va.us>

> There were issues with zlib 1.0.4 and later ones. Also, many
> Linux distributions don't have the zlib header files installed.

Hm.  I don't recall having any problems reported to me.  I'd rather
not include the entire zlib distri in the Python distri -- zlib
is rather big.  Adding only the Unix source would be cheating.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido at CNRI.Reston.VA.US  Fri Dec 10 17:25:23 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 10 Dec 1999 11:25:23 -0500
Subject: [Python-Dev] dbm clone with serious specs wanted
Message-ID: <199912101625.LAA14216@eric.cnri.reston.va.us>

Someone has asked me for a dbm clone that can store 16M keys of 350
bytes each, and runs on Linux, HPUX, and NT.  That's 5.6 Gigabyte in
keys alone!  I presume most classic approaches won't cut it since
total file size is typically limited by the seek system call, internal
data structures and/or file index format to 2Gb (signed longs) or 4Gb
(unsigned longs).
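
The arithmetic, spelled out as a quick check.  This reads "16M" as 16
million, which is an assumption; reading it as 16 * 2**20 only makes
the total larger.

```python
keys = 16 * 10**6            # "16M" read as 16 million (an assumption)
key_size = 350               # bytes per key
total = keys * key_size      # bytes needed for the keys alone

signed_limit = 2**31 - 1     # largest offset in a signed 32-bit long
unsigned_limit = 2**32 - 1   # largest offset in an unsigned 32-bit long

print(total)                    # 5600000000
print(total > signed_limit)     # True
print(total > unsigned_limit)   # True
```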

Does anyone have an idea where to start looking?  Would a Python
extension already exist?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From petrilli at amber.org  Fri Dec 10 17:29:27 1999
From: petrilli at amber.org (Christopher Petrilli)
Date: Fri, 10 Dec 1999 11:29:27 -0500
Subject: [Python-Dev] dbm clone with serious specs wanted
In-Reply-To: <199912101625.LAA14216@eric.cnri.reston.va.us>; from guido@CNRI.Reston.VA.US on Fri, Dec 10, 1999 at 11:25:23AM -0500
References: <199912101625.LAA14216@eric.cnri.reston.va.us>
Message-ID: <19991210112927.A14102@trump.amber.org>

Guido van Rossum [guido at CNRI.Reston.VA.US] wrote:
> Someone has asked me for a dbm clone that can store 16M keys of 350
> bytes each, and runs on Linux, HPUX, and NT.  That's 5.6 Gigabyte in
> keys alone!  I presume most classic approaches won't cut it since
> total file size is typically limited by the seek system call, internal
> data structures and/or file index format to 2Gb (signed longs) or 4Gb
> (unsigned longs).
> 
> Does anyone have an idea where to start looking?  Would a Python
> extension already exist?

Assuming you mean an interface to a ddbm-style situation, you could easily
use berkeley DB, I believe it is limited in the 4TB range...  

Chris
-- 
| Christopher Petrilli
| petrilli at amber.org


From mal at lemburg.com  Fri Dec 10 17:26:10 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 10 Dec 1999 17:26:10 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us>  
	            <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us>
Message-ID: <385129A2.6FAF4E81@lemburg.com>

Guido van Rossum wrote:
> 
> > There were issues with zlib 1.0.4 and later ones. Also, many
> > Linux distributions don't have the zlib header files installed.
> 
> Hm.  I don't recall having any problems reported to me.  I'd rather
> not include the entire zlib distri in the Python distri -- zlib
> is rather big.  Adding only the Unix source would be cheating.

How about only adding those parts which would be needed to
at least deflate the ZIP archive contents ?

If the ZIP archive format becomes the standard for Python, we'd
have to ensure that all Python users can read them. Well, at
least that's what I would expect from a standard format :-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    21 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido at CNRI.Reston.VA.US  Fri Dec 10 17:29:36 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 10 Dec 1999 11:29:36 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: Your message of "Fri, 10 Dec 1999 17:26:10 +0100."
             <385129A2.6FAF4E81@lemburg.com> 
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us>  
            <385129A2.6FAF4E81@lemburg.com> 
Message-ID: <199912101629.LAA14274@eric.cnri.reston.va.us>

> How about only adding those parts which would be needed to
> at least deflate the ZIP archive contents ?

Ditto -- still lots of portability issues I bet.

> If the ZIP archive format becomes the standard for Python, we'd
> have to ensure that all Python users can read them. Well, at
> least that's what I would expect from a standard format :-)

There's a simple solution: don't use compression.  With current disk
prices it's really not worth it.  Let the installer do the
decompression (installers travel across networks where compression
*is* worth it).
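
A sketch of what an uncompressed archive looks like, assuming a modern
Python 3 with the standard zipfile module.  The member name and payload
are invented; ZIP_STORED is the real zipfile constant for "no
compression", and it needs no zlib at all.

```python
import io
import zipfile

payload = b'import sys\n' * 1000


def stored_sizes(data):
    """Write one member uncompressed; return (file_size, compress_size)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, 'w', compression=zipfile.ZIP_STORED) as archive:
        archive.writestr('pkg/module.py', data)
    with zipfile.ZipFile(buffer) as archive:
        info = archive.getinfo('pkg/module.py')
        return info.file_size, info.compress_size


sizes = stored_sizes(payload)
print(sizes)
```

Both numbers come out equal, so a reader only has to locate the member
and copy its bytes.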

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin at mems-exchange.org  Fri Dec 10 17:34:09 1999
From: akuchlin at mems-exchange.org (Andrew M. Kuchling)
Date: Fri, 10 Dec 1999 11:34:09 -0500 (EST)
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: <38512734.CF6E4489@lemburg.com>
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl>
	<38512256.F9287E24@lemburg.com>
	<199912101604.LAA14100@eric.cnri.reston.va.us>
	<38512734.CF6E4489@lemburg.com>
Message-ID: <14417.11137.562474.99270@amarok.cnri.reston.va.us>

M.-A. Lemburg writes:
>There were issues with zlib 1.0.4 and later ones. Also, many
>Linux distributions don't have the zlib header files installed.

For example, on RH6.0, zlib.h and zlib.a are in zlib-devel.XXX.rpm,
and zlib.XXX.rpm only contains libz.so.  On the other hand, anyone
who's compiling Python should really have the various -devel RPMs
installed.  I'd argue against including it, because it might cause odd
versioning problems.  For example, what if I have PIL compiled against
zlib1.1.2 (zlib is used for writing PNGs) and the Python binary
includes zlib1.1.3?  There might be hard-to-debug problems
caused by calling the wrong symbol.

PCRE is a special case, because we've actually hacked the code a lot;
it's not the PCRE code as Philip Hazel distributes it.

Just received Guido's email suggesting skipping compression in
archives; not a bad idea.  You'd use less CPU, but might do
more I/O because you're reading more sectors off disk.  There
probably isn't much need for compression when the archive is on-disk;
Java needed it because of applets.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
The NSA response was, "Well, that was interesting, but there aren't any
ciphers like that."
    -- Gus Simmons, "The History of Subliminal Channels"


From petrilli at amber.org  Fri Dec 10 17:39:44 1999
From: petrilli at amber.org (Christopher Petrilli)
Date: Fri, 10 Dec 1999 11:39:44 -0500
Subject: [Python-Dev] dbm clone with serious specs wanted
In-Reply-To: <19991210112927.A14102@trump.amber.org>; from petrilli@amber.org on Fri, Dec 10, 1999 at 11:29:27AM -0500
References: <199912101625.LAA14216@eric.cnri.reston.va.us> <19991210112927.A14102@trump.amber.org>
Message-ID: <19991210113944.B14102@trump.amber.org>

Christopher Petrilli [petrilli at amber.org] wrote:
> Guido van Rossum [guido at CNRI.Reston.VA.US] wrote:
> > Does anyone have an idea where to start looking?  Would a Python
> > extension already exist?
> 
> Assuming you mean an interface to a ddbm-style situation, you could easily
> use berkeley DB, I believe it is limited in the 4TB range...  

I just did some checking... first Robin Dunn has an interface, but it's not
currently compatible with BerkeleyDB 3.x, which just came out... it shouldn't
be hard to retrofit.  Anyway, the limits are based on page size...

	512b page:	2TB
	64K page:	256TB

It uses 32bit numbers for pages, so I assume that is also a reflection
of the number of keys allowed... given I believe one key must use a minimum
of one page.
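
Those two figures follow from 32-bit page numbers.  A quick check,
assuming the limit is simply page count times page size and that the
units are binary (TiB); max_db_size is a made-up helper name.

```python
PAGE_COUNT = 2**32     # 32-bit page numbers, as described above
TIB = 2**40            # bytes in one binary terabyte


def max_db_size(page_size):
    """Upper bound on database size in bytes for a given page size."""
    return PAGE_COUNT * page_size


for page_size in (512, 64 * 1024):
    print(page_size, max_db_size(page_size) // TIB, 'TiB')

# If every key needs at least one page, the key count is capped at
# PAGE_COUNT, which is still far above 16 million.
print(16 * 10**6 < PAGE_COUNT)
```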

I know that I've pushed earlier releases to around 50Gb without trouble,
but you might see issues related to the number of keys.  I'd ask Sleepycat
directly, as they're amazingly responsive.

Chris
-- 
| Christopher Petrilli
| petrilli at amber.org


From mal at lemburg.com  Fri Dec 10 17:37:30 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 10 Dec 1999 17:37:30 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us>  
	            <385129A2.6FAF4E81@lemburg.com> <199912101629.LAA14274@eric.cnri.reston.va.us>
Message-ID: <38512C4A.ADB63C2B@lemburg.com>

Guido van Rossum wrote:
> 
> > How about only adding those parts which would be needed to
> > at least deflate the ZIP archive contents ?
> 
> Ditto -- still lots of portability issues I bet.

Hmm, not sure: zlib is pretty portable. It's the interface
changes that can break code, not so much the zlib portability.
 
> > If the ZIP archive format becomes the standard for Python, we'd
> > have to ensure that all Python users can read them. Well, at
> > least that's what I would expect from a standard format :-)
> 
> There's a simple solution: don't use compression.  With current disk
> prices it's really not worth it.  Let the installer do the
> decompression (installers travel across networks where compression
> *is* worth it).

That's a possibility, right. It would still let us use the many
ZIP tools while not adding complexity to the core.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    21 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal at lemburg.com  Fri Dec 10 17:43:11 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 10 Dec 1999 17:43:11 +0100
Subject: [Python-Dev] dbm clone with serious specs wanted
References: <199912101625.LAA14216@eric.cnri.reston.va.us>
Message-ID: <38512D9F.2AE9DC8B@lemburg.com>

Guido van Rossum wrote:
> 
> Someone has asked me for a dbm clone that can store 16M keys of 350
> bytes each, and runs on Linux, HPUX, and NT.  That's 5.6 Gigabyte in
> keys alone!  I presume most classic approaches won't cut it since
> total file size is typically limited by the seek system call, internal
> data structures and/or file index format to 2Gb (signed longs) or 4Gb
> (unsigned longs).
> 
> Does anyone have an idea where to start looking?  Would a Python
> extension already exist?

I'd suggest using a dbm style wrapper around the DB-API and then
trying out the many cross-platform databases. IBM DB2 comes to
mind... it can certainly handle these sizes given the right
hardware.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    21 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From fdrake at acm.org  Fri Dec 10 18:35:01 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 10 Dec 1999 12:35:01 -0500 (EST)
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: <199912100203.VAA07410@eric.cnri.reston.va.us>
References: <38503BDC.CB91FB29@digicool.com>
	<199912100203.VAA07410@eric.cnri.reston.va.us>
Message-ID: <14417.14789.306365.439782@weyr.cnri.reston.va.us>

Guido van Rossum writes:
 > Since we seem to be on an adding spree, I don't see why not -- as long
 > as POSIX keeps it available :)

  fsync() isn't listed in O'Reilly's POSIX book, so it's probably not
in the POSIX spec.  Neither is the tempnam() function I added in
yesterday's spree, though tmpfile() and tmpnam() are.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From jim at digicool.com  Fri Dec 10 19:37:53 1999
From: jim at digicool.com (Jim Fulton)
Date: Fri, 10 Dec 1999 18:37:53 +0000
Subject: [Python-Dev] Thankyou for fsync :)
References: <38503BDC.CB91FB29@digicool.com>
		<199912100203.VAA07410@eric.cnri.reston.va.us> <14417.14789.306365.439782@weyr.cnri.reston.va.us>
Message-ID: <38514881.5C124E36@digicool.com>

"Fred L. Drake, Jr." wrote:
> 
> Guido van Rossum writes:
>  > Since we seem to be on an adding spree, I don't see why not -- as long
>  > as POSIX keeps it available :)
> 
>   fsync() isn't listed in O'Reilly's POSIX book, so it's probably not
> in the POSIX spec. 

It's not, it's in XPG3 (sp?), but I wasn't going to bring that up. ;)

I'd still like it to stay, where available. :)

Jim

--
Jim Fulton           mailto:jim at digicool.com
Technical Director   (888) 344-4332              Python Powered!
Digital Creations    http://www.digicool.com     http://www.python.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission.  Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.


From fdrake at acm.org  Fri Dec 10 19:36:44 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 10 Dec 1999 13:36:44 -0500 (EST)
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: <38514881.5C124E36@digicool.com>
References: <38503BDC.CB91FB29@digicool.com>
	<199912100203.VAA07410@eric.cnri.reston.va.us>
	<14417.14789.306365.439782@weyr.cnri.reston.va.us>
	<38514881.5C124E36@digicool.com>
Message-ID: <14417.18492.932392.608912@weyr.cnri.reston.va.us>

Jim Fulton writes:
 > It's not, it's in XPG3 (sp?), but I wasn't going to bring that up. ;)

  I don't have that one, but I certainly don't have any plans on
ripping out fsync().  Not today, at any rate.  ;)


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From jim at interet.com  Fri Dec 10 19:37:50 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 10 Dec 1999 13:37:50 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210135111.2F83C370CF2@snelboot.oratrix.nl>
Message-ID: <3851487E.F610BE17@interet.com>

Jack Jansen wrote:
> 
> Is it possible nowadays to have two files with the same name but different
> paths (i.e. foo/bar.py and foo/spam/bar.py) in the same archive?

Yes, I just made one with WinZip.

JimA


From gmcm at hypernet.com  Fri Dec 10 19:41:56 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Fri, 10 Dec 1999 13:41:56 -0500
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: <38514881.5C124E36@digicool.com>
Message-ID: <1267271840-1299809@hypernet.com>

Fred L. Drake, Jr. wrote:
> 
> Guido van Rossum writes:
>  > Since we seem to be on an adding spree, I don't see why not
>  > -- as long as POSIX keeps it available :)
> 
>   fsync() isn't listed in O'Reilly's POSIX book, so it's
>   probably not
> in the POSIX spec. 
> 

It's in the other O'Reilly POSIX book, p 348 of POSIX.4.

- Gordon


From fdrake at acm.org  Fri Dec 10 19:43:56 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 10 Dec 1999 13:43:56 -0500 (EST)
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: <1267271840-1299809@hypernet.com>
References: <38514881.5C124E36@digicool.com>
	<1267271840-1299809@hypernet.com>
Message-ID: <14417.18924.461115.906914@weyr.cnri.reston.va.us>

Gordon McMillan writes:
 > It's in the other O'Reilly POSIX book, p 348 of POSIX.4.

  Ah, I don't have that either.  I thought POSIX.4 was real-time
stuff.
  (If anyone wants to send a copy along, I'd be glad to consider
adding reasonable interfaces for Python. ;)


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From jim at interet.com  Fri Dec 10 19:43:18 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 10 Dec 1999 13:43:18 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl>
Message-ID: <385149C6.DF942F36@interet.com>

Jack Jansen wrote:

> When people suggested to use zip format as the standard Python archive format
> I was a bit worried, because I've had it happen to me various times that I was
> unable to create a ZIP archive with two files with the same name but different
> paths (i.e. create an archive of a directory that contains both a foo/bar.py
> and a foo/spam/bar.py).

No problem.

But most zip tools will create an archive with either no
path (file name is "bar.py") or the full path (file name is "foo/bar.py").
If the paths are different it is OK; I'm not sure about duplicate bare names.
The difference is an option and has nothing to do with how the
file name is specified to the utility.

JimA


From jim at interet.com  Fri Dec 10 19:48:47 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 10 Dec 1999 13:48:47 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us>  
		            <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <385129A2.6FAF4E81@lemburg.com>
Message-ID: <38514B0F.84A546C6@interet.com>

"M.-A. Lemburg" wrote:

> How about only adding those parts which would be needed to
> at least deflate the ZIP archive contents ?
> 
> If the ZIP archive format becomes the standard for Python, we'd
> have to ensure that all Python users can read them. Well, at
> least that's what I would expect from a standard format :-)

I think that for now we will need to create archives with
compression method zero: no compression.  That is a valid
compression method all ZIP utilities support.  The point is that
zlib just isn't part of Python.
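
A minimal sketch of the "method zero" idea, assuming the zipfile interface
found in today's standard library (ZipFile, writestr and ZIP_STORED); the
module discussed in this thread had not been posted yet, so its exact API may
have differed. The member names are hypothetical. Two members share the base
name "bar.py" under different paths.

```python
import io
import zipfile


def build_stored_archive(members):
    """Return the bytes of a ZIP archive whose members are stored uncompressed."""
    buffer = io.BytesIO()
    # ZIP_STORED is compression method 0, so zlib is never needed.
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as archive:
        for name, data in members.items():
            archive.writestr(name, data)
    return buffer.getvalue()


# Hypothetical package contents, for illustration only.
members = {
    "foo/__init__.py": b"",
    "foo/bar.py": b"print('foo.bar')\n",
    "foo/spam/bar.py": b"print('foo.spam.bar')\n",
}
archive_bytes = build_stored_archive(members)

# Read the archive back and list each member with its compression method.
with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
    for info in archive.infolist():
        print(info.filename, info.compress_type, info.file_size)
```

Because nothing is compressed, a reader only has to parse the ZIP directory
and copy bytes, so the archive stays usable on installations without zlib.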

Jim


From jcw at equi4.com  Fri Dec 10 19:57:00 1999
From: jcw at equi4.com (Jean-Claude Wippler)
Date: Fri, 10 Dec 1999 19:57:00 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us>  
			            <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <385129A2.6FAF4E81@lemburg.com> <38514B0F.84A546C6@interet.com>
Message-ID: <38514CFC.47C8A8E0@equi4.com>

"James C. Ahlstrom" wrote:
[...]
> I think that for now we will need to create archives with
> compression method zero: no compression.  That is a valid
> compression method all ZIP utilities support.

Sounds good.  This is also exactly how Java started out with jar.

-jcw


From gmcm at hypernet.com  Fri Dec 10 20:06:59 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Fri, 10 Dec 1999 14:06:59 -0500
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: <14417.18924.461115.906914@weyr.cnri.reston.va.us>
References: <1267271840-1299809@hypernet.com>
Message-ID: <1267270337-1390160@hypernet.com>

Fred wrote:
 
> Gordon McMillan writes:
>  > It's in the other O'Reilly POSIX book, p 348 of POSIX.4.
> 
>   Ah, I don't have that either.  I thought POSIX.4 was real-time
> stuff.

Well, it says it is, but having done some stuff with automated 
warehouses, I'm always amazed at how people will use the 
term "real-time". I'd say "pretty likely to be responsive" ;-).

>   (If anyone wants to send a copy along, I'd be glad to consider
> adding reasonable interfaces for Python. ;)

Only around 70 documented functions, but many of them 
appear to be tweaks, or redocumenting stuff in view of new 
kernel behaviors.

- Gordon


From fdrake at acm.org  Fri Dec 10 20:18:16 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 10 Dec 1999 14:18:16 -0500 (EST)
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: <1267270337-1390160@hypernet.com>
References: <1267271840-1299809@hypernet.com>
	<1267270337-1390160@hypernet.com>
Message-ID: <14417.20984.151867.630871@weyr.cnri.reston.va.us>

Gordon McMillan writes:
 > Well, it says it is, but having done some stuff with automated 
 > warehouses, I'm always amazed at how people will use the 
 > term "real-time". I'd say "pretty likely to be responsive" ;-).

  Oh, a manager's interpretation of real-time:  "I want this by close
of business next Wednesday!"

 > Only around 70 documented functions, but many of them 
 > appear to be tweaks, or redocumenting stuff in view of new 
 > kernel behaviors.

  Anything that should be added anywhere?  Failing all else, I can
probably read the man pages if I know what to look for.  ;-)


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From fdrake at acm.org  Fri Dec 10 22:40:29 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 10 Dec 1999 16:40:29 -0500 (EST)
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us>
References: <199912091632.LAA09236@amarok.cnri.reston.va.us>
Message-ID: <14417.29517.238124.767279@weyr.cnri.reston.va.us>

Andrew M. Kuchling writes:
 > fpathconf(fd, name) -- Get configuration limit for a file
...
 > pathconf(path, name) -- Gets config variables for a path
...
 > sysconf(int name) -- Gets system configuration information
 > 	    -- would need constants from unistd.h

  I'm almost done with these, and also confstr (from POSIX.2).  I
don't have time to finish them today; I'll check them in next week.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From skip at mojam.com  Sat Dec 11 00:20:21 1999
From: skip at mojam.com (Skip Montanaro)
Date: Fri, 10 Dec 1999 17:20:21 -0600 (CST)
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: <14417.18924.461115.906914@weyr.cnri.reston.va.us>
References: <38514881.5C124E36@digicool.com>
	<1267271840-1299809@hypernet.com>
	<14417.18924.461115.906914@weyr.cnri.reston.va.us>
Message-ID: <14417.35509.284749.924066@dolphin.mojam.com>

    Fred> I thought POSIX.4 was real-time stuff.

This all seems to be happening in real-time to me... ;-)

Skip Montanaro | http://www.mojam.com/
skip at mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...


From andy at robanal.demon.co.uk  Sat Dec 11 01:11:28 1999
From: andy at robanal.demon.co.uk (Andy Robinson)
Date: Sat, 11 Dec 1999 00:11:28 GMT
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: <199912101619.LAA14174@eric.cnri.reston.va.us>
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us>   <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us>
Message-ID: <38519531.15439641@post.demon.co.uk>

On Fri, 10 Dec 1999 11:19:47 -0500, you wrote:

>> There were issues with zlib 1.0.4 and later ones. Also, many
>> Linux distributions don't have the zlib header files installed.
>
>Hm.  I don't recall having any problems reported to me.  I'd rather
>not include the entire zlib distri in the Python distri -- zlib
>is rather big.  Adding only the Unix source would be cheating.
>
Minor data point on the importance of zlib.  I spent a long time
figuring out what Adobe PDF's "flate filter" was before I discovered
it was the inverse of "deflate" (yes, there were loud sounds of
head-slapping when I clicked) and discovered that zlib.compress() was
EXACTLY what you need to create compressed streams in PDF documents.
Being a Windows person, I naively assumed zlib was in the standard
distribution everywhere, and subsequently discovered Mac and Unix
users were not so happy.  So if you want to make PDFs, having zlib
around is very useful indeed...
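
A rough sketch of what this looks like in practice, assuming zlib is
available. The helper names and the page content are made up for
illustration, and a real PDF writer would also have to emit object numbers
and a cross-reference table, which are left out here.

```python
import zlib


def flate_encode(data):
    """Compress data the way a PDF /FlateDecode filter expects to decode it."""
    return zlib.compress(data)


def flate_decode(data):
    """Inverse of flate_encode."""
    return zlib.decompress(data)


# Hypothetical page content: the same text operator repeated a few times.
content = b"BT /F1 12 Tf 72 720 Td (Hello) Tj ET\n" * 20
encoded = flate_encode(content)

# Wrap the compressed bytes in a PDF stream dictionary.
stream = (
    b"<< /Length %d /Filter /FlateDecode >>\nstream\n" % len(encoded)
    + encoded
    + b"\nendstream\n"
)
print(len(content), len(encoded))
```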

- Andy


From akuchlin at mems-exchange.org  Sat Dec 11 01:35:58 1999
From: akuchlin at mems-exchange.org (Andrew M. Kuchling)
Date: Fri, 10 Dec 1999 19:35:58 -0500 (EST)
Subject: [Python-Dev] Enabling more modules by default
In-Reply-To: <38519531.15439641@post.demon.co.uk>
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl>
	<38512256.F9287E24@lemburg.com>
	<199912101604.LAA14100@eric.cnri.reston.va.us>
	<38512734.CF6E4489@lemburg.com>
	<199912101619.LAA14174@eric.cnri.reston.va.us>
	<38519531.15439641@post.demon.co.uk>
Message-ID: <14417.40046.850655.491684@amarok.cnri.reston.va.us>

Andy Robinson writes:
>...  So if you want to make PDFs, having zlib
>around is very useful indeed...

This raises a good point, though I still dislike the idea of including
the zlib library.  It would be nice if Setup.in would be autogenerated
to compile all the modules it can -- bsddb if it finds libdb, zlib if
it finds libz.a.  I vaguely recall once working on a Python script that
would generate a customized Setup.in file, though I can't find it at
the moment.  Given that someone has already suggested automatically
enabling threads on those platforms that support it, why not go all
the way?

(But a Python script that generates a Setup.in isn't going to work,
unless we compile a minipython first and then create a more complete
Setup file.)
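
For what it's worth, the detection step could look roughly like the sketch
below. Everything in it is an assumption made for illustration: the
module-to-library table, the simplified Setup line format, and the use of
ctypes.util.find_library (a present-day standard library helper) as the
probe. It also needs a working Python to run, which is exactly the bootstrap
problem mentioned above.

```python
from ctypes.util import find_library

# Hypothetical table: optional extension module -> library it links against.
OPTIONAL_MODULES = {"zlib": "z", "bsddb": "db"}


def detect_modules(candidates, finder=find_library):
    """Return the module names whose library the finder can locate."""
    enabled = []
    for module, library in sorted(candidates.items()):
        # find_library returns None when the library is not found.
        if finder(library) is not None:
            enabled.append(module)
    return enabled


def setup_lines(modules, candidates=OPTIONAL_MODULES):
    """Format simplified Setup-style lines for the enabled modules."""
    return ["%s %smodule.c -l%s" % (m, m, candidates[m]) for m in modules]


for line in setup_lines(detect_modules(OPTIONAL_MODULES)):
    print(line)
```

Passing the finder in as a parameter keeps the probe replaceable, e.g. by
something that checks for headers as well as the library itself.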

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
The most merciful thing in the world... is the inability of the human mind to
correlate all its contents.
    -- H.P. Lovecraft


From petrilli at amber.org  Sat Dec 11 06:54:41 1999
From: petrilli at amber.org (Christopher Petrilli)
Date: Sat, 11 Dec 1999 00:54:41 -0500
Subject: [Python-Dev] Enabling more modules by default
In-Reply-To: <14417.40046.850655.491684@amarok.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Fri, Dec 10, 1999 at 07:35:58PM -0500
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <38519531.15439641@post.demon.co.uk> <14417.40046.850655.491684@amarok.cnri.reston.va.us>
Message-ID: <19991211005441.A20923@trump.amber.org>

Andrew M. Kuchling [akuchlin at mems-exchange.org] wrote:
> Andy Robinson writes:
> >...  So if you want to make PDFs, having zlib
> >around is very useful indeed...
> 
> This raises a good point, though I still dislike the idea of including
> the zlib library.  It would be nice if Setup.in would be autogenerated
> to compile all the modules it can -- bsddb if it finds libdb, zlib if
> it finds libz.a.  I vaguely recall once working on a Python script that
> would generate a customized Setup.in file, though I can't find it at
> the moment.  Given that someone has already suggested automatically
> enabling threads on those platforms that support it, why not go all
> the way?

Well, one warning about BSDdb is that it comes in 3 incarnations that 
all might be -ldb :-):

	1.85
	2.x
	3.x

and they are NOT compatible with each other.  1.85 has serious brain damage,
and honestly I'd love to see Robin Dunn's 2.x work rolled in to replace it,
but not sure how viable that is---people might actually want the 1.85 breakage.

Chris
-- 
| Christopher Petrilli
| petrilli at amber.org


From gstein at lyra.org  Sat Dec 11 12:23:30 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 11 Dec 1999 03:23:30 -0800 (PST)
Subject: [Python-Dev] Zip format
In-Reply-To: <1267287023-386248@hypernet.com>
Message-ID: <Pine.LNX.4.10.9912110321010.16305-100000@nebula.lyra.org>

On Fri, 10 Dec 1999, Gordon McMillan wrote:
>...
> If the user imports foo.spam.bar, an importer will be asked for:
>   foo (return foo.__init__)
>   foo.spam (return foo.bar.__init__)

                         ^^^ foo.spam.__init__

>   foo.spam.bar (return foo.spam.bar)

The above sequence is what currently happens.

> But the API allows lots of variations. This is another possible 
> interaction:
>   foo (return None)
>   foo.__init__ (return foo.__init__)
>   foo.spam (return None)
>   foo.bar.__init__ (return foo.bar.__init__)
>   foo.spam.bar (return foo.spam.bar)

The core of imputil has no knowledge of the __init__ thingy. That is
specific to the filesystem-based stuff. So in this sense, "possible" means
"imputil could be changed to do this". I would argue against the change,
however :-)

> Or, by looking at different args to get_code, you could look at 
> the requests as:
>   foo in context of None
>   spam in context of foo
>   bar in context of foo.spam

Bing!

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein at lyra.org  Sat Dec 11 12:26:59 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 11 Dec 1999 03:26:59 -0800 (PST)
Subject: [Python-Dev] Zip format
In-Reply-To: <14417.11137.562474.99270@amarok.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912110323510.16305-100000@nebula.lyra.org>

On Fri, 10 Dec 1999, Andrew M. Kuchling wrote:
> M.-A. Lemburg writes:
> >There were issues with zlib 1.0.4 and later ones. Also, many
> >Linux distributions don't have the zlib header files installed.
> 
> For example, on RH6.0, zlib.h and zlib.a are in zlib-devel.XXX.rpm,
> and zlib.XXX.rpm only contains libz.so.  On the other hand, anyone
> who's compiling Python should really have the various -devel RPMs

Exactly. The distros *have* the headers -- it all depends on what you
installed. I happen to have the headers on my system (because I installed
zlib-devel, as AMK mentions).

> installed.  I'd argue against including it, because it might cause odd
> versioning problems.  For example, what if I have PIL compiled against
> zlib1.1.2 (zlib is used for writing PNGs) and the Python binary
> includes zlib1.1.3?  There might be hard-to-debug problems
> caused by calling the wrong symbol.

I totally agree.

>...
> Just received Guido's email suggesting skipping compression in
> archives; not a bad idea.  You'd use less CPU, but might do
> more I/O because you're reading more sectors off disk.  There
> probably isn't much need for compression when the archive is on-disk;
> Java needed it because of applets.

There are all kinds of things that we can do here. Consider mmap'ing the
archive into a shared memory segment, used by all the Python processes on
the system... woo! :-)

IMO, the standard distro can use zip files, and just bail if they are
compressed, but Python cannot load zlib. Obvious failure with an obvious
remedy. No big deal.

As Guido also mentions, an installer can just bring along zlib if they
want to use a compressed archive. i.e. their choice.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein at lyra.org  Sat Dec 11 12:33:47 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 11 Dec 1999 03:33:47 -0800 (PST)
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: <14417.7909.511437.230915@weyr.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912110332360.16305-100000@nebula.lyra.org>

On Fri, 10 Dec 1999, Fred L. Drake, Jr. wrote:
> Greg Stein writes:
>  > No need to do a new type. Just wrap the DIR* into a PyCObject. Add a magic
>  > number if you're worried about mixing CObjects.
> 
>   That's certainly one option, but I would have made readdir(),
> seekdir(), rewinddir() and closedir() into the methods read(), seek(), 
> rewind() and close().  So it's a question of what interface you
> prefer; functions with magically interpreted token parameters (kind of 
> like file descriptors, hey!), or something that is more recognizably
> object-oriented.
>   I know my preference.  ;-)

Well, I know my preference of those two alternatives, too :-), but if
we're going with the Pythonic minimalism, then I'd think you would expose
the functions "as close as possible."

Would I argue if you went with a method-based approach? No :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From fredrik at pythonware.com  Sat Dec 11 14:07:08 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat, 11 Dec 1999 14:07:08 +0100
Subject: [Python-Dev] Zip format
References: <Pine.LNX.4.10.9912110323510.16305-100000@nebula.lyra.org>
Message-ID: <005f01bf43d8$a22b0af0$f29b12c2@secret.pythonware.com>

Greg Stein <gstein at lyra.org> wrote:
> There are all kinds of things that we can do here. Consider mmap'ing the
> archive into a shared memory segment, used by all the Python processes on
> the system... woo! :-)

it doesn't really look like this, but I hope we're defining
interfaces here, and not just "one true solution".  I'd be
very annoyed if it turned out that we couldn't use works'
archives with the new standard importer...

> As Guido also mentions, an installer can just bring along zlib if they
> want to use a compressed archive. i.e. their choice.

in the pythonworks universe, the installer and the
application are the same thing...

</F>


From fredrik at pythonware.com  Sat Dec 11 14:12:12 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat, 11 Dec 1999 14:12:12 +0100
Subject: [Python-Dev] Thankyou for fsync :)
References: <38503BDC.CB91FB29@digicool.com><199912100203.VAA07410@eric.cnri.reston.va.us> <14417.14789.306365.439782@weyr.cnri.reston.va.us>
Message-ID: <006c01bf43d9$57bc0f90$f29b12c2@secret.pythonware.com>

Fred L. Drake, Jr. <fdrake at acm.org> wrote:
>   fsync() isn't listed in O'Reilly's POSIX book, so it's probably not
> in the POSIX spec.  Neither is the tempnam() function I added in
> yesterday's spree, though tmpfile() and tmpnam() are.

instead of guessing, you can get a complete
list from:

http://www.unix-systems.org/apis.html

reading up on the "single unix specification"
should also help:

http://www.unix-systems.org/online.html

(registration required; contains complete man
pages for all functions covered by the UNIX95
and UNIX98 specification)

</F>


From gstein at lyra.org  Sat Dec 11 14:10:00 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 11 Dec 1999 05:10:00 -0800 (PST)
Subject: [Python-Dev] Zip format
In-Reply-To: <005f01bf43d8$a22b0af0$f29b12c2@secret.pythonware.com>
Message-ID: <Pine.LNX.4.10.9912110505580.16305-100000@nebula.lyra.org>

On Sat, 11 Dec 1999, Fredrik Lundh wrote:
> Greg Stein <gstein at lyra.org> wrote:
> > There are all kinds of things that we can do here. Consider mmap'ing the
> > archive into a shared memory segment, used by all the Python processes on
> > the system... woo! :-)
> 
> it doesn't really look like this, but I hope we're defining
> interfaces here, and not just "one true solution".  I'd be

Oh, I was just having fun there :-). I don't see "one true solution" at
all. Just some standards.

> very annoyed if it turned out that we couldn't use works'
> archives with the new standard importer...

get_code() and its processing is not going anywhere. Some stuff will
change under the covers, and we'll be using sys.path (typically) rather
than chaining (although chaining will still exist!).

I would think that your Importer subclass would be directly usable, but
the installation could/would be a bit different. Heck, worst case, nothing
is going to invalidate your archive format -- feel free to berate me if I
ever break that!

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From jim at interet.com  Mon Dec 13 15:50:11 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 13 Dec 1999 09:50:11 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <38510254.ED15D32B@interet.com>
Message-ID: <385507A3.9F6AAF0F@interet.com>

> Jean-Claude Wippler wrote:
> 
> > Ouch - what's wrong with zip archives?
> 
> > With all due respect - I sincerely hope you will reconsider and alter
> > your code to work with zip files.  It's probably a small adjustment?

OK, I now have a new module "zipfile" which reads and
writes ZIP files.  It is written in Python and has been tested
on Windows and Linux.  I tested it with WinZip and found that
the files it creates are read OK with WinZip, and WinZip
files are read OK with zipfile.  So I am withdrawing my
Python archive file format, and re-writing all my stuff
using zipfile.  It should all be done in a week.

Basically everything works fine.  But there are some problems.

Python seems to lack a CRC-32 function, so I wrote one
in Python.  It is slow.  We need to add a CRC-32 function
to some Python built-in module that is always present, like
md5 or binascii.  The zlib module is not necessarily present.

I can't seem to get WinZip to record a partial path.  That is,
I want the ./Lib/test package to have these ZIP paths:
  test/__init__.pyc
  test/testall.pyc
  ...
but WinZip creates files with either no path at all or the
fully specified path.  Am I missing something?  Do all
other ZIP tools do this too?
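
When the archive is written from Python rather than from a GUI tool, the
stored path can be chosen per file. A minimal sketch, assuming the zipfile
interface of today's standard library, where ZipFile.write() takes an
arcname argument; the add_package helper and the placeholder files are
hypothetical.

```python
import os
import tempfile
import zipfile


def add_package(archive, package_dir):
    """Add every file under package_dir, storing paths relative to its parent."""
    parent = os.path.dirname(os.path.abspath(package_dir))
    for dirpath, _dirnames, filenames in os.walk(package_dir):
        for filename in sorted(filenames):
            full_path = os.path.abspath(os.path.join(dirpath, filename))
            # arcname is the path recorded in the archive, e.g. "test/testall.pyc".
            arcname = os.path.relpath(full_path, parent).replace(os.sep, "/")
            archive.write(full_path, arcname)


with tempfile.TemporaryDirectory() as root:
    # Build a stand-in for ./Lib/test with two placeholder files.
    package_dir = os.path.join(root, "Lib", "test")
    os.makedirs(package_dir)
    for name in ("__init__.pyc", "testall.pyc"):
        with open(os.path.join(package_dir, name), "wb") as handle:
            handle.write(b"placeholder")

    archive_path = os.path.join(root, "lib.zip")
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_STORED) as archive:
        add_package(archive, package_dir)

    with zipfile.ZipFile(archive_path) as archive:
        stored_names = sorted(archive.namelist())

print(stored_names)
```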

JimA


Message-Id: <199912131522.KAA18858 at eric.cnri.reston.va.us>
To: "James C. Ahlstrom" <jim at interet.com>
Subject: Re: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-reply-to: Your message of "Mon, 13 Dec 1999 09:50:11 EST."
             <385507A3.9F6AAF0F at interet.com> 
References: <000301bf4206$b39e5b80$36a2143f at tim> <384FC47A.BB4DA517 at interet.com> <384FDAF5.C25C447C at equi4.com> <38510254.ED15D32B at interet.com>  
            <385507A3.9F6AAF0F at interet.com> 
Date: Mon, 13 Dec 1999 10:22:12 -0500
From: Guido van Rossum <guido at CNRI.Reston.VA.US>

> OK, I now have a new module "zipfile" which reads and
> writes ZIP files.  It is written in Python and has been tested
> on Windows and Linux.  I tested it with WinZip and found that
> the files it creates are read OK with WinZip, and WinZip
> files are read OK with zipfile.  So I am withdrawing my
> Python archive file format, and re-writing all my stuff
> using zipfile.  It should all be done in a week.

Ah, good!  (This saves me the trouble of cleaning up our own zip code :-)

> Basically everything works fine.  But there are some problems.
> 
> Python seems to lack a CRC-32 function, so I wrote one
> in Python.  It is slow.  We need to add a CRC-32 function
> to some Python built-in module that is always present, like
> md5 or binascii.  The zlib module is not necessarily present.
> 
> I can't seem to get WinZip to record a partial path.  That is,
> I want the ./Lib/test package to have these ZIP paths:
>   test/__init__.pyc
>   test/testall.pyc
>   ...
> but WinZip creates files with either no path at all or the
> fully specified path.  Am I missing something?  Do all
> other ZIP tools do this too?

Unclick the "Save Extra Folder Info" and then drag the *parent* folder
into the archive.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake at acm.org  Mon Dec 13 18:00:26 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 13 Dec 1999 12:00:26 -0500 (EST)
Subject: [Python-Dev] confstr(), fpathconf(), pathconf(), sysconf()
Message-ID: <14421.9770.623399.673010@weyr.cnri.reston.va.us>

  I've just checked in bindings for these POSIX.1 and POSIX.2
functions, and thought I'd explain the interfaces for those who don't
want to read the diffs.  ;)
  These functions expect a "name" parameter (that's how it's described 
in the man pages and the O'Reilly book).  The value for "name" is an
integer that's defined in the system headers.  The constants all have
the form

    _XX_SOME_NAME

where XX is PC for fpathconf()- and pathconf()-related names, SC for
sysconf()-related names, and CS for confstr()-related names.  Some
names are defined by the standards, but additional names are defined
by implementations (there are a *lot* of sysconf() names under
Solaris!).
  We don't want to expose enormous numbers of constants in the
module's interface, however, as there are already a lot of names in
the posix module.  That would also slow down module initialization.
We also don't want to force callers to use magic numbers in code that 
uses these functions, especially since the values may be
system-specific.
  The best way to call these functions, then, is to use a *string*
that corresponds to the name of the C #define symbol with the leading 
underscore stripped off.  For example, to get the length of the
arguments to exec(), you could say:

    num_args = os.sysconf("SC_ARG_MAX")

  The string will be mapped to the appropriate numeric value defined
in an internal table.  If the name isn't defined for the platform, a
ValueError will be raised.

    >>> num_args = os.sysconf("FOO_BAR")
    Traceback (innermost last):
      File "<stdin>", line 1, in ?
    ValueError: unrecognized configuration name

  To allow retrieval of platform-dependent configuration information, 
integers can also be passed in.  On Solaris, this is equivalent to
using "SC_ARG_MAX":

    num_args = os.sysconf(1)

(Ignoring the portability and readability issues, ha!)
  There are three separate tables used for this; one for confstr(),
one for sysconf(), and one shared by fpathconf() and pathconf().  The
names used to build the tables come from Linux and Solaris; we can add 
other names as needed.  To add names, I'd need the names to add and
how to test for their existence at compile time (#ifdef, etc.).
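
  A small usage sketch for code that has to run on more than one platform.
The lookup_sysconf() helper is hypothetical, not part of the module; it
relies only on os.sysconf() raising ValueError for an unrecognized name, as
described above. Catching OSError for names the host knows but does not
support, and the os.sysconf_names mapping, are assumptions based on
present-day Python. os.sysconf() is only available on Unix-like systems.

```python
import os


def lookup_sysconf(name, default=None):
    """Return os.sysconf(name), or default if the name is unknown or unsupported."""
    try:
        return os.sysconf(name)
    except (ValueError, OSError):
        # ValueError: the name is not in the internal table.
        # OSError: the name is known but the host does not support it.
        return default


arg_max = lookup_sysconf("SC_ARG_MAX")
bogus = lookup_sysconf("FOO_BAR")
print("SC_ARG_MAX:", arg_max)
print("FOO_BAR:", bogus)

# The string names accepted on this platform (assumption: os.sysconf_names
# exists, as it does in current Python versions).
print(len(os.sysconf_names), "sysconf names known here")
```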


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From fdrake at acm.org  Mon Dec 13 19:35:49 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 13 Dec 1999 13:35:49 -0500 (EST)
Subject: [Python-Dev] CVS: python/dist/src/Modules posixmodule.c,2.116,2.117
In-Reply-To: <Pine.LNX.4.10.9912131025480.16305-100000@nebula.lyra.org>
References: <199912131637.LAA17318@weyr.cnri.reston.va.us>
	<Pine.LNX.4.10.9912131025480.16305-100000@nebula.lyra.org>
Message-ID: <14421.15493.28263.387680@weyr.cnri.reston.va.us>

Greg Stein writes:
 > I'm not very familiar with these APIs, but should you let go of the
 > interpreter lock when you call them?
 > (and for the other new funcs)

  None of these should be doing any I/O as far as I can determine.
Whenever I get to getlogin() (which AMK & I decided should be
included, based on the specs that /F pointed us to), I will release
the interpreter lock for the getlogin_r() variant.  I'm not sure I
should release it for the non-reentrant getlogin(), however; the
specification for getlogin*() pretty much requires that it read from
utmp.  ;(


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From gstein at lyra.org  Mon Dec 13 21:31:22 1999
From: gstein at lyra.org (Greg Stein)
Date: Mon, 13 Dec 1999 12:31:22 -0800 (PST)
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <385507A3.9F6AAF0F@interet.com>
Message-ID: <Pine.LNX.4.10.9912131229590.16305-100000@nebula.lyra.org>

On Mon, 13 Dec 1999, James C. Ahlstrom wrote:
>...
> OK, I now have a new module "zipfile" which reads and
> writes ZIP files.  It is written in Python and has been tested
> on Windows and Linux.  I tested it with WinZip and found that
> the files it creates are read OK with WinZip, and WinZip
> files are read OK with zipfile.  So I am withdrawing my
> Python archive file format, and re-writing all my stuff
> using zipfile.  It should all be done in a week.

Can you post zipfile.py so that people can start reviewing that?

>...
> Python seems to lack a CRC-32 function, so I wrote one
> in Python.  It is slow.  We need to add a CRC-32 function
> to some Python built-in module that is always present, like
> md5 or binascii.  The zlib module is not necessarily present.

See zlib.crc32()

This is interesting, of course, because we have previously stated that
zlib (and its compression) is optional. But if we need the CRC-32
function...

hehe...

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From tim_one at email.msn.com  Mon Dec 13 23:11:33 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Mon, 13 Dec 1999 17:11:33 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <385507A3.9F6AAF0F@interet.com>
Message-ID: <000401bf45b7$04edfaa0$96a2143f@tim>

[James C. Ahlstrom]
> ...
> Python seems to lack a CRC-32 function, so I wrote one
> in Python.  It is slow.  We need to add a CRC-32 function
> to some Python built-in module that is always present, like
> md5 or binascii.  The zlib module is not necessarily present.

Unfortunately, there are many different CRC functions in common use.  None
belong in md5; if the intent is to support just zip's version, adding a
(say) zipcrc32 function to binascii would be ok; if we expect to support
others as well, a new parameterized crc module would be in order.

> I can't seem to get WinZip to record a partial path.  That is,
> I want the ./Lib/test package to have these ZIP paths:
>   test/__init__.pyc
>   test/testall.pyc
>   ...
> but WinZip creates files with either no path at all or the
> fully specified path.  Am I missing something?  Do all
> other ZIP tools do this too?

No, it's a clumsiness unique to WinZip (damn GUIs <0.9 wink>).  In the Add
dialog box, you need to cd to the *Lib* directory, check the "Save extra
folder info" box, and then, e.g.,

1. Put
      test\*.pyc
   in the Add Files line, and click Add With Wildcards.
   Then all test\*.pyc files will be added, with paths test/__init__.pyc
   etc.

or

2. Put
      "test\__init__.pyc" "test\testall.pyc"
   (including the quotes!) in the Add Files line, and click Add.

Since #2 can be unbearable, other useful strategies include:

3. Use #1 (e.g. with dir\*.*) then delete the files you didn't really
   want.

4. Use #1 repeatedly, cleverly using a number of wildcard patterns that
   cover the files of interest.

5. Mixtures of #3 and #4.

6. Use a command-line zip tool instead (e.g., pkzip; I think WinZip has
   an "experimental" cmdline add-on too, but haven't tried it).
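
For comparison with the WinZip steps above, here is a minimal sketch of
recording partial paths from a script.  It assumes a modern Python 3 and
the zipfile module in today's standard library, which is not necessarily
the API of the zipfile.py discussed in this thread.  The helper name
zip_with_partial_paths and the placeholder file contents are invented
for illustration.

```python
import os
import tempfile
import zipfile


def zip_with_partial_paths(base_dir, subdir, archive_path):
    """Add every file under base_dir/subdir, storing names relative to base_dir."""
    top = os.path.join(base_dir, subdir)
    with zipfile.ZipFile(archive_path, "w") as zf:
        for root, _dirs, files in os.walk(top):
            for name in sorted(files):
                full = os.path.join(root, name)
                # arcname is the path recorded inside the archive; making it
                # relative to base_dir gives "test/..." rather than a full path.
                arcname = os.path.relpath(full, base_dir).replace(os.sep, "/")
                zf.write(full, arcname)


# Build a stand-in for the Lib directory with a small "test" package in it.
with tempfile.TemporaryDirectory() as lib:
    os.mkdir(os.path.join(lib, "test"))
    for name in ("__init__.pyc", "testall.pyc"):
        with open(os.path.join(lib, "test", name), "wb") as f:
            f.write(b"placeholder")
    archive = os.path.join(lib, "out.zip")
    zip_with_partial_paths(lib, "test", archive)
    with zipfile.ZipFile(archive) as zf:
        stored_names = sorted(zf.namelist())

print(stored_names)
```

The second argument to write() plays the role of the "Save extra folder
info" box: it decides which part of the path ends up in the archive.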


From jim at interet.com  Tue Dec 14 14:13:03 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Tue, 14 Dec 1999 08:13:03 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <Pine.LNX.4.10.9912131229590.16305-100000@nebula.lyra.org>
Message-ID: <3856425F.8C5E7A42@interet.com>

Greg Stein wrote:
> 

> Can you post zipfile.py so that people can start reviewing that?

Yes, it will be available by next Monday.  I just want to
get it really working and pretty, and with documentation.

JimA


From jim at interet.com  Tue Dec 14 14:26:50 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Tue, 14 Dec 1999 08:26:50 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000401bf45b7$04edfaa0$96a2143f@tim>
Message-ID: <3856459A.BF5A798A@interet.com>

Tim Peters wrote:
> 
> [James C. Ahlstrom]
> > ...
> > Python seems to lack a CRC-32 function, so I wrote one
>
> Unfortunately, there are many different CRC functions in common use.  None
> belong in md5; if the intent is to support just zip's version, adding a
> (say) zipcrc32 function to binascii would be ok; if we expect to support
> others as well, a new parameterized crc module would be in order.

OK, a CRC-32 in binascii it is.  The CRC-32 I
have comes with these comments which seem to indicate it is a
more "official standard" CRC-32 than average:

# *  Crc - 32 BIT ANSI X3.66 CRC checksum files
#*********************************************************************\
#*                                                                    *|
#* Demonstration program to compute the 32-bit CRC used as the frame  *|
#* check sequence in ADCCP (ANSI X3.66, also known as FIPS PUB 71     *|
#* and FED-STD-1003, the U.S. versions of CCITT's X.25 link-level     *|
#* protocol).  The 32-bit FCS was added via the Federal Register,     *|
#* 1 June 1982, p.23798.  I presume but don't know for certain that   *|
#* this polynomial is or will be included in CCITT V.41, which        *|
#* defines the 16-bit CRC (often called CRC-CCITT) polynomial.  FIPS  *|
#* PUB 78 says that the 32-bit FCS reduces otherwise undetected       *|
#* errors by a factor of 10^-5 over 16-bit FCS.                       *|
#*                                                                    *|
#*********************************************************************
#* Copyright (C) 1986 Gary S. Brown.  You may use this program, or
#* code or tables extracted from it, as desired without restriction.
 
I can submit this as a patch to binascii, or if the Copyright bothers
anyone, maybe it is better for Guido to use his CRC-32 from his ZIP
code.  Preference?

> > I can't seem to get WinZip to record a partial path.  That is,
>
> dialog box, you need to cd to the *Lib* directory, check the "Save extra
> folder info" box, and then, e.g.,

Thanks.  I knew there had to be some magic incantation to do it.
 
> 6. Use a command-line zip tool instead (e.g., pkzip; I think WinZip has
>    an "experimental" cmdline add-on too, but haven't tried it).

Actually pkzip 2.04g doesn't work because it writes names in upper case
and is limited to 8.3 names (I think).  My zipfile.py can be used as
a basis for a command line tool.  Actually I use makefiles with imbedded
Python programs and find this easier than command line tools.

JimA


From guido at CNRI.Reston.VA.US  Tue Dec 14 15:53:04 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 14 Dec 1999 09:53:04 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Tue, 14 Dec 1999 08:26:50 EST."
             <3856459A.BF5A798A@interet.com> 
References: <000401bf45b7$04edfaa0$96a2143f@tim>  
            <3856459A.BF5A798A@interet.com> 
Message-ID: <199912141453.JAA23429@eric.cnri.reston.va.us>

> OK, a CRC-32 in binascii it is.  The CRC-32 I
> have comes with these comments which seem to indicate it is a
> more "official standard" CRC-32 than average:
> 
> # *  Crc - 32 BIT ANSI X3.66 CRC checksum files
> #*********************************************************************\
> #*                                                                    *|
> #* Demonstration program to compute the 32-bit CRC used as the frame  *|
> #* check sequence in ADCCP (ANSI X3.66, also known as FIPS PUB 71     *|
> #* and FED-STD-1003, the U.S. versions of CCITT's X.25 link-level     *|
> #* protocol).  The 32-bit FCS was added via the Federal Register,     *|
> #* 1 June 1982, p.23798.  I presume but don't know for certain that   *|
> #* this polynomial is or will be included in CCITT V.41, which        *|
> #* defines the 16-bit CRC (often called CRC-CCITT) polynomial.  FIPS  *|
> #* PUB 78 says that the 32-bit FCS reduces otherwise undetected       *|
> #* errors by a factor of 10^-5 over 16-bit FCS.                       *|
> #*                                                                    *|
> #*********************************************************************
> #* Copyright (C) 1986 Gary S. Brown.  You may use this program, or
> #* code or tables extracted from it, as desired without restriction.
>  
> I can submit this as a patch to binascii, or if the Copyright bothers
> anyone, maybe it is better for Guido to use his CRC-32 from his ZIP
> code.  Preference?

I looked, but "my" crc32 in the zlib module (which was actually
contributed by Andrew Kuchling) is just a wrapper around the crc32
function in zlib, which is copyrighted by Mark Adler and follows the
zlib rules.

I propose to use Gary Brown's code.  I'll defend this to CNRI's
lawyers if need be.

Jim, have you checked that this is the right CRC to use for zip's CRC?
(This in the light of Tim's assertion that there are many CRCs around.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim at interet.com  Tue Dec 14 16:22:56 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Tue, 14 Dec 1999 10:22:56 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000401bf45b7$04edfaa0$96a2143f@tim>  
	            <3856459A.BF5A798A@interet.com> <199912141453.JAA23429@eric.cnri.reston.va.us>
Message-ID: <385660D0.C6C0C7B9@interet.com>

Guido van Rossum wrote:

> I propose to use Gary Brown's code.  I'll defend this to CNRI's
> lawyers if need be.
> 
> Jim, have you checked that this is the right CRC to use for zip's CRC?
> (This in the light of Tim's assertion that there are many CRCs around.)

The CRC it calculates agrees with the CRC of WinZip for all
files I have tried.  The original Gary Brown code was much
longer and included file reading.  Here is the shortened version:

JimA


# *  Crc - 32 BIT ANSI X3.66 CRC checksum files
#*********************************************************************\
#*                                                                    *|
#* Demonstration program to compute the 32-bit CRC used as the frame  *|
#* check sequence in ADCCP (ANSI X3.66, also known as FIPS PUB 71     *|
#* and FED-STD-1003, the U.S. versions of CCITT's X.25 link-level     *|
#* protocol).  The 32-bit FCS was added via the Federal Register,     *|
#* 1 June 1982, p.23798.  I presume but don't know for certain that   *|
#* this polynomial is or will be included in CCITT V.41, which        *|
#* defines the 16-bit CRC (often called CRC-CCITT) polynomial.  FIPS  *|
#* PUB 78 says that the 32-bit FCS reduces otherwise undetected       *|
#* errors by a factor of 10^-5 over 16-bit FCS.                       *|
#*                                                                    *|
#*********************************************************************

#
#* Copyright (C) 1986 Gary S. Brown.  You may use this program, or
#* code or tables extracted from it, as desired without restriction.
 
# First, the polynomial itself and its table of feedback terms.  The  
# polynomial is                                                       
# X^32+X^26+X^23+X^22+X^16+X^12+X^11+X^10+X^8+X^7+X^5+X^4+X^2+X^1+X^0 
# Note that we take it "backwards" and put the highest-order term in  
# the lowest-order bit.  The X^32 term is "implied"; the LSB is the   
# X^31 term, etc.  The X^0 term (usually shown as "+1") results in    
# the MSB being 1.                                                    

# Note that the usual hardware shift register implementation, which   
# is what we're using (we're merely optimizing it by doing eight-bit  
# chunks at a time) shifts bits into the lowest-order term.  In our   
# implementation, that means shifting towards the right.  Why do we   
# do it this way?  Because the calculated CRC must be transmitted in  
# order from highest-order term to lowest-order term.  UARTs transmit 
# characters in order from LSB to MSB.  By storing the CRC this way,  
# we hand it to the UART in the order low-byte to high-byte; the UART 
# sends each low-bit to high-bit; and the result is transmission bit
# by bit from highest- to lowest-order term without requiring any bit 
# shuffling on our part.  Reception works similarly.                  

# The feedback terms table consists of 256, 32-bit entries.  Notes:   
#                                                                     
#  1. The table can be generated at runtime if desired; code to do so 
#     is shown later.  It might not be obvious, but the feedback      
#     terms simply represent the results of eight shift/xor opera-    
#     tions for all combinations of data and CRC register values.     
#                                                                     
#  2. The CRC accumulation logic is the same for all CRC polynomials, 
#     be they sixteen or thirty-two bits wide.  You simply choose the 
#     appropriate table.  Alternatively, because the table can be     
#     generated at runtime, you can start by generating the table for 
#     the polynomial in question and use exactly the same "updcrc",   
#     if your application needn't simultaneously handle two CRC       
#     polynomials.  (Note, however, that XMODEM is strange.)          
#                                                                     
#  3. For 16-bit CRCs, the table entries need be only 16 bits wide;   
#     of course, 32-bit entries work OK if the high 16 bits are zero. 
#                                                                     
#  4. The values must be right-shifted by eight bits by the "updcrc"  
#     logic; the shift must be unsigned (bring in zeroes).  On some   
#     hardware you could probably optimize the shift in assembler by  
#     using byte-swap instructions.                                   

# Converted to Python by James C. Ahlstrom

crc_32_tab = [	# CRC polynomial 0xedb88320
0x00000000, 0x77073096, 0xee0e612c, 0x990951ba, 0x076dc419, 0x706af48f,
0xe963a535, 0x9e6495a3,
0x0edb8832, 0x79dcb8a4, 0xe0d5e91e, 0x97d2d988, 0x09b64c2b, 0x7eb17cbd,
0xe7b82d07, 0x90bf1d91,
0x1db71064, 0x6ab020f2, 0xf3b97148, 0x84be41de, 0x1adad47d, 0x6ddde4eb,
0xf4d4b551, 0x83d385c7,
0x136c9856, 0x646ba8c0, 0xfd62f97a, 0x8a65c9ec, 0x14015c4f, 0x63066cd9,
0xfa0f3d63, 0x8d080df5,
0x3b6e20c8, 0x4c69105e, 0xd56041e4, 0xa2677172, 0x3c03e4d1, 0x4b04d447,
0xd20d85fd, 0xa50ab56b,
0x35b5a8fa, 0x42b2986c, 0xdbbbc9d6, 0xacbcf940, 0x32d86ce3, 0x45df5c75,
0xdcd60dcf, 0xabd13d59,
0x26d930ac, 0x51de003a, 0xc8d75180, 0xbfd06116, 0x21b4f4b5, 0x56b3c423,
0xcfba9599, 0xb8bda50f,
0x2802b89e, 0x5f058808, 0xc60cd9b2, 0xb10be924, 0x2f6f7c87, 0x58684c11,
0xc1611dab, 0xb6662d3d,
0x76dc4190, 0x01db7106, 0x98d220bc, 0xefd5102a, 0x71b18589, 0x06b6b51f,
0x9fbfe4a5, 0xe8b8d433,
0x7807c9a2, 0x0f00f934, 0x9609a88e, 0xe10e9818, 0x7f6a0dbb, 0x086d3d2d,
0x91646c97, 0xe6635c01,
0x6b6b51f4, 0x1c6c6162, 0x856530d8, 0xf262004e, 0x6c0695ed, 0x1b01a57b,
0x8208f4c1, 0xf50fc457,
0x65b0d9c6, 0x12b7e950, 0x8bbeb8ea, 0xfcb9887c, 0x62dd1ddf, 0x15da2d49,
0x8cd37cf3, 0xfbd44c65,
0x4db26158, 0x3ab551ce, 0xa3bc0074, 0xd4bb30e2, 0x4adfa541, 0x3dd895d7,
0xa4d1c46d, 0xd3d6f4fb,
0x4369e96a, 0x346ed9fc, 0xad678846, 0xda60b8d0, 0x44042d73, 0x33031de5,
0xaa0a4c5f, 0xdd0d7cc9,
0x5005713c, 0x270241aa, 0xbe0b1010, 0xc90c2086, 0x5768b525, 0x206f85b3,
0xb966d409, 0xce61e49f,
0x5edef90e, 0x29d9c998, 0xb0d09822, 0xc7d7a8b4, 0x59b33d17, 0x2eb40d81,
0xb7bd5c3b, 0xc0ba6cad,
0xedb88320, 0x9abfb3b6, 0x03b6e20c, 0x74b1d29a, 0xead54739, 0x9dd277af,
0x04db2615, 0x73dc1683,
0xe3630b12, 0x94643b84, 0x0d6d6a3e, 0x7a6a5aa8, 0xe40ecf0b, 0x9309ff9d,
0x0a00ae27, 0x7d079eb1,
0xf00f9344, 0x8708a3d2, 0x1e01f268, 0x6906c2fe, 0xf762575d, 0x806567cb,
0x196c3671, 0x6e6b06e7,
0xfed41b76, 0x89d32be0, 0x10da7a5a, 0x67dd4acc, 0xf9b9df6f, 0x8ebeeff9,
0x17b7be43, 0x60b08ed5,
0xd6d6a3e8, 0xa1d1937e, 0x38d8c2c4, 0x4fdff252, 0xd1bb67f1, 0xa6bc5767,
0x3fb506dd, 0x48b2364b,
0xd80d2bda, 0xaf0a1b4c, 0x36034af6, 0x41047a60, 0xdf60efc3, 0xa867df55,
0x316e8eef, 0x4669be79,
0xcb61b38c, 0xbc66831a, 0x256fd2a0, 0x5268e236, 0xcc0c7795, 0xbb0b4703,
0x220216b9, 0x5505262f,
0xc5ba3bbe, 0xb2bd0b28, 0x2bb45a92, 0x5cb36a04, 0xc2d7ffa7, 0xb5d0cf31,
0x2cd99e8b, 0x5bdeae1d,
0x9b64c2b0, 0xec63f226, 0x756aa39c, 0x026d930a, 0x9c0906a9, 0xeb0e363f,
0x72076785, 0x05005713,
0x95bf4a82, 0xe2b87a14, 0x7bb12bae, 0x0cb61b38, 0x92d28e9b, 0xe5d5be0d,
0x7cdcefb7, 0x0bdbdf21,
0x86d3d2d4, 0xf1d4e242, 0x68ddb3f8, 0x1fda836e, 0x81be16cd, 0xf6b9265b,
0x6fb077e1, 0x18b74777,
0x88085ae6, 0xff0f6a70, 0x66063bca, 0x11010b5c, 0x8f659eff, 0xf862ae69,
0x616bffd3, 0x166ccf45,
0xa00ae278, 0xd70dd2ee, 0x4e048354, 0x3903b3c2, 0xa7672661, 0xd06016f7,
0x4969474d, 0x3e6e77db,
0xaed16a4a, 0xd9d65adc, 0x40df0b66, 0x37d83bf0, 0xa9bcae53, 0xdebb9ec5,
0x47b2cf7f, 0x30b5ffe9,
0xbdbdf21c, 0xcabac28a, 0x53b39330, 0x24b4a3a6, 0xbad03605, 0xcdd70693,
0x54de5729, 0x23d967bf,
0xb3667a2e, 0xc4614ab8, 0x5d681b02, 0x2a6f2b94, 0xb40bbe37, 0xc30c8ea1,
0x5a05df1b, 0x2d02ef8d
]


def crc32(string):
  crc = 0xFFFFFFFF
  for ch in string:
    crc = crc_32_tab[((crc) ^ ord(ch)) & 0xff] ^ (((crc) >> 8) & 0xFFFFFF)
  return ~crc
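
Note 1 in the comments above says the table can be generated at runtime,
but the generating code is not part of this shortened version.  A
minimal sketch of one way to do it follows, assuming a modern Python 3;
the function name make_crc_table is invented for illustration and this
is not Gary Brown's original code.

```python
def make_crc_table(poly=0xEDB88320):
    """Build the 256-entry feedback table for a bit-reversed CRC polynomial."""
    table = []
    for n in range(256):
        c = n
        # Eight shift/xor steps, one per bit of the input byte.
        for _ in range(8):
            if c & 1:
                c = (c >> 1) ^ poly
            else:
                c >>= 1
        table.append(c)
    return table


generated_table = make_crc_table()
print(", ".join("0x%08x" % entry for entry in generated_table[:4]))
```

The entries can be compared against the literal table above; for
example, entry 128 comes out as the polynomial itself, 0xedb88320.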


From tim_one at email.msn.com  Tue Dec 14 18:06:36 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 14 Dec 1999 12:06:36 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <199912141453.JAA23429@eric.cnri.reston.va.us>
Message-ID: <000101bf4655$94e40840$3a2d153f@tim>

[Guido]
> I propose to use Gary Brown's code.  I'll defend this to CNRI's
> lawyers if need be.

If there's a hassle, I can do a clean-room implementation easily enough --
although I'd rather not.

> Jim, have you checked that this is the right CRC to use for zip's CRC?

If WinZip unzips Jim's files without griping, the odds that he's got the
wrong CRC are about 1 in 2**36 <wink>.

> (This in the light of Tim's assertion that there are many CRCs
> around.)

There are, and several others are hiding in assorted communications stds
(e.g., Ethernet uses a different 32-bit CRC); but the zip CRC is the one
you'll find most commonly described on the Web.

All the same, once Jim releases his code, I'll do an anal verification that
it's the right one.


From jim at interet.com  Tue Dec 14 18:54:35 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Tue, 14 Dec 1999 12:54:35 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000101bf4655$94e40840$3a2d153f@tim>
Message-ID: <3856845B.6C3C7330@interet.com>

Tim Peters wrote:

> If WinZip unzips Jim's files without griping, the odds that he's got the
> wrong CRC are about 1 in 2**36 <wink>.

You mean 2**32, right?  Oh, sorry, you must be
using a DEC-10  <wink again>.

JimA


From gstein at lyra.org  Tue Dec 14 20:23:36 1999
From: gstein at lyra.org (Greg Stein)
Date: Tue, 14 Dec 1999 11:23:36 -0800 (PST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <3856425F.8C5E7A42@interet.com>
Message-ID: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org>

On Tue, 14 Dec 1999, James C. Ahlstrom wrote:

> Greg Stein wrote:
> > 
> 
> > Can you post zipfile.py so that people can start reviewing that?
> 
> Yes, it will be available by next Monday.  I just want to
> get it really working and pretty, and with documentation.

My point was that people could possibly use it *before* then. Not
everybody needs it to be pretty, needs doc, or needs it fully working.
Maybe people would like to provide feedback on the API. Maybe they'd like
to start their own modules that use your library.

This goes back to my years-old statement: release it now rather than later
-- people can always use it now, and there might not be a later.

Release early. Release often. :-)

People are too hesitant to release code. Why? Just send it out there. When
you update it, send out another. It doesn't hurt anybody to have more than
one release.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From tim_one at email.msn.com  Wed Dec 15 05:20:25 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 14 Dec 1999 23:20:25 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <3856845B.6C3C7330@interet.com>
Message-ID: <000501bf46b3$b6184f40$05a0143f@tim>

[Tim]
> If WinZip unzips Jim's files without griping, the odds that he's
> got the wrong CRC are about 1 in 2**36 <wink>.

[JimA]
> You mean 2**32, right?

Nope!  For each of the 2**32 polynomials you may have pulled out of thin
air, there are about a dozen common variations in the details of CRC
algorithms.  For example, a CRC used for hashing usually initializes "the
register" to 0, but a CRC used to protect against transmission errors
usually initializes to a block of 1 bits (since leading zeroes don't affect
the result, and a common transmission error is dropping a prefix of the
msg).  Similarly, algorithms vary in the order they scan the data; in
whether they use the raw data or its complement; and in whether they return
the actual remainder, the complement of the remainder, or a checksum
cleverly computed so that "the other end" always sees a fixed remainder
other than 0 (or ~0).
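
To make two of those variations concrete, here is a small sketch
(assuming a modern Python 3) that keeps the zip polynomial fixed and
varies only the initial register value and the final complement.  The
function name crc32_variant and its parameters are invented for
illustration; it is a bit-at-a-time loop, not the table-driven code
posted earlier.

```python
import zlib


def crc32_variant(data, init, xor_out, poly=0xEDB88320):
    """Bit-at-a-time CRC over bytes, right-shifting, with a reversed polynomial."""
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
    return crc ^ xor_out


message = b"123456789"

# zip style: register starts as all 1 bits, result is complemented.
zip_style = crc32_variant(message, init=0xFFFFFFFF, xor_out=0xFFFFFFFF)
# "Hashing" style: register starts at 0, raw remainder returned.
zero_init = crc32_variant(message, init=0, xor_out=0)
# All-ones start, but the raw remainder returned.
no_final = crc32_variant(message, init=0xFFFFFFFF, xor_out=0)

# With a zero start, a dropped (or added) prefix of zero bytes goes unnoticed.
zero_init_padded = crc32_variant(b"\0\0" + message, init=0, xor_out=0)
# With an all-ones start, the same change alters the result.
zip_style_padded = crc32_variant(b"\0\0" + message, init=0xFFFFFFFF,
                                 xor_out=0xFFFFFFFF)

print(hex(zip_style), hex(zero_init), hex(no_final))
print(zip_style == zlib.crc32(message))
```

Same polynomial, three different answers for the same data, which is why
agreeing on the polynomial alone does not pin down "the" CRC.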

> Oh, sorry, you must be using a DEC-10  <wink again>.

I used a Univac 1108 in college, back when ASCII was in its infancy.  They
couldn't decide on the natural size for a character, so the 36-bit 1108
could be configured to treat each word as either 6 6-bit bytes or 4 9-bit
ones.  If they had been thinking ahead, they would have defined it as two
Unicode characters plus a 4-bit tag field for the Python implementation to
play with <wink>.

now-they-make-their-living-suing-.gif-bandits-ly y'rs  - tim


From tim_one at email.msn.com  Wed Dec 15 08:40:11 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 15 Dec 1999 02:40:11 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <385660D0.C6C0C7B9@interet.com>
Message-ID: <000b01bf46cf$9ebe27e0$05a0143f@tim>

[JimA posts his Python rendering of Gary Brown's code]

Yup!  That's the zip algorithm, right down to the absurdly bit-reversed
polynomial.

> def crc32(string):
>   crc = 0xFFFFFFFF
>   for ch in string:
>     crc = crc_32_tab[((crc) ^ ord(ch)) & 0xff] ^ (((crc) >> 8) &
> 0xFFFFFF)
>   return ~crc

Note that the last line is better (whether in Python or C!) as

    return crc ^ 0xffffffff

Else you'll get a surprising result in a 64-bit Python, and in some 64-bit C
implementations.
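
A short sketch of the difference, assuming a modern Python 3 where
integers are unbounded (so the effect shows up on any platform, not only
64-bit ones).  The helper name crc_register is invented for
illustration, and it uses a bit-at-a-time loop rather than the posted
table.

```python
import zlib


def crc_register(data, poly=0xEDB88320):
    """Return the raw 32-bit CRC register, before the final complement."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
    return crc


register = crc_register(b"hello world")

with_invert = ~register            # complements every bit, not just the low 32
with_xor = register ^ 0xFFFFFFFF   # complements exactly 32 bits

print(with_invert)
print(with_xor)
```

Here with_invert comes out negative, while with_xor stays in the range
0 to 2**32 - 1 and matches what zlib.crc32() returns in Python 3.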

it's-a-32-bit-algorithm-not-an-"int"-or-"long"-one-ly y'rs  - tim


From fredrik at pythonware.com  Wed Dec 15 10:31:29 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 15 Dec 1999 10:31:29 +0100
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000101bf4655$94e40840$3a2d153f@tim>
Message-ID: <002601bf46e0$06e25ca0$f29b12c2@secret.pythonware.com>

> [Guido]
> > I propose to use Gary Brown's code.  I'll defend this to CNRI's
> > lawyers if need be.
> 
> If there's a hassle, I can do a clean-room implementation easily enough --
> although I'd rather not.

or you can grab the code from PIL, which already
comes with a Python compatible license...

(it's based on ISO 3307, but judging from the table
James posted, it's the same thing...)

</F>


From fredrik at pythonware.com  Wed Dec 15 10:39:19 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 15 Dec 1999 10:39:19 +0100
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000b01bf46cf$9ebe27e0$05a0143f@tim>
Message-ID: <003001bf46e0$43860b20$f29b12c2@secret.pythonware.com>

Tim Peters <tim_one at email.msn.com> wrote:
> Yup!  That's the zip algorithm, right down to the absurdly bit-reversed
> polynomial.

also known as ISO 3307, according to some
strange comments in PIL's sources...

</F>


From jim at interet.com  Wed Dec 15 16:53:34 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Wed, 15 Dec 1999 10:53:34 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org>
Message-ID: <3857B97E.3684224F@interet.com>

Greg Stein wrote:

> Release early. Release often. :-)

You are right of course.  OK, the zipfile.py code and docs are at:

  ftp://ftp.interet.com/pub/pylib.html

Despite the ftp URL, clicking on it should display the html.

Please don't panic if it seems to be slow.  It uses a Python CRC-32
which is slow.  You may want to hack it to use zlib.crc32() if you
have it.

I am testing with WinZip.  If you have another zip tool, it
would be interesting to see how compatible it is.

JimA


From guido at CNRI.Reston.VA.US  Wed Dec 15 17:38:47 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 15 Dec 1999 11:38:47 -0500
Subject: [Python-Dev] Writers wanted for Linux Journal Python special issue
Message-ID: <199912151638.LAA02522@eric.cnri.reston.va.us>

Linux Journal is preparing a special issue devoted to Python (actually
more like a pullout section or whatever I think).  They are looking
for writers, e.g. to write a piece about Python's history and/or an
introduction.  And probably anything else Python related.

If you're interested, please write to Marjorie Richardson
<mlr at ssc.com>, who is coordinating.  Also direct any questions to her.

This is for the June issue which will be on newsstands mid-May and
mailed to subscribers even earlier, I believe.  The deadline is
February 1st (magazine production takes forever!).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin at mems-exchange.org  Wed Dec 15 19:17:53 1999
From: akuchlin at mems-exchange.org (Andrew M. Kuchling)
Date: Wed, 15 Dec 1999 13:17:53 -0500 (EST)
Subject: [Python-Dev] fwd. from Paul Prescod
Message-ID: <14423.56145.877163.395736@amarok.cnri.reston.va.us>

This is a forwarded e-mail from the XML-SIG mailing list, in which
Paul makes some good points.  Some context: I've been arguing against
adding more XML stuff to the base Python distribution, because 1) it's
bloat for those people who don't care about XML, and 2) the Distutils is
supposed to fix this by making installing things easier.  Paul's
response, below, has shaken my conviction a bit (*only* a bit,
though).  If it's deemed valuable, perhaps the XML-SIG could
concentrate on the minimal set of parser + SAX + DOM that could be
included in 1.6.

Please join the XML-SIG to follow the specifics of this thread
further, as it relates only to XML.  As a more general philosophical
question for python-dev: do we want to add things to 1.6 following the
"batteries included" philosophy?  Or should we wave in the direction
of the distutils and say they'll fix the problem?  (In which case they
should be given high priority, as in "1.6 doesn't ship until they're
done".)

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
And after all, why should I go to bed every night? Sleep is only a habit.
    -- Cornelius Van Horne


Paul Prescod writes:
>"Andrew M. Kuchling" wrote:
>> 
>> Huh?  There's obviously a good deal of stuff in there, some of it
>> perhaps too esoteric, but I don't see where there's overlap.  
>
>Well, there are several parsers and parser wrappers. How is a user
>supposed to choose? And there is PyDOM, Minidom and qp_dom.
>
>> Or are
>> you talking about Python tools in general, where there are 3 DOM
>> implementations?  (PyDOM, 4DOM, and ZDOM hiding inside Zope.)
>
>That too.
>
>> I lean against shoveling more stuff into 1.6; better to get the
>> Distutils widely used, which makes it easier to install *all* Python
>> extensions.
>
>I don't think that XML is any more of an "add-on" to a modern scripting
>language than URL support or regular expression support. I'm in the
>"batteries included" camp for this and several other reasons: 
>
>	* standard Python libraries may soon need XML support. If WebDAV takes
>off then there should be a libWebDAV right alongside libftp and libhttp.
>And libWebDAV will require XML
>
>	* there is a difference between theory and practice. In theory,
>distutils will be done soon and everything will be easy. In practice, it
>is the end of 1999 and at every conference I have to install the XML sig
>package on the machines of several people who haven't been able to get
>it going themselves. In practice, we can't wait for distutils because
>people are choosing their XML tools now.
>
>> >Ideally we would have one (or at most two!) implementation of each of
>> >the major specs:
>> >XML, SAX, Unicode, XPath, XPointer, XSLT, DOM
>> 
>> Do you mean "one implementation of each in a single package", or "one
>> implementation existing for Python, distributed separately"?
>
>With the possible exception of XSLT, one implementation of each *in
>Python 1.6*.
>
>> We need to come up with a position paper for developer's day, stating
>> what needs to be discussed.  Suggestions?  I'd propose focusing on
>> getting the XML-SIG package to 1.0, but that's just an idea.
>
>I don't see how the XML-SIG package can ever get to 1.0. Anybody can
>contribute code at anytime and thus far we've been totally flexible
>about putting it in. I think that's great. It just won't ever lead to a
>stable, carefully maintained, tightly interoperable package. Some of the
>maintainers of the individual pieces have probably lost interest and
>there is probably nobody that understands it all enough to integrate it
>nicely.
>
>-- 
> Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
>


From fdrake at acm.org  Wed Dec 15 20:47:01 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 15 Dec 1999 14:47:01 -0500 (EST)
Subject: [Python-Dev] posix module
Message-ID: <14423.61493.90107.433664@weyr.cnri.reston.va.us>

  Ok, I think I'm done with the posix module updates, modulo bugs and
additional symbols for the *conf*() tables.  That leaves us with the
following status for interfaces that Andrew brought up in the message
that started this spate of additions:

Worth adding?
=============
opendir(), readdir(), closedir() -- not added
           The only thing these give us that os.listdir() doesn't is
           the inode numbers.  Unless someone actually wants those,
           it's not worth having.

Worth adding:
=============

abort() -- added

ctermid(), ctermid_r() -- added
            
fpathconf(fd, name) -- added

getlogin() -- added

getgroups(gidsetsize, grouplist) -- added

pathconf(path, name) -- added

sysconf(int name) -- added; also added confstr(int name)

Not worth adding:
=================
clearerr() -- not added

cuserid() -- not added

difftime -- not added

tmpfile(), tmpnam() -- added, also tempnam()

mblen(), mbstowcs(), mbtowc(), wcstombs(),  wctomb() -- not added
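
For orientation, a minimal sketch of calling a few of the added
interfaces from Python.  It assumes a modern Python 3 and today's os
module, whose signatures may differ from the 1999 checkin, and it guards
every call because availability depends on the platform.  The function
name probe_posix_additions and the dictionary keys are invented for
illustration.

```python
import os


def probe_posix_additions():
    """Call a few of the interfaces listed above where the platform has them."""
    results = {}
    if hasattr(os, "sysconf") and "SC_PAGE_SIZE" in os.sysconf_names:
        results["page_size"] = os.sysconf("SC_PAGE_SIZE")
    if hasattr(os, "pathconf") and "PC_NAME_MAX" in os.pathconf_names:
        try:
            results["name_max"] = os.pathconf(".", "PC_NAME_MAX")
        except OSError:
            # Some file systems do not report this limit.
            pass
    if hasattr(os, "getgroups"):
        results["groups"] = os.getgroups()
    if hasattr(os, "ctermid"):
        results["ctermid"] = os.ctermid()
    return results


additions = probe_posix_additions()
for key, value in sorted(additions.items()):
    print(key, value)
```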


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From jeremy at cnri.reston.va.us  Wed Dec 15 20:58:16 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Wed, 15 Dec 1999 14:58:16 -0500 (EST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org>
References: <3856425F.8C5E7A42@interet.com>
	<Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org>
Message-ID: <14423.62168.576273.719577@goon.cnri.reston.va.us>

>>>>> "GS" == Greg Stein <gstein at lyra.org> writes:

  GS> On Tue, 14 Dec 1999, James C. Ahlstrom wrote:
  >> Greg Stein wrote: >
  >> 
  >> > Can you post zipfile.py so that people can start reviewing
  >> that?
  >> 
  >> Yes, it will be available by next Monday.  I just want to get it
  >> really working and pretty, and with documentation.

  GS> My point was that people could possibly use it *before*
  GS> then. Not everybody needs it to be pretty, needs doc, or needs
  GS> it fully working.  Maybe people would like to provide feedback
  GS> on the API. Maybe they'd like to start their own modules that
  GS> use your library.

  GS> This goes back to my years-old statement: release it now rather
  GS> than later -- people can always use it now, and there might not
  GS> be a later.

Ok.  I think we need some kind of zip file support in the core so that
it can be used as a standard distribution format.  I'd be happy if
Jim's zipfile module ended up being it.  We've got some zip code that
we developed at CNRI; it's a bit of a mess, but it might be helpful to
see what we did.  Our code is at ftp://www.python.org/pub/tmp/zip.zip

Jeremy


From jim at interet.com  Thu Dec 16 16:41:56 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 16 Dec 1999 10:41:56 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com>
Message-ID: <38590844.769C3025@interet.com>

Did anyone look at this yet?

   ftp://ftp.interet.com/pub/pylib.html

   ftp://ftp.interet.com/pub/zipfile.py

JimA


From skip at mojam.com  Thu Dec 16 16:46:28 1999
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 16 Dec 1999 09:46:28 -0600 (CST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <38590844.769C3025@interet.com>
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org>
	<3857B97E.3684224F@interet.com>
	<38590844.769C3025@interet.com>
Message-ID: <14425.2388.529932.61119@dolphin.mojam.com>

    JA> Did anyone look at this yet?
    JA>    ftp://ftp.interet.com/pub/pylib.html
    JA>    ftp://ftp.interet.com/pub/zipfile.py

I thought it wasn't supposed to be out until Monday?  You're looking for,
perhaps, a time machine? ;-)

(More seriously, it won't have any effect on my "gotta have this done
yesterday" list, so I will let others comment...)

Skip


From jim at interet.com  Thu Dec 16 18:16:21 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 16 Dec 1999 12:16:21 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com>
Message-ID: <38591E65.4885A39D@interet.com>

"James C. Ahlstrom" wrote:
 
>    ftp://ftp.interet.com/pub/pylib.html

I just changed zipfile.py so that regular zip compression
works.  And if zlib is available,
its crc32() is used instead of the Python version.
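
[A minimal sketch of that fallback, not the actual zipfile.py code.  The
helper names py_crc32 and crc32 are made up for illustration; the sketch
assumes the pure-Python version is the plain bitwise CRC-32 and that
importing zlib may fail.]

```python
try:
    import zlib
except ImportError:  # zlib is an optional extension module
    zlib = None


def py_crc32(data, crc=0):
    """Bitwise CRC-32 (reflected polynomial 0xEDB88320), as used by zip."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xEDB88320
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF


def crc32(data, crc=0):
    """Use zlib's C implementation when present, else the Python one."""
    if zlib is not None:
        return zlib.crc32(data, crc) & 0xFFFFFFFF
    return py_crc32(data, crc)
```

Both paths agree on the standard check value: the CRC-32 of
b"123456789" is 0xCBF43926.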

I should mention that the current code rejects zip files which have
an archive comment added to the end.  Accepting them would require
a search, and I am not sure it is worth it.
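
[For reference, the search would look roughly like this.  A sketch only,
based on the published zip layout rather than on zipfile.py: the end
record is 22 fixed bytes starting with "PK\005\006", followed by a
comment of at most 65535 bytes, so only the tail of the file needs
scanning.  find_end_record is a hypothetical name.]

```python
import io
import struct

EOCD_SIGNATURE = b"PK\005\006"
EOCD_FORMAT = "<4s4H2LH"        # fixed part of the end record
EOCD_SIZE = struct.calcsize(EOCD_FORMAT)   # 22 bytes
MAX_COMMENT = 0xFFFF            # comment length is a 16-bit field


def find_end_record(fp):
    """Return (offset, comment) of the end-of-central-directory record,
    or None if no plausible record is found."""
    fp.seek(0, io.SEEK_END)
    file_size = fp.tell()
    tail_start = max(0, file_size - (EOCD_SIZE + MAX_COMMENT))
    fp.seek(tail_start)
    tail = fp.read()
    pos = tail.rfind(EOCD_SIGNATURE)
    while pos >= 0:
        if pos + EOCD_SIZE <= len(tail):
            fields = struct.unpack(EOCD_FORMAT, tail[pos:pos + EOCD_SIZE])
            comment_len = fields[-1]
            comment_start = pos + EOCD_SIZE
            # Accept only a record whose comment runs exactly to end of file.
            if comment_start + comment_len == len(tail):
                return tail_start + pos, tail[comment_start:]
        # Otherwise keep looking for an earlier candidate signature.
        pos = tail.rfind(EOCD_SIGNATURE, 0, pos)
    return None
```

The cost is one extra read of at most about 64K from the end of the
file, and only when a comment is actually present does the scan go past
the last 22 bytes.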

JimA


From fdrake at acm.org  Thu Dec 16 18:19:23 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 16 Dec 1999 12:19:23 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121
In-Reply-To: <Pine.LNX.4.10.9912151910500.16305-100000@nebula.lyra.org>
References: <199912151831.NAA02685@weyr.cnri.reston.va.us>
	<Pine.LNX.4.10.9912151910500.16305-100000@nebula.lyra.org>
Message-ID: <14425.7963.347400.763562@weyr.cnri.reston.va.us>

[Note that Greg's message went to python-checkins since he responded
to a checkin message, but I suspect he meant to change the header to
point to python-dev.  ;)  If not, too bad!]

Greg Stein writes:
 > But this means that your tables no longer reside in "const" space. Yet More
 > Per-Process Memory...
 > 
 > It would be nice to have those tables marked as "const".

  Perhaps; as Guido points out, there haven't been a lot of complaints 
about this issue.
  I will note that only the tables aren't constant; the strings that
are pointed to are still constant.  I'm inclined to let the compiler/
linker care about this, and not change the code without a really clear 
need to do so.
  Here are the sizes of those tables and the strings they point to
(including terminating null bytes for the strings):

pathconf_names:  14 entries, 112 bytes,  176 string bytes
confstr_names:   25 entries, 200 bytes,  576 string bytes
sysconf_names:  108 entries, 864 bytes, 1774 string bytes

  Figures are for Solaris7.
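
  [A quick check of those figures, assuming 8 bytes per table entry.
That per-entry size is inferred from the numbers above, not stated
anywhere; it would fit a 4-byte name pointer plus a 4-byte value.]

```python
# (entries, table bytes, string bytes) as quoted in the list above
tables = {
    "pathconf_names": (14, 112, 176),
    "confstr_names": (25, 200, 576),
    "sysconf_names": (108, 864, 1774),
}

ENTRY_SIZE = 8  # assumption: bytes per entry, inferred from the figures

for name, (entries, table_bytes, string_bytes) in tables.items():
    # Each table's size is consistent with 8 bytes per entry.
    assert entries * ENTRY_SIZE == table_bytes, name

total_table = sum(t[1] for t in tables.values())
total_strings = sum(t[2] for t in tables.values())
print(total_table, total_strings, total_table + total_strings)
# prints: 1176 2526 3702
```

  So the non-const part under discussion is the 1176 bytes of tables;
the 2526 bytes of strings stay constant either way.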


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From gstein at lyra.org  Thu Dec 16 19:10:14 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 16 Dec 1999 10:10:14 -0800 (PST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121
In-Reply-To: <14425.7963.347400.763562@weyr.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912161006011.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, Fred L. Drake, Jr. wrote:
> [Note that Greg's message went to python-checkins since he responded
> to a checkin message, but I suspect he meant to change the header to
> point to python-dev.  ;)  If not, too bad!]

I didn't really care too much where it went. I would actually suggest that
the Reply-To: on the checkin list is set to python-dev if that is where
replies are Supposed To Go.
[ I do this with mod_dav checkins; replies to dav-checkins mail go to
  dav-dev. ]

> Greg Stein writes:
>  > But this means that your tables no longer reside in "const" space. Yet More
>  > Per-Process Memory...
>  > 
>  > It would be nice to have those tables marked as "const".
> 
>   Perhaps; as Guido points out, there haven't been a lot of complaints 
> about this issue.
>   I will note that only the tables aren't constant; the strings that
> are pointed to are still constant.  I'm inclined to let the compiler/
> linker care about this, and not change the code without a really clear 
> need to do so.
>   Here are the sizes of those tables and the strings they point to
> (including terminating null bytes for the strings):
> 
> pathconf_names:  14 entries, 112 bytes,  176 string bytes
> confstr_names:   25 entries, 200 bytes,  576 string bytes
> sysconf_names:  108 entries, 864 bytes, 1774 string bytes
> 
>   Figures are for Solaris7.

Ah. I just replied to that. Guess that one went to python-checkins :-)

True, this is a small amount of memory. But they start to add up.
non-const globals also pain me when I start to work on free-threading
stuff (each must be examined to see if synchronization is needed), so
reducing the number there is important. Regarding the memory itself: as I
mentioned in the other note, I just want to ensure that Python's working
set remains low (reasons given in that email).

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From skip at mojam.com  Thu Dec 16 19:09:11 1999
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 16 Dec 1999 12:09:11 -0600 (CST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121
In-Reply-To: <Pine.LNX.4.10.9912161003010.16305-100000@nebula.lyra.org>
References: <199912161553.KAA08428@eric.cnri.reston.va.us>
	<Pine.LNX.4.10.9912161003010.16305-100000@nebula.lyra.org>
Message-ID: <14425.10951.169751.843764@dolphin.mojam.com>

>>>>> "Greg" == Greg Stein <gstein at lyra.org> writes:

    Greg> On Thu, 16 Dec 1999, Guido van Rossum wrote:
    >> I don't think there's much of a need to worry about this.  Why are
    >> you always bringing up this subject?  No-one else that I know has
    >> ever had this concern...

    Greg> Somebody has to :-)

    Greg> Keeping the working set low is more efficient from a system
    Greg> standpoint. 

Not to mention the not-all-that-occasional-anymore requests to have Python
on various itty-bitty things like Palm Pilots and WinCE devices.  It's one
thing to add size to modules people can live without for many applications,
but I think the posix module and its other platform-specific relations are
fairly heavily used.  (I realize this specific example isn't likely to apply
to PP/WinCE.)

Skip Montanaro | http://www.mojam.com/
skip at mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...


From gstein at lyra.org  Thu Dec 16 19:21:54 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 16 Dec 1999 10:21:54 -0800 (PST)
Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released
In-Reply-To: <199912161527.KAA08308@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912161011050.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, Guido van Rossum wrote:
>...
> I realize it's just a rant.  In this case (distutils) your advice is
> correct.  (I usually paraphrase it as "release early, release often".)

True. I prefer that phrase, too, but I used it on JimA earlier in the day
or the previous day. I didn't want to sound like a broken record :-). But
that is why I moved into <rant> mode... it seems like the mindset was
spreading :-) I've railed at AMK for it, too :-), when he was talking
about 0.5.1pre1 or whatever, rather than just releasing 0.5.1 and doing an
0.5.2 if there was a problem.

> However there are other situations, like core Python itself, where
> it's really useful to have stable releases -- if only for those users
> who won't touch anything with "beta" in its name.  I still hear from
> people who haven't upgraded to 1.5.2.

But this doesn't explain why there isn't a 1.5.3b1, 1.5.3b2, etc. Or
1.6.0a1 or whatever (maybe "d" or "r" for dev release, as opposed to
alpha).

There are some people who would like the releases rather than using CVS. Some
people can't even use CVS because of firewall issues. Of course, an
alternative is snapshot-tarballs of the CVS repository. But a snapshot
could *really* be broken; something like 1.6.0d1 says "well, it's a
development release, but I've hit a good point between some changes."

> I wonder if perhaps for those cases (where there's a demand for stable
> releases) some other strategy could be used?  Such as labeling
> releases "stable" after the fact?  Or what Linus seems to do with the
> Linux kernel (even = stable, odd = development; or was it the other
> way around?).

Yes: even are stable (e.g. 1.0, 1.2, 2.0, 2.2). The odd numbers are for
development. Linus is currently working 2.3.x, but declared in the past
couple days that things will be wrapping up to move towards 2.4. Once he
thinks it is ready, he'll start off with 2.4.0pre1, pre2, pre3... At some
point the "pre" suffix will drop and 2.4.0 will be released.

You might have a bit of a problem using that mechanism since the current
stable release is 1.5 :-). Once 1.6 hits the street, then you could start
doing 1.9 releases (dev) and shift to 2.0 once it is "stable".
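
[The even/odd rule is easy to state in code.  A sketch, with is_stable
as a hypothetical helper that looks only at the second version component
and ignores suffixes such as "pre":]

```python
def is_stable(version):
    """True if the minor (second) component of a version string is even."""
    minor = int(version.split(".")[1])
    return minor % 2 == 0


for v in ("2.2", "2.3", "2.4.0pre1", "1.5", "1.9", "2.0"):
    print(v, "stable" if is_stable(v) else "development")
```

Under this rule 1.5 comes out as "development", which is exactly the
problem noted above.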

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From paul at prescod.net  Thu Dec 16 19:02:55 1999
From: paul at prescod.net (Paul Prescod)
Date: Thu, 16 Dec 1999 10:02:55 -0800
Subject: [Python-Dev] Re: [XML-SIG] Developer's Day
References: <199912132354.SAA10101@amarok.cnri.reston.va.us>
		<3856A77C.3A4D9F00@prescod.net>
		<14423.49044.143333.790752@amarok.cnri.reston.va.us>
		<3857CEB0.C29C5F24@prescod.net> <14423.57778.131798.776845@amarok.cnri.reston.va.us>
Message-ID: <3859294F.138FF398@prescod.net>

"Andrew M. Kuchling" wrote:
> 
>     * Python revisions come out slowly, once every year or two.  XML
>     standards have been evolving faster, and we don't want to wait
>     until 1.7 for SAX2, or DOM Level2, or other new revisions.
>     Keeping the modules out of the core lets them be updated at their
>     own pace.  A counterargument is that the XML specs are slowing
>     down -- add namespace support to SAX, and finalize DOM
>     Level 2, and I don't think any other standards are very important
>     to basic XML programming.

I agree with your counterargument. :) Anyhow, isn't there a logical
fallacy in your original argument? Why can't we offer a DOM 3 module or
extension after Python ships with DOM 2? 

>     * We really want a C-based parser to be commonly available.
>     sgmlop is the only reasonable choice for this, because I'd be
>     against including Expat.  To replay some arguments I made against
>     including the zlib library in 1.6, what if a C extension requires
>     a newer version of the library?  Symbol conflicts if you're lucky,
>     hard-to-debug problems if you're not.

I don't understand this issue. Why would a C extension build on sgmlop
which is designed to make XML information available to *Python*
programmers?

>     * We can drop various marginal bits of the CVS tree; the xmlarch
>     support is probably not of very wide interest, for example.

How about "expat", "mac", "pyexpat", "utils", "windows". There is just
too much stuff there! And I daresay that a lot of it has not been
"quality controlled" to the level that we would expect if it were a part
of the real Python library. In other words, there is no single place to
go to get only XML-processing software that works well and works
together.

> I think I'm on the record as saying that Python's major problems now
> aren't language-related, but are with the development environment.
> Language changes (from minor, like 'for i in 1..9', to major, like
> fixing the type/class dichotomy or adding static types) aren't going
> to bring in piles of new users, useful though they might be to
> experienced Pythoneers, large projects, or some other specific
> application.

(irrelevant aside: I agree 100% that making things easier to install
will actually improve newbies' experience more than (e.g.) static type
checking but I do not agree that it is a better "sales tool". Most
people are sold based on the language and its libraries before they
start trying to install extensions.)

> If installing things is a problem, then we need to
> buckle down and finish the distutils.  So, overall, I'd still vote
> against inclusion in 1.6.

So are you saying that Python 2 might have only five packages and
everything else must be downloaded? No httplib, no pickle, no random or
math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec?

When people download Python and go to the library documentation that
impressive array of BUILT-IN-FEATURES is part of what sells them on
Python. Hell, I can download all of that stuff for Scheme but what makes
Python beautiful is that I don't have to download it for Python. It's
just there. But if an XML person comes to Python after hearing us rant
about how great it is for processing XML and all they find is
xmllib...they will be underwhelmed.

> No, it's *got* to reach 1.0.  The point of the package is that it's
> exactly *one* thing to install that gives basic XML tools; you don't
> need to chase down the SAX modules from Lars' page, PyExpat from
> ftp.cwi.nl, sgmlop from pythonware.com, and so forth.  If the
> Distutils made it as easy as:
> 
> python fetchpackage.py SAX PyExpat DOM sgmlop
>    <find PySAX's home site>
>    <download it>
>    <compile & install>
>    etc...
> 
> then much of the need for a single package goes away, but, as you
> point out, that isn't currently the case.

I'm a little lost here. We need xmllib to continue because distutils
doesn't do what we need yet but we don't need to put the stuff in the
Python library because distutils will work well enough soon.

But there is an important issue that distutils will not solve. One of the
beautiful things about the Python library is that everything is at the
same version level. When you install it you know that everything works
together or else it WILL in the next patch level if you report the
incompatibility. When the xml package gets versioned incompatibly with
the Python library you don't have that safe feeling. 

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
Three things never trust in: That's the vendor's final bill
The promises your boss makes, and the customer's good will 
http://www.geezjan.org/humor/computers/threes.html


From akuchlin at mems-exchange.org  Thu Dec 16 19:50:48 1999
From: akuchlin at mems-exchange.org (Andrew M. Kuchling)
Date: Thu, 16 Dec 1999 13:50:48 -0500 (EST)
Subject: [Python-Dev] Re: [XML-SIG] Developer's Day
In-Reply-To: <3859294F.138FF398@prescod.net>
References: <199912132354.SAA10101@amarok.cnri.reston.va.us>
	<3856A77C.3A4D9F00@prescod.net>
	<14423.49044.143333.790752@amarok.cnri.reston.va.us>
	<3857CEB0.C29C5F24@prescod.net>
	<14423.57778.131798.776845@amarok.cnri.reston.va.us>
	<3859294F.138FF398@prescod.net>
Message-ID: <14425.13448.737831.460241@amarok.cnri.reston.va.us>

(Responding to the python-dev related portion of this...)

Paul Prescod writes:
>I don't understand this issue. Why would a C extension build on sgmlop
>which is designed to make XML information available to *Python*
>programmers?

No, no; I'm arguing against shipping with Expat; sgmlop good!
Consider this scenario:

	* Python includes Expat 1.0
	* Some C library (for DAV or whatever) uses Expat 1.1
	* Someone writes a Python interface to this C library and
	  attempts to compile it statically.
	* Two versions of Expat in the same binary; symbol conflicts
	  and core dumps, oh my!

>So are you saying that Python 2 might have only five packages and
>everything else must be downloaded? No httplib, no pickle, no random or
>math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec?

I'm not arguing for dropping existing packages; I'm against adding
many more of them.  Existing library modules can stay where they are.
But I wouldn't mind a minimalist Python too much, if it came with a
script fetch-basic-packages:

python fetch-packages.py httplib
python fetch-packages.py imaplib
 ...  200 more lines ...

>I'm a little lost here. We need xmllib to continue because distutils
>doesn't do what we need yet but we don't need to put the stuff in the
>Python library because distutils will work well enough soon.

Basically, yes.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
And now let us hasten to the station. I have commanded the rain to fall at
exactly one-fifteen and I would hate to get my shoes wet.
    -- Lord Lavender, in SEBASTIAN O #2


From bwarsaw at cnri.reston.va.us  Thu Dec 16 19:50:49 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Thu, 16 Dec 1999 13:50:49 -0500 (EST)
Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released
References: <199912161527.KAA08308@eric.cnri.reston.va.us>
	<Pine.LNX.4.10.9912161011050.16305-100000@nebula.lyra.org>
Message-ID: <14425.13449.954026.960703@anthem.cnri.reston.va.us>

    >> I wonder if perhaps for those cases (where there's a demand for
    >> stable releases) some other strategy could be used?  Such as
    >> labeling releases "stable" after the fact?  Or what Linus seems
    >> to do with the Linux kernel (even = stable, odd = development;
    >> or was it the other way around?).

I really dislike the odd/even distinction for exactly this reason.

-Barry


From guido at CNRI.Reston.VA.US  Thu Dec 16 20:02:16 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 16 Dec 1999 14:02:16 -0500
Subject: [Python-Dev] Batteries Included?
Message-ID: <199912161902.OAA11345@eric.cnri.reston.va.us>

I like the batteries included approach, but I also feel resistance
against including stuff I cannot maintain.  The XML code base is a
case in point; I don't understand enough about XML.  (I just read that
xmllib.py is "illegal".  Jeez!  What happened?  Did Congress pass a
law against it?)

I think it may be time for separate Python distributions, like Linux
-- I can concentrate on the core, and keep it really small; others can
make all-encompassing distributions.

There are currently some drawbacks to this approach: non-core modules
have less status; and the documentation process is fundamentally
different for core and non-core modules.  There's also the version
dependency stuff, but I think resolving that is the responsibility of
the distribution makers.

I think the status problem will be gone once there is a respected
distribution -- then you derive status from being in that
distribution, rather than from being in the core distribution.  (Well,
you would still derive status from being in the core, but it would be
much harder to obtain, since I can set a much higher standard.)

The documentation problem is the one that's left.  I think the doc-sig
may be on its way as we speak to solve this, though.  Fred?

This isn't rocket science.  Red Hat Python?  I'm all for it! :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip at mojam.com  Thu Dec 16 20:05:05 1999
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 16 Dec 1999 13:05:05 -0600 (CST)
Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released
In-Reply-To: <14425.13449.954026.960703@anthem.cnri.reston.va.us>
References: <199912161527.KAA08308@eric.cnri.reston.va.us>
	<Pine.LNX.4.10.9912161011050.16305-100000@nebula.lyra.org>
	<14425.13449.954026.960703@anthem.cnri.reston.va.us>
Message-ID: <14425.14305.907618.978628@dolphin.mojam.com>

    >>> Or what Linus seems to do with the Linux kernel (even = stable, odd
    >>> = development; or was it the other way around?).

    BAW> I really dislike the odd/even distinction for exactly this reason.

Its one saving grace is that it is a uniform format.  There are no
"optional" tokens like "pre", "alpha", "beta", etc for the most part.

To remember which way it is, I find it useful to execute "uname -r", check
the second digit, then look down at my shirt for a pocket protector.  The
two pieces of information together work for me.  I currently get
"2.2.13-4mdk" from uname.  I don't even have a pocket, let alone a pocket
protector, so even numbers must be stable releases...

;-)

Skip Montanaro | http://www.mojam.com/
skip at mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...


From fdrake at acm.org  Thu Dec 16 20:05:22 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 16 Dec 1999 14:05:22 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121
In-Reply-To: <14425.10951.169751.843764@dolphin.mojam.com>
References: <199912161553.KAA08428@eric.cnri.reston.va.us>
	<Pine.LNX.4.10.9912161003010.16305-100000@nebula.lyra.org>
	<14425.10951.169751.843764@dolphin.mojam.com>
Message-ID: <14425.14322.355507.500813@weyr.cnri.reston.va.us>

Skip Montanaro writes:
 > fairly heavily used.  (I realize this specific example isn't likely to apply
 > to PP/WinCE.)

  Or any version of Windows, I suspect; perhaps Mark Hammond can
elaborate.  Apparently none of the pathconf() constants are defined
on that platform, at least not as #define constants.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From jcw at equi4.com  Thu Dec 16 20:09:42 1999
From: jcw at equi4.com (Jean-Claude Wippler)
Date: Thu, 16 Dec 1999 20:09:42 +0100
Subject: [Python-Dev] Re: [XML-SIG] Developer's Day
References: <199912132354.SAA10101@amarok.cnri.reston.va.us>
			<3856A77C.3A4D9F00@prescod.net>
			<14423.49044.143333.790752@amarok.cnri.reston.va.us>
			<3857CEB0.C29C5F24@prescod.net> <14423.57778.131798.776845@amarok.cnri.reston.va.us> <3859294F.138FF398@prescod.net>
Message-ID: <385938F6.C4164756@equi4.com>

Paul Prescod wrote:
[...]
> (irrelevant aside: [...] Most people are sold based on the language
> and its libraries before they start trying to install extensions.)
> 
> [AMK]
> > If installing things is a problem, then we need to
> > buckle down and finish the distutils.  So, overall, I'd still vote
> > against inclusion in 1.6.
> 
> So are you saying that Python 2 might have only five packages and
> everything else must be downloaded? No httplib, no pickle, no random
> or math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec?
> 
> When people download Python and go to the library documentation that
> impressive array of BUILT-IN-FEATURES is part of what sells them on
> Python. Hell, I can download all of that stuff for Scheme but what
> makes Python beautiful is that I don't have to download it for Python.
> It's just there. But if an XML person comes to Python after hearing us
> rant about how great it is for processing XML and all they find is
> xmllib...they will be underwhelmed.

(Nodding in agreement)

Could this perhaps be solved with a large batteries-included standard
distribution, plus a real easy/effective way to strip Python down and
wrap things up for deployment?  

In other words, aim for two very distinct goals: everything within easy
reach for development + fully signed-sealed-delivered products.

The first goal can evolve to do fancy net-borne distribution, even if
it is a brittle process, because this is for Python developers.  They
want it all, so open the floodgate to give it all to them.

The second becomes a matter of pruning down and wrapping up.  All the
way down to a single installation-less executable, if possible.

I may well be wrong (and I'm not tracking distutils), but might it not
be simpler to focus on 1) power users + 2) production-grade deployment,
instead of trying to streamline a tangled-web-of-module-dependencies
into a distribution system which tries to meet a wide range of needs?

> [...] One of the beautiful things about the Python library is that
> everything is at the same version level. When you install it you know
> that everything works together or else it WILL in the next patch level
> if you report the incompatibility.  [...]

More nods.  So why not allow the Python distribution to become very
large - with every release moving to a better-tuned combination of all
the different parts (occasional mishaps can quickly be fixed)?

Plus some tools to dist(ut)il(l) a turnkey solution from this big soup.

Sort-of-from-violin-to-quartet-all-the-way-to-symphony-orchestra...

-- Jean-Claude


From gstein at lyra.org  Thu Dec 16 21:02:46 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 16 Dec 1999 12:02:46 -0800 (PST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <38590844.769C3025@interet.com>
Message-ID: <Pine.LNX.4.10.9912161147360.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, James C. Ahlstrom wrote:
> Did anyone look at this yet?
> 
>    ftp://ftp.interet.com/pub/pylib.html
> 
>    ftp://ftp.interet.com/pub/zipfile.py

I went to look for it, but I think that was before you put zipfile up.

Looking at it now...  The writepy() as a method is questionable, I think.
I think it should open the file at instantiation time. I don't see a
reason to allow that to be deferred. Especially given that some of the
methods fail if open() hasn't been called. It would be good to have
symbolic names for the 0 and 8 compression constants, and to fail if 8 is
passed and zlib is not available (otherwise, it doesn't fail until
read/write time, and with a NameError). There should probably be a
__del__ that calls close(). Oh, and a "closed" attribute that can be
checked and an error raised if an operation is done after the file has
been closed. I think dir() should return the contents, rather than print
them. read() and write() ought to fail if the mode is incorrect. Oh, some
symbolic constants for things like "PK\005\006" would be nice.
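
[To make the suggestions above concrete, here is a skeleton showing only
the proposed checks.  It is not the API of zipfile.py: ArchiveSketch,
ZIP_STORED, ZIP_DEFLATED and END_RECORD_SIGNATURE are names chosen for
illustration, and no real zip reading or writing is done.]

```python
try:
    import zlib
except ImportError:
    zlib = None

# Symbolic names for the two compression-method codes (0 and 8).
ZIP_STORED = 0
ZIP_DEFLATED = 8
END_RECORD_SIGNATURE = b"PK\005\006"


class ArchiveSketch:
    """Hypothetical skeleton; it shows only the checks, not zip I/O."""

    def __init__(self, filename, mode="r", compression=ZIP_STORED):
        self.closed = True          # stays True if anything below raises
        if mode not in ("r", "w"):
            raise ValueError("mode must be 'r' or 'w'")
        if compression not in (ZIP_STORED, ZIP_DEFLATED):
            raise ValueError("unknown compression method")
        if compression == ZIP_DEFLATED and zlib is None:
            # Fail now, not later with a NameError at read/write time.
            raise RuntimeError("deflate requested but zlib is not available")
        self.mode = mode
        self.compression = compression
        self.names = []             # member names; real parsing omitted
        self.fp = open(filename, mode + "b")   # opened at instantiation
        self.closed = False

    def _check(self, needed_mode):
        if self.closed:
            raise ValueError("operation on closed archive")
        if self.mode != needed_mode:
            raise ValueError("archive not opened in mode %r" % needed_mode)

    def dir(self):
        """Return the member names rather than printing them."""
        if self.closed:
            raise ValueError("operation on closed archive")
        return list(self.names)

    def read(self, name):
        self._check("r")
        raise NotImplementedError("member extraction omitted in this sketch")

    def write(self, name):
        self._check("w")
        self.names.append(name)     # placeholder for real member writing

    def close(self):
        if not self.closed:
            self.fp.close()
            self.closed = True

    def __del__(self):
        self.close()
```

With this shape, a wrong mode or a missing zlib shows up as an
exception at the call that caused it, and every method can rely on the
file already being open.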

Do you have a ZipImporter written?

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein at lyra.org  Thu Dec 16 21:12:30 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 16 Dec 1999 12:12:30 -0800 (PST)
Subject: [Python-Dev] Re: [XML-SIG] Developer's Day
In-Reply-To: <14425.13448.737831.460241@amarok.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912161210350.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, Andrew M. Kuchling wrote:
> Paul Prescod writes:
> >I don't understand this issue. Why would a C extension build on sgmlop
> >which is designed to make XML information available to *Python*
> >programmers?
> 
> No, no; I'm arguing against shipping with Expat; sgmlop good!
> Consider this scenario:
> 
> 	* Python includes Expat 1.0
> 	* Some C library (for DAV or whatever) uses Expat 1.1
> 	* Someone writes a Python interface to this C library and
> 	  attempts to compile it statically.
> 	* Two versions of Expat in the same binary; symbol conflicts
> 	  and core dumps, oh my!

We should ship pyexpat, not Expat.  (IMO)

> >So are you saying that Python 2 might have only five packages and
> >everything else must be downloaded? No httplib, no pickle, no random or
> >math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec?
> 
> I'm not arguing for dropping existing packages; I'm against adding
> many more of them.  Existing library modules can stay where they are.
> But I wouldn't mind a minimalist Python too much, if it came with a
> script fetch-basic-packages:
> 
> python fetch-packages.py httplib
> python fetch-packages.py imaplib
>  ...  200 more lines ...

Considering that it would probably use HTTP to fetch the packages, I think
you wouldn't be fetching httplib :-)

But yes: I agree with the basic sentiment.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From petrilli at amber.org  Thu Dec 16 21:55:16 1999
From: petrilli at amber.org (Christopher Petrilli)
Date: Thu, 16 Dec 1999 15:55:16 -0500
Subject: [Python-Dev] Batteries Included?
In-Reply-To: <199912161902.OAA11345@eric.cnri.reston.va.us>; from guido@CNRI.Reston.VA.US on Thu, Dec 16, 1999 at 02:02:16PM -0500
References: <199912161902.OAA11345@eric.cnri.reston.va.us>
Message-ID: <19991216155516.A28037@trump.amber.org>

Guido van Rossum [guido at CNRI.Reston.VA.US] wrote:
> I think it may be time for separate Python distributions, like Linux
> -- I can concentrate on the core, and keep it really small; others can
> make all-encompassing distributions.

My fear is what we face in the Zope world---different distributions break
in totally different ways, and sometimes we have to ask 30 questions to figure
out what might be going wrong :/  The nice thing is that if someone installs
Python from the source, we know what's going to happen.  I don't know if
this is solvable, honestly.

> This isn't rocket science.  Red Hat Python?  I'm all for it! :-)

I think Guido just wants to IPO and retire :-)

Chris
-- 
| Christopher Petrilli
| petrilli at amber.org


From gward at cnri.reston.va.us  Thu Dec 16 22:03:26 1999
From: gward at cnri.reston.va.us (Greg Ward)
Date: Thu, 16 Dec 1999 16:03:26 -0500
Subject: [Python-Dev] distutils-sig/python-dev crosstalk
Message-ID: <19991216160325.H4289@cnri.reston.va.us>

Most recent threads on distutils-sig seem to have migrated to python-dev
pretty quickly.  This means that a) there are python-dev people on
distutils-sig (duh), b) they think what goes on there is important
enough to interest the other core developers (good!), and c) they assume
there are people on python-dev who are not also on distutils-sig.

Is this last assumption true?  If you read python-dev, are interested in
distutils issues, but do *not* read distutils-sig, please drop me a
note.  If no one says anything, I will (politely, tentatively) propose
that we keep the distutils threads on distutils-sig and leave python-dev
for, well, core Python development.

If you think that the two are inextricably linked and I might as well
just cross-post everything on distutils-sig to python-dev, let me know
about that too.  ;-)

        Greg
-- 
Greg Ward - software developer                    gward at cnri.reston.va.us
Corporation for National Research Initiatives    
1895 Preston White Drive                           voice: +1-703-620-8990
Reston, Virginia, USA  20191-5434                    fax: +1-703-620-0913


From gstein at lyra.org  Thu Dec 16 22:18:50 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 16 Dec 1999 13:18:50 -0800 (PST)
Subject: [Python-Dev] distutils-sig/python-dev crosstalk
In-Reply-To: <19991216160325.H4289@cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912161316580.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, Greg Ward wrote:
>...
> If you think that the two are inextricably linked and I might as well
> just cross-post everything on distutils-sig to python-dev, let me know
> about that too.  ;-)

:-)  I think distutils is about the mechanics. And it is a large and
sophisticated problem (which is why it has a SIG :-). You could almost view
it as a spinoff of the python-dev grand problem set.

When we get into the question of "what does Python ship with?", then I
think it belongs in python-dev, as that is a discussion of what
constitutes Python itself.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein at lyra.org  Thu Dec 16 22:21:12 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 16 Dec 1999 13:21:12 -0800 (PST)
Subject: [Python-Dev] distutils-sig/python-dev crosstalk
In-Reply-To: <19991216160325.H4289@cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912161318550.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, Greg Ward wrote:
> Most recent threads on distutils-sig seem to have migrated to python-dev
> pretty quickly.  This means that a) there are python-dev people on
> distutils-sig (duh), b) they think what goes on there is important
> enough to interest the other core developers (good!), and c) they assume
> there are people on python-dev who are not also on distutils-sig.

Oh. One more thing.

Actually, what I am somewhat worried about is whether there was relevant
discussion on python-dev that should have been visible to the distutils
people. Not sure if there was, but that is always a potential problem.
Same with the recent xml-sig / python-dev crosstalk. Specifically, Paul
Prescod is not on python-dev, so he may have missed a response or two.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From mal at lemburg.com  Thu Dec 16 22:23:30 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 16 Dec 1999 22:23:30 +0100
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com>
Message-ID: <38595852.E8054741@lemburg.com>

"James C. Ahlstrom" wrote:
> 
> "James C. Ahlstrom" wrote:
> 
> >    ftp://ftp.interet.com/pub/pylib.html
> 
> I just changed zipfile.py so that regular zip compression
> works.  And if zlib is available,
> its crc32() is used instead of the Python version.
> 
> I should mention that the current code rejects zip files which have
> an archive comment added to the end.  Accepting them would require
> a search, and I am not sure it is worth it.

I don't think it is needed for our purposes, but maybe a
subclass could provide it ?

FYI, I've tested the module against mxStack-0.3.0.zip which 
you can find on my Python Pages. It was created using Info-ZIP's
zip 2.2 on Linux.

Unfortunately, I always get the following traceback when trying
to print the directory:

>>> z.open('../projects/distribution/mxStack-0.3.0.zip','rb')
>>> z.dir()
File Name                             Modified             Size
Stack/mxStack/mxStack.h        1999-04-16 10:50:06         4368
Stack/mxStack/mxstdlib.h       1999-04-13 15:37:52         5433
Traceback (innermost last):
  File "<stdin>", line 1, in ?
  File "/home/lemburg/lib/zipfile.py", line 120, in dir
    bytes = self.read(name)     # Just to check CRC-32
  File "/home/lemburg/lib/zipfile.py", line 133, in read
    bytes = zlib.decompress(bytes, -15)
zlib.error: Error -5 while decompressing data

Some notes on the API:
----------------------
* I would find it more convenient if the filename and mode
would be constructor parameters, e.g.

	zfile = zipfile('myfile.zip','rb')

with compression defaulting to 8 rather than 0 (most zip files
will be deflated since this is the ZIP default).

* Also, I would like a method much like the os.listdir()
which returns a list of filenames rather than print it
to stdout.

* .is_zipfile() should probably be a separate function: it
doesn't use any of the class' features.
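
As a rough illustration of the three points above, this is how they look in
the zipfile module that ships in the standard library today. The names
ZipFile, namelist() and is_zipfile() belong to that module and are not taken
from the draft under discussion; the member names are borrowed from the
listing earlier in this message, and the in-memory buffer is only there to
keep the sketch self-contained.

```python
import io
import zipfile

# Build a tiny archive in memory so the example needs no files on disk.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:          # file and mode go to the constructor
    zf.writestr("Stack/mxStack/mxStack.h", "/* header */\n")
    zf.writestr("Stack/mxStack/mxstdlib.h", "/* stdlib */\n")

# A module-level function: no archive object is needed to ask the question.
buf.seek(0)
is_zip = zipfile.is_zipfile(buf)

# Like os.listdir(): the names come back as a list instead of being printed.
with zipfile.ZipFile(buf, "r") as zf:
    names = zf.namelist()

print(is_zip, names)
```

Returning a list leaves the formatting to the caller, which is the main
argument for it over a method that prints.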

More wishes to come ;-)

So far: Great Work !

Aside: I found that you are using undocumented arguments to
zlib.compressobj() ... are these extra arguments left out of
the documentation on purpose or by simple oversight ? I couldn't
find them in the HTML docs and neither in the docstrings.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    15 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From gstein at lyra.org  Thu Dec 16 22:32:09 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 16 Dec 1999 13:32:09 -0800 (PST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <38595852.E8054741@lemburg.com>
Message-ID: <Pine.LNX.4.10.9912161330570.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, M.-A. Lemburg wrote:
>...
> Some notes on the API:
> ----------------------
> * I would find it more convenient if the filename and mode
> would be constructor parameters, e.g.
> 
> 	zfile = zipfile('myfile.zip','rb')
> 
> with compression defaulting to 8 rather than 0 (most zip files
> will be deflated since this is the ZIP default).
> 
> * Also, I would like a method much like the os.listdir()
> which returns a list of filenames rather than print it
> to stdout.

The above two items were in my ramble, just not as clear as MAL :-)

> * .is_zipfile() should probably be a separate function: it
> doesn't use any of the class' features.

Ah! Good call. It is even more important to shift it out if the
constructor now opens a file.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From fdrake at acm.org  Thu Dec 16 22:33:36 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 16 Dec 1999 16:33:36 -0500 (EST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <38595852.E8054741@lemburg.com>
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org>
	<3857B97E.3684224F@interet.com>
	<38590844.769C3025@interet.com>
	<38591E65.4885A39D@interet.com>
	<38595852.E8054741@lemburg.com>
Message-ID: <14425.23216.636687.704436@weyr.cnri.reston.va.us>

M.-A. Lemburg writes:
 > Aside: I found that you are using undocumented arguments to
 > zlib.compressobj() ... are these extra arguments left out of
 > the documentation on purpose or by simple oversight ? I couldn't
 > find them in the HTML docs and neither in the docstrings.

  The documentation is way out of date and Jeremy Hylton and Andrew
Kuchling haven't updated it.  I'm not sure which of them changed the
signatures for that module, but I've pestered Jeremy about it a few
times.
  If anyone would like to update the documentation, I'd certainly
appreciate it.  I don't know the details of those interfaces, and this 
is somewhere where the details are pretty critical.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From bwarsaw at cnri.reston.va.us  Fri Dec 17 00:10:11 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Thu, 16 Dec 1999 18:10:11 -0500 (EST)
Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released
References: <199912161527.KAA08308@eric.cnri.reston.va.us>
	<Pine.LNX.4.10.9912161011050.16305-100000@nebula.lyra.org>
	<14425.13449.954026.960703@anthem.cnri.reston.va.us>
	<14425.14305.907618.978628@dolphin.mojam.com>
Message-ID: <14425.29011.429867.485070@anthem.cnri.reston.va.us>

>>>>> "SM" == Skip Montanaro <skip at mojam.com> writes:

    SM> To remember which way it is, I find it useful to execute
    SM> "uname -r", check the second digit, then look down at my shirt
    SM> for a pocket protector.  The two pieces of information
    SM> together work for me.  I currently get "2.2.13-4mdk" from
    SM> uname.  I don't even have a pocket, let alone a pocket
    SM> protector, so even numbers must be stable releases...

What do you do if it's the second Thursday after the full moon, and
the local hockey team has just skated to a 3-3 tie?

-Barry


From mal at lemburg.com  Thu Dec 16 22:53:36 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 16 Dec 1999 22:53:36 +0100
Subject: [Python-Dev] Batteries Included?
References: <199912161902.OAA11345@eric.cnri.reston.va.us>
Message-ID: <38595F60.7C1B34FF@lemburg.com>

Guido van Rossum wrote:
> 
> I like the batteries included approach, but I also feel resistance
> against including stuff I cannot maintain. 
> ...
> This isn't rocket science.  Red Hat Python?  I'm all for it! :-)

I think we should wait for distutils to get up and running
perfectly for everyone before taking such a step.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    15 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From gstein at lyra.org  Fri Dec 17 09:31:38 1999
From: gstein at lyra.org (Greg Stein)
Date: Fri, 17 Dec 1999 00:31:38 -0800 (PST)
Subject: [Python-Dev] Batteries Included?
In-Reply-To: <38595F60.7C1B34FF@lemburg.com>
Message-ID: <Pine.LNX.4.10.9912170027530.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, M.-A. Lemburg wrote:
> Guido van Rossum wrote:
> > I like the batteries included approach, but I also feel resistance
> > against including stuff I cannot maintain. 

This is an interesting comment, and is similar to the Apache sentiment.
Nothing gets added to the standard distribution unless somebody in the
Group is willing to maintain it. It provides a good mechanism for keeping
the module set to a reasonable size and a set that can/will actually be
maintained.

> > ...
> > This isn't rocket science.  Red Hat Python?  I'm all for it! :-)
> 
> I think we should wait for distutils to get up and running
> perfectly for everyone before taking such a step.

You can also operate on the assumption that it will be done by the time
1.6 is ready to be released. In other words: do the work (distutils and
minimizing the release) in parallel, rather than in sequence.

I would also think that a large distro isn't going to be assembled with
distutils. Somebody will sit down, pull all the components together, and
make a big release.

However, I do see the distutils as being needed for the people who grab
the minimal distro. They need it to grab add'l packages.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From fredrik at pythonware.com  Fri Dec 17 10:06:20 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 17 Dec 1999 10:06:20 +0100
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912161147360.16305-100000@nebula.lyra.org>
Message-ID: <022e01bf486d$fcc901d0$f29b12c2@secret.pythonware.com>

James C. Ahlstrom wrote:
> > Did anyone look at this yet?
> > 
> >    ftp://ftp.interet.com/pub/pylib.html
> > 
> >    ftp://ftp.interet.com/pub/zipfile.py
> 
> I went to look for it, but I think that was before you put zipfile up.

just a few comments (from reading the docs):

-- it would be great if "open" could take an open file
object as well as a file name.

(in this case, you also need to document what you
expect from the underlying file object: read, write,
seek, tell should be enough, right?  haven't looked
at the code -- assuming it works, I'm only interested
in the interface)

-- or you could nuke "open" and pass those arguments
to the constructor instead.

-- I assume "open" adds "b" to the given mode argument.

-- "dir" looks a bit strange.  and hey, there's no "listdir"
in there.  I'd prefer a recursive "listdir" method, which
takes an optional "depth" argument (e.g. 0=this dir,
1=this dir and first subdir, None=infinity, i.e. the full
tree).
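
A minimal sketch of what such a depth-limited listing could look like,
assuming it works on a flat list of archive member names with "/" as the
separator. The function name, its signature and the sample names are
hypothetical and only follow the description above (0=this dir, 1=this dir
and first subdir, None=the full tree).

```python
def listdir(names, depth=None):
    """Return the member names that sit at most `depth` directories deep."""
    if depth is None:
        # None means "infinity": return the full tree.
        return list(names)
    result = []
    for name in names:
        # A trailing "/" marks a directory entry; it is not a nesting level.
        level = name.rstrip("/").count("/")
        if level <= depth:
            result.append(name)
    return result


names = ["README", "Stack/", "Stack/setup.py", "Stack/mxStack/mxStack.h"]
print(listdir(names, 0))
print(listdir(names, 1))
print(listdir(names))
```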

that's all for now.

</F>


From fredrik at pythonware.com  Fri Dec 17 13:21:03 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 17 Dec 1999 13:21:03 +0100
Subject: [Python-Dev] posix module
References: <14423.61493.90107.433664@weyr.cnri.reston.va.us>
Message-ID: <036d01bf4889$30b63470$f29b12c2@secret.pythonware.com>

> Ok, I think I'm done with the posix module updates, modulo bugs and
> additional symbols for the *conf*() tables.

gcc  -g -O2 -I./../Include -I.. -DHAVE_CONFIG_H -c ./posixmodule.c
./posixmodule.c:3789: `_SC_AIO_LIST_MAX' undeclared here (not in a function)
./posixmodule.c:3789: initializer element for `posix_constants_sysconf[10].value' is not constant
make[1]: *** [posixmodule.o] Error 1
make[1]: Leaving directory `/data/repository/BleedingEdge/python/dist/src/Modules'

(current CVS stuff, on Red Hat 5.2)

</F>


From jim at interet.com  Fri Dec 17 15:33:31 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 17 Dec 1999 09:33:31 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912161147360.16305-100000@nebula.lyra.org>
Message-ID: <385A49BB.4D064240@interet.com>

Greg Stein wrote:
> 
> On Thu, 16 Dec 1999, James C. Ahlstrom wrote:
> > Did anyone look at this yet?
> >
> >    ftp://ftp.interet.com/pub/pylib.html
> >
> >    ftp://ftp.interet.com/pub/zipfile.py
> 
> Looking at it now...  The writepy() as a method is questionable, I think.
> I think it should open the file at instantiation time. I don't see a
> reason to allow that to be deferred. Especially given that some of the
> methods fail if open() hasn't been called.

I eliminated open and added its args to the constructor.

> It would be good to have
> symbolic names for the 0 and 8 compression constants, and to fail if 8 is
> passed and zlib is not available (otherwise, it doesn't fail until
> read/write time, and with a NameError). There should probably be a
> __del__ that calls close(). Oh, and a "closed" attribute that can be
> checked and an error raised if an operation is done after the file has
> been closed.

All done.

> I think dir() should return the contents, rather than print
> them.

I added listdir() and documented self.TOC.  I kept printdir()
as example code.

> read() and write() ought to fail if the mode is incorrect. Oh, some
> symbolic constants for things like "PK\005\006" would be nice.

All done.
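
One possible shape for the symbolic constants and the early zlib check, as
a sketch only. The names ZIP_STORED, ZIP_DEFLATED, END_OF_CENTRAL_DIR and
check_compression() are illustrative assumptions and not necessarily what
the updated zipfile.py uses.

```python
try:
    import zlib
except ImportError:          # zlib is an optional extension module
    zlib = None

# Symbolic names for the two compression methods discussed above.
ZIP_STORED = 0
ZIP_DEFLATED = 8

# Symbolic name for the "PK\005\006" magic string.
END_OF_CENTRAL_DIR = b"PK\005\006"


def check_compression(compression):
    """Fail when the archive is opened, not later at read/write time."""
    if compression == ZIP_STORED:
        return
    if compression == ZIP_DEFLATED:
        if zlib is None:
            raise RuntimeError("compression method 8 requires the zlib module")
        return
    raise ValueError("unsupported compression method: %r" % (compression,))


check_compression(ZIP_STORED)    # always accepted
```

Calling the check from the constructor means a missing zlib shows up as a
clear error there, rather than as a NameError deep inside read() or write().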

JimA


From guido at CNRI.Reston.VA.US  Fri Dec 17 15:43:23 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 17 Dec 1999 09:43:23 -0500
Subject: [Python-Dev] Batteries Included?
In-Reply-To: Your message of "Thu, 16 Dec 1999 22:53:36 +0100."
             <38595F60.7C1B34FF@lemburg.com> 
References: <199912161902.OAA11345@eric.cnri.reston.va.us>  
            <38595F60.7C1B34FF@lemburg.com> 
Message-ID: <199912171443.JAA12414@eric.cnri.reston.va.us>

> Guido van Rossum wrote:
> > 
> > I like the batteries included approach, but I also feel resistance
> > against including stuff I cannot maintain. 
> > ...
> > This isn't rocket science.  Red Hat Python?  I'm all for it! :-)

MAL:
> I think we should wait for distutils to get up and running
> perfectly for everyone before taking such a step.

Fair enough -- but in the mean time, no more pushing for new modules
in the core distribution (distutils excluded).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gward at cnri.reston.va.us  Fri Dec 17 15:59:09 1999
From: gward at cnri.reston.va.us (Greg Ward)
Date: Fri, 17 Dec 1999 09:59:09 -0500
Subject: [Python-Dev] Batteries Included?
In-Reply-To: <199912171443.JAA12414@eric.cnri.reston.va.us>; from guido@cnri.reston.va.us on Fri, Dec 17, 1999 at 09:43:23AM -0500
References: <199912161902.OAA11345@eric.cnri.reston.va.us> <38595F60.7C1B34FF@lemburg.com> <199912171443.JAA12414@eric.cnri.reston.va.us>
Message-ID: <19991217095908.B8799@cnri.reston.va.us>

On 17 December 1999, Guido van Rossum said:
> Fair enough -- but in the mean time, no more pushing for new modules
> in the core distribution (distutils excluded).

So anyone who wants a new module snuck into the core just has to
convince me to add it to the distutils package, right?  >snicker<

        Greg


From jeremy at cnri.reston.va.us  Fri Dec 17 19:30:37 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Fri, 17 Dec 1999 13:30:37 -0500 (EST)
Subject: [Python-Dev] Batteries Included?
In-Reply-To: <199912171443.JAA12414@eric.cnri.reston.va.us>
References: <199912161902.OAA11345@eric.cnri.reston.va.us>
	<38595F60.7C1B34FF@lemburg.com>
	<199912171443.JAA12414@eric.cnri.reston.va.us>
Message-ID: <14426.33101.757523.853781@goon.cnri.reston.va.us>

>>>>> "GvR" == Guido van Rossum <guido at CNRI.Reston.VA.US> writes:

  >> Guido van Rossum wrote:  I like the batteries included
  >> approach, but I also feel resistance  against including stuff I
  >> cannot maintain.   ...   This isn't rocket science.  Red Hat
  >> Python?  I'm all for it! :-)

  >> MAL wrote:
  >> I think we should wait for distutils to get up and running
  >> perfectly for everyone before taking such a step.

  GvR> Fair enough -- but in the mean time, no more pushing for new
  GvR> modules in the core distribution (distutils excluded).

Perhaps the right long-term solution (post-distutils) is to split
Python into a core architected by Guido and a bazaar-style standard
library maintained in a more apache-style.

Jeremy


From jim at interet.com  Fri Dec 17 16:25:10 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 17 Dec 1999 10:25:10 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com>
Message-ID: <385A55D6.A8A05EB9@interet.com>

"M.-A. Lemburg" wrote:

> Unfortunately, I always get the following traceback when trying
> to print the directory:

OK, I changed the decompress code (10:23 AM), please re-try.

> with compression defaulting to 8 rather than 0 (most zip files
> will be deflated since this is the ZIP default).

The compress mode only applies to writing.  On read, the
method recorded in the file controls.
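
The same behaviour can be seen in the zipfile module found in today's
standard library, which is used here purely as an illustration (ZipFile,
getinfo() and compress_type are that module's names, not the draft's). The
compression argument is given only when writing; on read, each entry
reports the method that was recorded for it.

```python
import io
import zipfile

buf = io.BytesIO()

# Writing: the compression argument selects how members are stored.
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("notes.txt", "spam " * 200)

# Reading: no compression argument; the method stored in the entry is used.
with zipfile.ZipFile(buf, "r") as zf:
    info = zf.getinfo("notes.txt")
    method = info.compress_type
    text = zf.read("notes.txt").decode("ascii")

print(method, info.file_size, info.compress_size)
```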

JimA


From jim at interet.com  Fri Dec 17 15:49:20 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 17 Dec 1999 09:49:20 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912161147360.16305-100000@nebula.lyra.org> <022e01bf486d$fcc901d0$f29b12c2@secret.pythonware.com>
Message-ID: <385A4D70.A162C584@interet.com>

Fredrik Lundh wrote:
> 
> James C. Ahlstrom wrote:
> > >
> > >    ftp://ftp.interet.com/pub/pylib.html

> -- it would be great if "open" could take an open file
> object as well as a file name.

I put these arguments into the constructor now.

> (in this case, you also need to document what you
> expect from the underlying file object: read, write,
> seek, tell should be enough, right?  haven't looked
> at the code -- assuming it works, I'm only interested
> in the interface)

OK, docs updated.

> -- I assume "open" adds "b" to the given mode argument.

Correct.  The mode can be either "w" or "wb" etc., and it works.

> -- "dir" looks a bit strange.  and hey, there's no "listdir"
> in there.  I'd prefer a recursive "listdir" method, which
> takes an optional "depth" argument (e.g. 0=this dir,
> 1=this dir and first subdir, None=infinity, i.e. the full
> tree).

I added a plain listdir() and changed dir() to printdir().  I also
documented self.TOC which gets you the values too.

JimA


From jim at interet.com  Fri Dec 17 15:39:51 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 17 Dec 1999 09:39:51 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com>
Message-ID: <385A4B37.333B9443@interet.com>

"M.-A. Lemburg" wrote:
> 
> "James C. Ahlstrom" wrote:
> > >    ftp://ftp.interet.com/pub/pylib.html
> >

> Unfortunately, I always get the following traceback when trying
> to print the directory:

Yes, compression isn't there yet.  I am looking into it.
 
> Some notes on the API:
> ----------------------
> * I would find it more convenient if the filename and mode
> would be constructor parameters, e.g.
> 
>         zfile = zipfile('myfile.zip','rb')

OK, done.
 
> with compression defaulting to 8 rather than 0 (most zip files
> will be deflated since this is the ZIP default).

Until compression works, and zlib ships with Python I
would rather default to no compression (method 0).  Otherwise
this is not useful as a Python import archive.
 
> * Also, I would like a method much like the os.listdir()
> which returns a list of filenames rather than print it
> to stdout.

OK, done.
 
> * .is_zipfile() should probably be a separate function: it
> doesn't use any of the class' features.

OK, done.
  
> Aside: I found that you are using undocumented arguments to
> zlib.compressobj() ... are these extra arguments left out of
> the documentation on purpose or by simple oversight ? I couldn't
> find them in the HTML docs and neither in the docstrings.

I am following the CNRI code blindly here.  I don't have
docs either.

JimA


From jack at oratrix.nl  Fri Dec 17 23:54:03 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Fri, 17 Dec 1999 23:54:03 +0100
Subject: [Python-Dev] Batteries Included? 
In-Reply-To: Message by Jeremy Hylton <jeremy@cnri.reston.va.us> ,
	     Fri, 17 Dec 1999 13:30:37 -0500 (EST) , <14426.33101.757523.853781@goon.cnri.reston.va.us> 
Message-ID: <19991217225408.D9AF3EDD20@oratrix.oratrix.nl>

Recently, Jeremy Hylton <jeremy at cnri.reston.va.us> said:
> Perhaps the right long-term solution (post-distutils) is to split
> Python into a core architected by Guido and a bazaar-style standard
> library maintained in a more apache-style.

I can't help feeling uncomfortable with this. I've had quite some work 
to get an Apache with SSL up and running, even though someone gave me
quite precise instructions. With Perl I fared even worse, despite
their distutils-like package, when I wanted to try a PalmPilot package 
for Unix that needed Perl. I finally had to give up after quite some
effort because the addon installers kept finding the older version of
Perl that the system mgr had installed instead of my newer version.

I think distutils will be wonderful for us, the Python community, but
something more RedHattish is needed for the general world who just want 
Python plus a certain set of extensions because some application needs 
it, so they can just download a fresh copy of ParrotPython 3.4.4 and
know the application will work, without interfering with another
application that happens to use Inquisition 1a5 and lives elsewhere on 
the disk.

And maybe the answer is a much simpler freezing process, like
MacPython BuildApplication where any Python user can drop a script on
it and end up with a fully self-contained app guaranteed (well.... No
reports to the contrary have been heard so far, at least:-) to contain
everything needed and not interfere with an existing MacPython
installation (or be interfered with by it). Then a popular app will
have prebuilt binaries available for all platforms quickly, made by
the Python community, and the enduser interested in the app but not in 
Python can simply download that.

--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From mal at lemburg.com  Sat Dec 18 14:17:52 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 18 Dec 1999 14:17:52 +0100
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A4B37.333B9443@interet.com>
Message-ID: <385B8980.11CDE9AC@lemburg.com>

"James C. Ahlstrom" wrote:
> 
> "M.-A. Lemburg" wrote:
> >
> > "James C. Ahlstrom" wrote:
> > > >    ftp://ftp.interet.com/pub/pylib.html
> > >
> 
> > Unfortunately, I always get the following traceback when trying
> > to print the directory:
> 
> Yes, compression isn't there yet.  I am looking into it.

Great :-)
 
> > Some notes on the API:
> > ----------------------
> > * I would find it more convenient if the filename and mode
> > would be constructor parameters, e.g.
> >
> >         zfile = zipfile('myfile.zip','rb')
> 
> OK, done.
> 
> > with compression defaulting to 8 rather than 0 (most zip files
> > will be deflated since this is the ZIP default).
> 
> Until compression works, and zlib ships with Python I
> would rather default to no compression (method 0).  Otherwise
> this is not useful as a Python import archive.

Point taken.

Perhaps it would be even better to not have a
default at all: that way people will have to think about the
issue *before* implementing it, rather than debug code
that produces tracebacks.

> > * Also, I would like a method much like the os.listdir()
> > which returns a list of filenames rather than print it
> > to stdout.
> 
> OK, done.
> 
> > * .is_zipfile() should probably be a separate function: it
> > doesn't use any of the class' features.
> 
> OK, done.

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    13 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal at lemburg.com  Sat Dec 18 16:16:44 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 18 Dec 1999 16:16:44 +0100
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com>
Message-ID: <385BA55C.9DFCA88D@lemburg.com>

"James C. Ahlstrom" wrote:
> 
> "M.-A. Lemburg" wrote:
> 
> > Unfortunately, I always get the following traceback when trying
> > to print the directory:
> 
> OK, I changed the decompress code (10:23 AM), please re-try.

Everything is fine now... it's really impressive how easily
you can manipulate ZIP files with it.

One thing I'd suggest is to include some way to delete and
update contents, e.g. the write() method should overwrite
any existing entry in the archive (if it does not already --
I haven't tested it, just read the code and it seems to raise
an exception), plus maybe a .remove() method which deletes
an entry.
 
> > with compression defaulting to 8 rather than 0 (most zip files
> > will be deflated since this is the ZIP default).
> 
> The compress mode only applies to writing.  On read, the
> method recorded in the file controls.

True. How about making the compression argument mandatory
for files opened in 'wb' mode only ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    13 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From da at ski.org  Sat Dec 18 18:35:00 1999
From: da at ski.org (David Ascher)
Date: Sat, 18 Dec 1999 09:35:00 -0800
Subject: [Python-Dev] Year 2000 O'Reilly Python Conference
Message-ID: <003501bf497e$368f6f60$e655cfc0@ski.org>

I just got off the phone with someone at O'Reilly, who is starting to plan
the next O'Reilly Open Source Convention.  I've agreed to be the chair of
the Python conference, just so that there are no delays in getting the
conference organized.  If someone feels that I should not be chair, speak
now and we can figure out who takes the 'job'.

There are short-term and long-term issues to discuss:

Short term:

- We need a program committee -- If you're interested in being on said
committee or know someone who should be, let me know. I'd like to get
representatives from various subconstituencies on there (web types, zope
types, business types, scientist types, linux types, hackers, etc.)

- The call for papers is going on the O'Reilly website soon.  I will try and
get them to pass things by me first, but if we want to emphasize specific
kinds of paper submissions, we need to decide that soon.

- Greg or Barry, is it possible for one of you to setup a mailman mailing
list which will be used by the program committee?  eGroups is easy for me to
setup, but lots of people hated it last year.  I don't want to pollute
python-dev with conference discussions.

Longer term:

- The schedule for the conference is (supposedly) going to be the same as
last year: conference-wide keynotes at the beginning of both days, and
4 x 90-minute segments.

- We have two parallel tracks

- We have 4 half-day tutorial slots

- All of the paper materials have to be 'in' by March 1.  We need to decide
how much time we need to go through the review/revision process ourselves.
In other words, the deadline for submissions is up to us, but we don't have
that much time.

--david ascher


From jeremy at cnri.reston.va.us  Sat Dec 18 23:39:58 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Sat, 18 Dec 1999 17:39:58 -0500 (EST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <385A4B37.333B9443@interet.com>
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org>
	<3857B97E.3684224F@interet.com>
	<38590844.769C3025@interet.com>
	<38591E65.4885A39D@interet.com>
	<38595852.E8054741@lemburg.com>
	<385A4B37.333B9443@interet.com>
Message-ID: <14428.3390.671438.663889@bitdiddle.cnri.reston.va.us>

>>>>> "JCA" == James C Ahlstrom <jim at interet.com> writes:

  >> Aside: I found that you are using undocumented arguments to
  >> zlib.compressobj() ... are these extra arguments left out of the
  >> documentation on purpose or by simple oversight ? I couldn't find
  >> them in the HTML docs and neither in the docstrings.

  JCA> I am following the CNRI code blindly here.  I don't have docs
  JCA> either.

The docs for the zlib module are quite out of date, although I think
the docstrings may be better (not necessarily completely up-to-date
though :-).  The specific parameters to pass to zlib don't seem to be
documented anywhere either; IIRC I dug them out of some example C code
somewhere that used zlib to read Zip files.
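
The traceback earlier in this thread shows zlib.decompress(bytes, -15). A
negative window size tells zlib to work with a raw deflate stream that has
no zlib header or trailer, which is the form ZIP uses for deflated members.
A minimal sketch, assuming the matching compressobj() call takes the same
window size (the draft module's exact call is not quoted in this thread):

```python
import zlib

data = b"Stack/mxStack/mxStack.h " * 50

# compressobj(level, method, wbits): wbits=-15 asks for a raw deflate stream.
compressor = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -15)
raw = compressor.compress(data) + compressor.flush()

# The same negative window size is needed to read the stream back.
restored = zlib.decompress(raw, -15)

# ZIP stores a CRC-32 per member; zlib.crc32() computes it. The mask keeps
# the value in the unsigned 32-bit range.
checksum = zlib.crc32(data) & 0xFFFFFFFF

print(len(data), len(raw), restored == data)
```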

Jeremy


From gstein at lyra.org  Sun Dec 19 00:14:02 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 18 Dec 1999 15:14:02 -0800 (PST)
Subject: [Python-Dev] Year 2000 O'Reilly Python Conference
In-Reply-To: <003501bf497e$368f6f60$e655cfc0@ski.org>
Message-ID: <Pine.LNX.4.10.9912181513020.16305-100000@nebula.lyra.org>

On Sat, 18 Dec 1999, David Ascher wrote:
>...
> - Greg or Barry, is it possible for one of you to setup a mailman mailing
> list which will be used by the program committee?  eGroups is easy for me to
> setup, but lots of people hated it last year.  I don't want to pollute
> python-dev with conference discussions.

Done. ora-pc at pythonpros.com.
http://mailman.pythonpros.com/mailman/listinfo/ora-pc

I also removed the old monterey-speakers mailing list :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From da at ski.org  Sun Dec 19 08:24:51 1999
From: da at ski.org (David Ascher)
Date: Sat, 18 Dec 1999 23:24:51 -0800
Subject: [Python-Dev] Year 2000 O'Reilly Python Conference
References: <Pine.LNX.4.10.9912181513020.16305-100000@nebula.lyra.org>
Message-ID: <013301bf49f2$243946f0$df55cfc0@ski.org>

From: Greg Stein <gstein at lyra.org>
> On Sat, 18 Dec 1999, David Ascher wrote:
> >...
> > - Greg or Barry, is it possible for one of you to setup a mailman mailing
> > list which will be used by the program committee?

> Done. ora-pc at pythonpros.com.
> http://mailman.pythonpros.com/mailman/listinfo/ora-pc

Thanks, Greg.

Now, folks, please consider joining the program committee.  We need a few
volunteers - not too many, but somewhere between 5 and 10 would be good.
You don't even have to commit to making it to the conference, if that's a
concern.

-- david


From jim at interet.com  Mon Dec 20 15:18:17 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 20 Dec 1999 09:18:17 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912161147360.16305-100000@nebula.lyra.org>
Message-ID: <385E3AA9.162BE568@interet.com>

Greg Stein wrote:

> Do you have a ZipImporter written?

Yes, it is ftp://ftp.interet.com/pub/importer.py

JimA


From jim at interet.com  Mon Dec 20 15:35:58 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 20 Dec 1999 09:35:58 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com>
Message-ID: <385E3ECE.F8DCDE28@interet.com>

"M.-A. Lemburg" wrote:

> One thing I'd suggest is to include some way to delete and
> update contents, e.g. the write() method should overwrite
> any existing entry in the archive (if it does not already --
> I haven't tested it, just read the code and it seems to raise
> an exception), plus maybe a .remove() method which deletes
> an entry.

Currently, adding a file requires the "a" append mode, while
the "w" mode re-writes the file.  Adding a duplicate file name
produces an error message.  I can change this,
but removing a file would either waste space, or else the file
contents must be copied over the old file and all the offsets
updated.  I don't like this because it is complicated, and I think
it is fast enough to just re-write the archive.  But it
could be added if people want.
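
The "just re-write the archive" approach can be sketched as follows. This
is only an illustration written against the zipfile module as it exists in
today's standard library, which may differ from the draft under discussion
here; the helper name rewrite_without is hypothetical.

```python
import io
import zipfile


def rewrite_without(src_bytes, name_to_drop):
    """Return new archive bytes holding every entry except name_to_drop."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(src_bytes)) as src, \
         zipfile.ZipFile(out, "w") as dst:
        for info in src.infolist():
            if info.filename != name_to_drop:
                # Copy the entry, reusing its original metadata.
                dst.writestr(info, src.read(info.filename))
    return out.getvalue()


# Build a small two-entry archive in memory, then drop one entry.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "alpha")
    zf.writestr("b.txt", "beta")

smaller = rewrite_without(buf.getvalue(), "a.txt")
with zipfile.ZipFile(io.BytesIO(smaller)) as zf:
    remaining = zf.namelist()
```

No offsets need patching in place: the new archive gets a fresh central
directory, at the cost of copying every surviving entry.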

> True. How about making the compression argument mandatory
> for files opened in 'wb' mode only ?

The default of zero provides a little guidance that you should
use zero.  I added a warning message if 8 is used which should
discourage people from using 8.  Or I could disallow 8.
Is that OK?

JimA


From jim at interet.com  Mon Dec 20 16:34:02 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 20 Dec 1999 10:34:02 -0500
Subject: [Python-Dev] Batteries Included?
References: <19991217225408.D9AF3EDD20@oratrix.oratrix.nl>
Message-ID: <385E4C6A.BEC0F728@interet.com>

Jack Jansen wrote:

> And maybe the answer is a much simpler freezing process, like
> MacPython BuildApplication where any Python user can drop a script on
> it and end up with a fully self-contained app guaranteed (well.... No
> reports to the contrary have been heard so far, at least:-) to contain
> everything needed and not interfere with an existing MacPython
> installation (or be interfered with by it). Then a popular app will
> have prebuilt binaries available for all platforms quickly, made by
> the Python community, and the enduser interested in the app but not in
> Python can simply download that.

IMHO the "much simpler freezing process" is archive files.  A simple
script can build them, imputil can import them, and the only
remaining problem is to find them.  Please see:

ftp://ftp.interet.com/pub/bootmodule.html
ftp://ftp.interet.com/pub/pylib.html

JimA


From jack at oratrix.nl  Mon Dec 20 17:50:32 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Mon, 20 Dec 1999 17:50:32 +0100
Subject: [Python-Dev] Batteries Included? 
In-Reply-To: Message by "James C. Ahlstrom" <jim@interet.com> ,
	     Mon, 20 Dec 1999 10:34:02 -0500 , <385E4C6A.BEC0F728@interet.com> 
Message-ID: <19991220165032.B9E8B370CF2@snelboot.oratrix.nl>

> IMHO the "much simpler freezing process" is archive files.  A simple
> script can build them, imputil can import them, and the only
> remaining problem is to find them.  Please see:

Archive files solve the problem for Python modules. But that leaves the 
problem of dynamically loaded modules. And resources for dialogs and such, if 
you use native GUI stuff on Mac or Windows.

And most serious applications that I've seen (GRiNS and Zope, to name two, 
Mailman is the only exception I can think of) depend on non-standard plugin 
modules.
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From mal at lemburg.com  Mon Dec 20 15:44:42 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 20 Dec 1999 15:44:42 +0100
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com>
Message-ID: <385E40DA.37AD704F@lemburg.com>

"James C. Ahlstrom" wrote:
> 
> "M.-A. Lemburg" wrote:
> 
> > One thing I'd suggest is to include some way to delete and
> > update contents, e.g. the write() method should overwrite
> > any existing entry in the archive (if it does not already --
> > I haven't tested it, just read the code and it seems to raise
> > an exception), plus maybe a .remove() method which deletes
> > an entry.
> 
> Currently, adding a file requires the "a" append mode, while
> the "w" mode re-writes the file.  Adding a duplicate file name
> produces an error message.  I can change this,
> but removing a file would either waste space, or else the file
> contents must be copied over the old file and all the offsets
> updated.  I don't like this because it is complicated, and I think
> it is fast enough to just re-write the archive.  But it
> could be added if people want.

I guess it would be ok to waste space. You could provide
a .cleanup() or .rewrite() method that takes care of
reorganizing the file to fill up the gaps.
 
> > True. How about making the compression argument mandatory
> > for files opened in 'wb' mode only ?
> 
> The default of zero provides a little guidance that you should
> use zero.  I added a warning message if 8 is used which should
> discourage people from using 8.  Or I could disallow 8.
> Is that OK?

Well the module seems to work just fine with compression
on, so disallowing it or issuing a warning would reduce its value,
IMHO. How about making compression a boolean value and then
converting any true value to 8 ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    11 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From fdrake at acm.org  Mon Dec 20 19:52:41 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 20 Dec 1999 13:52:41 -0500 (EST)
Subject: [Python-Dev] posix module
In-Reply-To: <036d01bf4889$30b63470$f29b12c2@secret.pythonware.com>
References: <14423.61493.90107.433664@weyr.cnri.reston.va.us>
	<036d01bf4889$30b63470$f29b12c2@secret.pythonware.com>
Message-ID: <14430.31481.402469.896400@weyr.cnri.reston.va.us>

Fredrik Lundh writes:
 > (current CVS stuff, on Red Hat 5.2)

  Ok, Guido figured it out; this is a typo in the header
/usr/include/confname.h; the enum and the #define don't have the same
name.
  Do you know a way to detect the Linux kernel version using
preprocessor macros?  (Seems very fragile.)  Would it be
reasonable to only add that table entry for kernel versions >= 2.2?


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From jim at interet.com  Mon Dec 20 20:25:27 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 20 Dec 1999 14:25:27 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com> <385E40DA.37AD704F@lemburg.com>
Message-ID: <385E82A7.72345807@interet.com>

"M.-A. Lemburg" wrote:

> I guess it would be ok to waste space. You could provide
> a .cleanup() or .rewrite() method that takes care of
> reorganizing the file to fill up the gaps.

OK, adding a duplicate name replaces the old file.

> Well the module seems to work just fine with compression
> on, so disallowing it or issuing a warning would reduce its value,
> IMHO.

Yes compression works, but 90% of Python installations don't have
zlib, so it is an ERROR to create archives with compression when
these archives are distributed to other sites.

> How about making compression a boolean value and then
> converting any true value to 8 ?

It would close the door to future or other compression methods.
Currently the method must be 0 or 8 or a traceback will result.
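
For reference, the numbers 0 and 8 are the ZIP format's own method codes
(stored and deflate). A small sketch, assuming the zipfile module as found
in today's standard library rather than the draft discussed in this
thread, shows how an integer method leaves room for other values:

```python
import io
import zipfile

# The module-level constants mirror the ZIP format's method numbers.
methods = {"stored": zipfile.ZIP_STORED, "deflated": zipfile.ZIP_DEFLATED}

# Write one uncompressed entry and read back the method recorded for it.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("plain.txt", "x" * 1000)

with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    stored_type = zf.getinfo("plain.txt").compress_type
```

A boolean flag could only ever select between these two; an integer can
name any method the format defines.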

JimA


From jim at interet.com  Mon Dec 20 20:33:11 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 20 Dec 1999 14:33:11 -0500
Subject: [Python-Dev] Batteries Included?
References: <19991220165032.B9E8B370CF2@snelboot.oratrix.nl>
Message-ID: <385E8477.F727E0F8@interet.com>

Jack Jansen wrote:

> Archive files solve the problem for Python modules. But that leaves the
> problem of dynamically loaded modules. And resources for dialogs and such, if
> you use native GUI stuff on Mac or Windows.

Point taken.

For dynamically loaded modules, I believe in following the
native system's DLL path, and not adding eccentric Python
logic.  But many disagreed a couple of weeks ago when I raised this.

For resources, I think the archive file can accommodate this,
although it seems highly system dependent.

Anyway, any file at all can live in the archive and the import
mechanism for *.pyc will not be damaged nor unduly slowed down
by its presence.

JimA


From gstein at lyra.org  Mon Dec 20 21:11:50 1999
From: gstein at lyra.org (Greg Stein)
Date: Mon, 20 Dec 1999 12:11:50 -0800 (PST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <385E82A7.72345807@interet.com>
Message-ID: <Pine.LNX.4.10.9912201208290.16305-100000@nebula.lyra.org>

On Mon, 20 Dec 1999, James C. Ahlstrom wrote:
> "M.-A. Lemburg" wrote:
> > I guess it would be ok to waste space. You could provide
> > a .cleanup() or .rewrite() method that takes care of
> > reorganizing the file to fill up the gaps.
> 
> OK, adding a duplicate name replaces the old file.

But it shouldn't print a warning(!). If an application wants to replace a
file, then stuff shouldn't appear on stdout as a result.

> > Well the module seems to work just fine with compression
> > on, so disallowing it or issuing a warning would reduce its value,
> > IMHO.
> 
> Yes compression works, but 90% of Python installations don't have
> zlib, so it is an ERROR to create archives with compression when
> these archives are distributed to other sites.

While it may be a problem to distribute them to other sites, that is not up
to the library. If I want compression, then I should get compression. A
library module should not determine application-level policy.

The warning that __init__ prints shouldn't be there.

Really: there should not be a single "print" in the library (well,
printdir() is fine... that's what it is supposed to do; printing in the
test code would be fine). In normal, or even exceptional(!), operation
there should never be a print.

> > How about making compression a boolean value and then
> > converting any true value to 8 ?
> 
> It would close the door to future or other compression methods.
> Currently the method must be 0 or 8 or a traceback will result.

I definitely agree with JimA here. For example, maybe we want bzip
compression in there. Sure, non-portable, but that's my problem :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From jim at interet.com  Mon Dec 20 21:50:46 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 20 Dec 1999 15:50:46 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912201208290.16305-100000@nebula.lyra.org>
Message-ID: <385E96A6.40CCF285@interet.com>

Greg Stein wrote:
> 
> On Mon, 20 Dec 1999, James C. Ahlstrom wrote:
> > "M.-A. Lemburg" wrote:
> But it shouldn't print a warning(!). If an application wants to replace a
> file, then stuff shouldn't appear on stdout as a result.

OK, no warning.
 
> The warning that __init__ prints shouldn't be there.

OK, it is gone.
 
> Really: there should not be a single "print" in the library (well,

No print unless _debug > 0

JimA


From mal at lemburg.com  Mon Dec 20 22:16:39 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 20 Dec 1999 22:16:39 +0100
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com> <385E40DA.37AD704F@lemburg.com> <385E82A7.72345807@interet.com>
Message-ID: <385E9CB7.5DE4848A@lemburg.com>

"James C. Ahlstrom" wrote:
> 
> "M.-A. Lemburg" wrote:
> 
> > I guess it would be ok to waste space. You could provide
> > a .cleanup() or .rewrite() method that takes care of
> > reorganizing the file to fill up the gaps.
> 
> OK, adding a duplicate name replaces the old file.

Cool.
 
> > Well the module seems to work just fine with compression
> > on, so disallowing it or issuing a warning would reduce its value,
> > IMHO.
> 
> Yes compression works, but 90% of Python installations don't have
> zlib, so it is an ERROR to create archives with compression when
> these archives are distributed to other sites.

Sure, for the sake of creating Python code archives, but
your module is much more versatile: e.g. I could automatically
create ZIP archives of log files or sets of other files and
then have Python email them to someone who uses these archives
through standard tools such as WinZip -- the target doesn't always
have to be a Python process :-)

> > How about making compression a boolean value and then
> > converting any true value to 8 ?
> 
> It would close the door to future or other compression methods.
> Currently the method must be 0 or 8 or a traceback will result.

Ok.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    11 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From jim at interet.com  Mon Dec 20 22:37:20 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 20 Dec 1999 16:37:20 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com> <385E40DA.37AD704F@lemburg.com> <385E82A7.72345807@interet.com> <385E9CB7.5DE4848A@lemburg.com>
Message-ID: <385EA190.6AF511BD@interet.com>

"M.-A. Lemburg" wrote:
>
> Sure, for the sake of creating Python code archives, but
> your module is much more versatile: e.g. I could automatically
> create ZIP archives of log files or sets of other files and

OK, zipfile.py no longer complains about compression != 0

JimA


From fdrake at acm.org  Tue Dec 21 23:42:26 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue, 21 Dec 1999 17:42:26 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: <199912212238.RAA13660@eric.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
Message-ID: <14432.594.33416.600794@weyr.cnri.reston.va.us>

Guido van Rossum writes:
 > + 
 > + class GetoptError(Exception):
 > +     opt = ''
 > +     msg = ''
 > +     def __init__(self, *args):
 > +         self.args = args
 > +         if len(args) == 1:
 > +             self.msg = args[0]
 > +         elif len(args) == 2:
 > +             self.msg = args[0]
 > +             self.opt = args[1]
 > + 
 > +     def __str__(self):
 > +         return self.msg
 >   
 > ! error = GetoptError # backward compatibility

  This breaks as soon as the standard exceptions are strings; does
this mean -X will be removed in the next release?  (Please????)
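
As a rough illustration of what the class-based exception buys the caller,
here is a sketch written against the getopt module in today's standard
library (using modern "except ... as" syntax, which postdates this
thread). The option string "ab:" and the argument list are made up for the
example.

```python
import getopt

try:
    # "-x" is not among the declared options "a" and "b:".
    getopt.getopt(["-x"], "ab:")
except getopt.GetoptError as exc:
    # The instance carries the message and the offending option separately.
    message, option = exc.msg, exc.opt

# The old name is kept as an alias for backward compatibility.
is_alias = getopt.error is getopt.GetoptError
```

With a string exception there would be no instance on which to hang the
msg and opt attributes.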


  -Fred

--
Fred L. Drake, Jr.	  <fdrake at acm.org>
Corporation for National Research Initiatives


From bwarsaw at cnri.reston.va.us  Tue Dec 21 23:44:46 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Tue, 21 Dec 1999 17:44:46 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
	<14432.594.33416.600794@weyr.cnri.reston.va.us>
Message-ID: <14432.734.155183.508785@anthem.cnri.reston.va.us>

>>>>> "Fred" == Fred L Drake, Jr <fdrake at acm.org> writes:

    Fred>   This breaks as soon as the standard exceptions are
    Fred> strings; does this mean -X will be removed in the next
    Fred> release?  (Please????)

Pretty please? :)


From guido at CNRI.Reston.VA.US  Wed Dec 22 00:05:28 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 21 Dec 1999 18:05:28 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Tue, 21 Dec 1999 17:42:26 EST."
             <14432.594.33416.600794@weyr.cnri.reston.va.us> 
References: <199912212238.RAA13660@eric.cnri.reston.va.us>  
            <14432.594.33416.600794@weyr.cnri.reston.va.us> 
Message-ID: <199912212305.SAA13722@eric.cnri.reston.va.us>

> Guido van Rossum writes:
>  > + 
>  > + class GetoptError(Exception):
>  > +     opt = ''
>  > +     msg = ''
>  > +     def __init__(self, *args):
>  > +         self.args = args
>  > +         if len(args) == 1:
>  > +             self.msg = args[0]
>  > +         elif len(args) == 2:
>  > +             self.msg = args[0]
>  > +             self.opt = args[1]
>  > + 
>  > +     def __str__(self):
>  > +         return self.msg
>  >   
>  > ! error = GetoptError # backward compatibility

[Fred Drake]

>   This breaks as soon as the standard exceptions are strings; does
> this mean -X will be removed in the next release?  (Please????)

Not a bad idea.

Anybody got a reason why -X should stay?

(The next step would be to outlaw raise with a string argument; I
think I can't make that for 1.6.  But it would be a good idea to scan
the standard library for string exceptions and convert all of them.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bwarsaw at cnri.reston.va.us  Wed Dec 22 00:21:38 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Tue, 21 Dec 1999 18:21:38 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
	<14432.594.33416.600794@weyr.cnri.reston.va.us>
	<199912212305.SAA13722@eric.cnri.reston.va.us>
Message-ID: <14432.2946.857539.898577@anthem.cnri.reston.va.us>

>>>>> "Guido" == Guido van Rossum <guido at cnri.reston.va.us> writes:

    Guido> Anybody got a reason why -X should stay?

Kill it.

    Guido> (The next step would be to outlaw raise with a string
    Guido> argument; I think I can't make that for 1.6.  But it would
    Guido> be a good idea to scan the standard library for string
    Guido> exceptions and convert all of them.)

Or require that exception classes be derived from exceptions.Exception
:)

-Barry


From guido at CNRI.Reston.VA.US  Wed Dec 22 00:23:29 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 21 Dec 1999 18:23:29 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Tue, 21 Dec 1999 18:21:38 EST."
             <14432.2946.857539.898577@anthem.cnri.reston.va.us> 
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us>  
            <14432.2946.857539.898577@anthem.cnri.reston.va.us> 
Message-ID: <199912212323.SAA13803@eric.cnri.reston.va.us>

[Barry]
>     Guido> Anybody got a reason why -X should stay?
> 
> Kill it.

You already said that.

Anybody else?

>     Guido> (The next step would be to outlaw raise with a string
>     Guido> argument; I think I can't make that for 1.6.  But it would
>     Guido> be a good idea to scan the standard library for string
>     Guido> exceptions and convert all of them.)
> 
> Or require that exception classes be derived from exceptions.Exception
> :)

That's hard to require.  But it could easily be a requirement checked
by one of the hypothetical typecheckers that are being discussed in
the types-sig.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bwarsaw at cnri.reston.va.us  Wed Dec 22 00:27:31 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Tue, 21 Dec 1999 18:27:31 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
	<14432.594.33416.600794@weyr.cnri.reston.va.us>
	<199912212305.SAA13722@eric.cnri.reston.va.us>
	<14432.2946.857539.898577@anthem.cnri.reston.va.us>
	<199912212323.SAA13803@eric.cnri.reston.va.us>
Message-ID: <14432.3299.404561.698836@anthem.cnri.reston.va.us>

>>>>> "Guido" == Guido van Rossum <guido at cnri.reston.va.us> writes:

    BAW> Or require that exception classes be derived from
    BAW> exceptions.Exception :)

    Guido> That's hard to require.  But it could easily be a
    Guido> requirement checked by one of the hypothetical typecheckers
    Guido> that are being discussed in the types-sig.

Hmm, the raise could probably enforce this, but it might not be that
useful.

-Barry


From guido at CNRI.Reston.VA.US  Wed Dec 22 00:40:22 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 21 Dec 1999 18:40:22 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Tue, 21 Dec 1999 18:27:31 EST."
             <14432.3299.404561.698836@anthem.cnri.reston.va.us> 
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us>  
            <14432.3299.404561.698836@anthem.cnri.reston.va.us> 
Message-ID: <199912212340.SAA13851@eric.cnri.reston.va.us>

> >>>>> "Guido" == Guido van Rossum <guido at cnri.reston.va.us> writes:
> 
>     BAW> Or require that exception classes be derived from
>     BAW> exceptions.Exception :)
> 
>     Guido> That's hard to require.  But it could easily be a
>     Guido> requirement checked by one of the hypothetical typecheckers
>     Guido> that are being discussed in the types-sig.
> 
> Hmm, the raise could probably enforce this, but it might not be that
> useful.
> 
> -Barry

The raise could easily enforce this, but it would break lots of
existing code.

I wish I had done it right from the start -- then exceptions would
have been classes from the start and would have required inheritance
from the Exception base class.  Like in Java.  (And in C++?)
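
For comparison, current Python 3 does have the raise statement enforce
this, against BaseException rather than Exception. The sketch below only
demonstrates that present-day behaviour; the names Legacy and try_raise
are hypothetical.

```python
class Legacy:
    """A class that does not inherit from BaseException."""


def try_raise(obj):
    """Report whether the raise statement accepts obj as an exception."""
    try:
        raise obj
    except TypeError:
        # Python 3 refuses anything not derived from BaseException.
        return "rejected"
    except Exception:
        return "accepted"


results = [
    try_raise("a string"),        # string exception
    try_raise(Legacy),            # class outside the exception hierarchy
    try_raise(ValueError("ok")),  # ordinary Exception subclass
]
```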

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bwarsaw at CNRI.Reston.VA.US  Wed Dec 22 00:43:59 1999
From: bwarsaw at CNRI.Reston.VA.US (Barry A. Warsaw)
Date: Tue, 21 Dec 1999 18:43:59 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
	<14432.594.33416.600794@weyr.cnri.reston.va.us>
	<199912212305.SAA13722@eric.cnri.reston.va.us>
	<14432.2946.857539.898577@anthem.cnri.reston.va.us>
	<199912212323.SAA13803@eric.cnri.reston.va.us>
	<14432.3299.404561.698836@anthem.cnri.reston.va.us>
	<199912212340.SAA13851@eric.cnri.reston.va.us>
Message-ID: <14432.4287.543786.308468@anthem.cnri.reston.va.us>

>>>>> "Guido" == Guido van Rossum <guido at CNRI.Reston.VA.US> writes:

    Guido> The raise could easily enforce this, but it would break
    Guido> lots of existing code.

Maybe not (I'm not sure).  All the standard exceptions inherit from
Exception, and of course there'd be nothing to enforce for existing
user-defined string based exceptions.  How pervasive are user-defined
class based exceptions that don't inherit from Exception?  (I don't
know, and I haven't grepped, but I think we've been making that
recommendation from day 1 of class-based standard exceptions, and I
try to follow this recommendation in my own code).

    Guido> I wish I had done it right from the start -- then
    Guido> exceptions would have been classes from the start and would
    Guido> have required inheritance from the Exception base class.
    Guido> Like in Java.  (And in C++?)

All Hail, Python 2.0, our Savior and Redeemer! :)

-Barry


From guido at CNRI.Reston.VA.US  Wed Dec 22 00:49:09 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 21 Dec 1999 18:49:09 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Tue, 21 Dec 1999 18:43:59 EST."
             <14432.4287.543786.308468@anthem.cnri.reston.va.us> 
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> <199912212340.SAA13851@eric.cnri.reston.va.us>  
            <14432.4287.543786.308468@anthem.cnri.reston.va.us> 
Message-ID: <199912212349.SAA13892@eric.cnri.reston.va.us>

> From: "Barry A. Warsaw" <bwarsaw at cnri.reston.va.us>

> >>>>> "Guido" == Guido van Rossum <guido at CNRI.Reston.VA.US> writes:
> 
>     Guido> The raise could easily enforce this, but it would break
>     Guido> lots of existing code.
> 
> Maybe not (I'm not sure).  All the standard exceptions inherit from
> Exception, and of course there'd be nothing to enforce for existing
> user-defined string based exceptions.  How pervasive are user-defined
> class based exceptions that don't inherit from Exception?  (I don't
> know, and I haven't grepped, but I think we've been making that
> recommendation from day 1 of class-based standard exceptions, and I
> try to follow this recommendation in my own code).

Yes, but class-based user exceptions existed many Python versions
before class-based standard exceptions!

Two examples in the standard library: ConfigParser.py and xdrlib.py.

> All Hail, Python 2.0, our Savior and Redeemer! :)

Or, the perfect excuse for procrastination :)

(But yes, 2.0 will enforce this.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gstein at lyra.org  Wed Dec 22 00:53:50 1999
From: gstein at lyra.org (Greg Stein)
Date: Tue, 21 Dec 1999 15:53:50 -0800 (PST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib
 getopt.py,1.7,1.8
In-Reply-To: <199912212305.SAA13722@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912211552380.16305-100000@nebula.lyra.org>

On Tue, 21 Dec 1999, Guido van Rossum wrote:
>...
> [Fred Drake]
> >   This breaks as soon as the standard exceptions are strings; does
> > this mean -X will be removed in the next release?  (Please????)
> 
> Not a bad idea.
> 
> Anybody got a reason why -X should stay?

Kill it.

> (The next step would be to outlaw raise with a string argument; I
> think I can't make that for 1.6.  But it would be a good idea to scan
> the standard library for string exceptions and convert all of them.)

Keep string exceptions. I think there is probably a lot of code that still
uses them. I know I do :-)

We can issue warnings about string exceptions via the type-checking tool.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From bwarsaw at CNRI.Reston.VA.US  Wed Dec 22 00:54:04 1999
From: bwarsaw at CNRI.Reston.VA.US (Barry A. Warsaw)
Date: Tue, 21 Dec 1999 18:54:04 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
	<14432.594.33416.600794@weyr.cnri.reston.va.us>
	<199912212305.SAA13722@eric.cnri.reston.va.us>
	<14432.2946.857539.898577@anthem.cnri.reston.va.us>
	<199912212323.SAA13803@eric.cnri.reston.va.us>
	<14432.3299.404561.698836@anthem.cnri.reston.va.us>
	<199912212340.SAA13851@eric.cnri.reston.va.us>
	<14432.4287.543786.308468@anthem.cnri.reston.va.us>
	<199912212349.SAA13892@eric.cnri.reston.va.us>
Message-ID: <14432.4892.908107.421149@anthem.cnri.reston.va.us>

>>>>> "Guido" == Guido van Rossum <guido at CNRI.Reston.VA.US> writes:

    Guido> Yes, but class-based user exceptions existed many Python
    Guido> versions before class-based standard exceptions!

True, but I suspect that legacy class-based user exceptions are rare.
I might be wrong, but you're absolutely right that these would all be
broken.

    Guido> Two examples in the standard library: ConfigParser.py and
    Guido> xdrlib.py.

Fortunately these are fixed with two 11 character patches :)

I'm not necessarily arguing for or against tightening this.

-Barry


From gmcm at hypernet.com  Wed Dec 22 00:55:07 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Tue, 21 Dec 1999 18:55:07 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: <199912212340.SAA13851@eric.cnri.reston.va.us>
References: Your message of "Tue, 21 Dec 1999 18:27:31 EST."             <14432.3299.404561.698836@anthem.cnri.reston.va.us> 
Message-ID: <1266302877-22249299@hypernet.com>

[Guido]

> I wish I had done it right from the start -- then exceptions
> would have been classes from the start and would have required
> inheritance from the Exception base class.  Like in Java.  (And
> in C++?)

In C++ you can throw anything at all. Strings, ints, that 
Warsaw blockhead...

off-topic-ly y'rs

- Gordon


From tismer at appliedbiometrics.com  Wed Dec 22 01:57:27 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 22 Dec 1999 01:57:27 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib 
 getopt.py,1.7,1.8
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us>  
	            <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us>
Message-ID: <386021F7.4F94C458@appliedbiometrics.com>


Guido van Rossum wrote:
> 
> [Barry]
> >     Guido> Anybody got a reason why -X should stay?
> >
> > Kill it.
> 
> You already said that.
> 
> Anybody else?

I'd say kill -X, but keep allowing string exceptions if
it doesn't cost too much. I think of C++, like Gordon said.

Also I'd take the chance and move the exceptions Python
module back into the core, as a frozen module or whatever.

Reason: At the moment, the CVS version of the Python library
is incompatible with 1.5.2, which makes testing against the
standard dist quite inconvenient. A compiled CVS Python
does not run under PythonWin when I put it into my standard
installation. Or is there an easy way to switch all settings
to a completely different path?

Anyway, I'm most probably off until Y2K.

See ya all then, provided we survive - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From guido at CNRI.Reston.VA.US  Wed Dec 22 02:01:16 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 21 Dec 1999 20:01:16 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Wed, 22 Dec 1999 01:57:27 +0100."
             <386021F7.4F94C458@appliedbiometrics.com> 
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us>  
            <386021F7.4F94C458@appliedbiometrics.com> 
Message-ID: <199912220101.UAA14109@eric.cnri.reston.va.us>

> I'd say kill -X, but keep allowing string exceptions if
> it doesn't cost too much. I think of C++, like Gordon said.

Agreed.

> Also I'd take the chance and move the exceptions Python
> module back into the core, as a frozen mdule or whatever.
> 
> Reason: At the moment, the CVS version of the Python library
> is incompatible with 1.5.2, which makes testing against the
> standard dist quite inconvenient. A compiled CVS Python
> does not run under PythonWin when I put it into my standard
> installation. Or is there an easy way to switch all settings
> to a completely different path?

Point the PYTHONHOME variable to the top of your install directory.
(On Windows you may have to kill the registry settings -- this is a
bug.)
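
To confirm which installation tree an interpreter has actually picked up,
you can ask it directly. What follows is only a minimal sketch for a
present-day Python 3 interpreter, not the 1.5.2-era one under discussion;
the helper name describe_install is made up for illustration.

```python
import os
import sys


def describe_install():
    """Report the PYTHONHOME override (if any) and the prefixes in use."""
    return {
        # None when the variable is not set in the environment.
        "PYTHONHOME": os.environ.get("PYTHONHOME"),
        # Top of the installation tree the running interpreter resolved.
        "prefix": sys.prefix,
        # Top of the tree holding platform-specific files.
        "exec_prefix": sys.exec_prefix,
    }


if __name__ == "__main__":
    for key, value in describe_install().items():
        print(f"{key}: {value}")
```

Running this once with PYTHONHOME unset and once with it pointing at the
other installation should show whether the switch took effect.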

> Anyway, I'm most probably off until Y2K.

Ditto.

> See ya all then, provided we survive - chris

Best wishes to all,

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim at digicool.com  Wed Dec 22 14:54:41 1999
From: jim at digicool.com (Jim Fulton)
Date: Wed, 22 Dec 1999 08:54:41 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib 
 getopt.py,1.7,1.8
References: <199912212238.RAA13660@eric.cnri.reston.va.us>  
	            <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us>
Message-ID: <3860D821.576B3146@digicool.com>

Guido van Rossum wrote:
> 
> (The next step would be to outlaw raise with a string argument; I
> think I can't make that for 1.6.  But it would be a good idea to scan
> the standard library for string exceptions and convert all of them.)

This would be waaaaay too big a change for Python 1.x. There are a lot
of Python modules outside the standard distribution that use string
exceptions. This would be a huge backward incompatibility.

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission.  Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.


From fdrake at acm.org  Wed Dec 22 15:23:29 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 22 Dec 1999 09:23:29 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: <199912212305.SAA13722@eric.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
	<14432.594.33416.600794@weyr.cnri.reston.va.us>
	<199912212305.SAA13722@eric.cnri.reston.va.us>
Message-ID: <14432.57057.535205.558@weyr.cnri.reston.va.us>

Guido van Rossum writes:
 > (The next step would be to outlaw raise with a string argument; I
 > think I can't make that for 1.6.  But it would be a good idea to scan
 > the standard library for string exceptions and convert all of them.)

  I don't know if requiring class-based exceptions will make the
runtime any simpler, but that seems the only reason to do it.
  The only reason to remove -X, and possibly the string exception
fallback code, is to ensure that we *can* subclass Exception and
friends without having to catch TypeError and do something different.


  -Fred

--
Fred L. Drake, Jr.	  <fdrake at acm.org>
Corporation for National Research Initiatives


From fdrake at acm.org  Wed Dec 22 15:25:33 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 22 Dec 1999 09:25:33 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: <14432.2946.857539.898577@anthem.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
	<14432.594.33416.600794@weyr.cnri.reston.va.us>
	<199912212305.SAA13722@eric.cnri.reston.va.us>
	<14432.2946.857539.898577@anthem.cnri.reston.va.us>
Message-ID: <14432.57181.944364.427093@weyr.cnri.reston.va.us>

Barry A. Warsaw writes:
 > Or require that exception classes be derived from exceptions.Exception
 > :)

  Ok, it's early, and maybe I haven't had enough coffee(!).  But is
this serious?  Does JPython gain some benefit from this, is it your
preference, or are you just yanking on my leg?  ("Pulling my arm" as
my 5-year-old says!)


  -Fred

--
Fred L. Drake, Jr.	  <fdrake at acm.org>
Corporation for National Research Initiatives


From guido at CNRI.Reston.VA.US  Wed Dec 22 15:40:39 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 22 Dec 1999 09:40:39 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Wed, 22 Dec 1999 09:23:29 EST."
             <14432.57057.535205.558@weyr.cnri.reston.va.us> 
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us>  
            <14432.57057.535205.558@weyr.cnri.reston.va.us> 
Message-ID: <199912221440.JAA16198@eric.cnri.reston.va.us>

> From: "Fred L. Drake, Jr." <fdrake at acm.org>
> 
> Guido van Rossum writes:
>  > (The next step would be to outlaw raise with a string argument; I
>  > think I can't make that for 1.6.  But it would be a good idea to scan
>  > the standard library for string exceptions and convert all of them.)
> 
>   I don't know if requiring class-based exceptions will make the
> runtime any simpler, but that seems the only reason to do it.

Do what?  *Require* class exceptions?  You're probably right, and I
think the gain is minimal.

There's another reason to scan the std library though -- not to set a
bad example.  I want to eventually (in 2.0) move to a
class-derived-from-Exception-only scheme.

>   The only reason to remove -X, and possibly the string exception
> fallback code, is to ensure that we *can* subclass Exception and
> friends without having to catch TypeError and do something different.

And that's a very good reason indeed.

Let me repeat my plans for 1.6.

- Remove -X; the standard exceptions are always class-based.

- Change all standard library and other example code to use
class-based exceptions with a standard exception as base class, to set
an example.

- Still allow string exceptions in user code.

- Still allow class exceptions that don't use a standard exception
base class in user code.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From marangoz at python.inrialpes.fr  Wed Dec 22 19:09:47 1999
From: marangoz at python.inrialpes.fr (Vladimir Marangozov)
Date: Wed, 22 Dec 1999 19:09:47 +0100 (CET)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: <199912221440.JAA16198@eric.cnri.reston.va.us> from "Guido van Rossum" at Dec 22, 1999 09:40:39 AM
Message-ID: <199912221809.TAA25322@python.inrialpes.fr>

Guido van Rossum wrote:
> 
> [Fred Drake]
> >   I don't know if requiring class-based exceptions will make the
> > runtime any simpler, but that seems the only reason to do it.
> 
> Do what?  *Require* class exceptions?  You're probably right, and I
> think the gain is minimal.

Yes. Besides, I still think that string-based exceptions are just
convenient for quick & dirty, throw-away test scripts.

> 
> Let me repeat my plans for 1.6.
> 
> - Remove -X; the standard exceptions are always class-based.
> 
> - Change all standard library and other example code to use
> class-based exceptions with a standard exception as base class, to set
> an example.
> 
> - Still allow string exceptions in user code.
> 
> - Still allow class exceptions that don't use a standard exception
> base class in user code.

Sounds okay.

---

PS: I'm particularly happy today :-) because I've finally published
 the new version of our Web site http://www.inrialpes.fr. Two things
 I'd like to mention:
 (1) it wouldn't have been possible without quick Python scripts ;)
 (2) I'll find the time to reinvoke some of the topics discussed here
     instead of being mute as a fish.

That said, Merry Christmas and a Happy New Year to all of you!

-- 
       Vladimir MARANGOZOV          | Vladimir.Marangozov at inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252


From guido at CNRI.Reston.VA.US  Wed Dec 22 19:23:45 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 22 Dec 1999 13:23:45 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Wed, 22 Dec 1999 19:09:47 +0100."
             <199912221809.TAA25322@python.inrialpes.fr> 
References: <199912221809.TAA25322@python.inrialpes.fr> 
Message-ID: <199912221823.NAA16517@eric.cnri.reston.va.us>

Vladimir.Marangozov at inrialpes.fr:

> Yes. Besides, I still think that string-based exceptions are just
> convenient for quick & dirty, throw-away test scripts.

They have a hard-to-understand quirk though: the id() of the string is
used to check rather than its value, so that except "foo" doesn't
necessarily catch raise "foo"; but due to various optimizations, this
usually works, and people get bent out of shape when it doesn't.
Since you have to give your exception a name, how hard is it to say

class MyError(Exception): pass

rather than

MyError = "MyError"

?
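
For comparison, here is a small sketch of how the class-based form behaves
in a present-day Python 3 interpreter, where matching is done by class
hierarchy rather than object identity and raising a plain string is
rejected with a TypeError. The names MySpecificError, catches and
raise_string are invented for this illustration.

```python
class MyError(Exception):
    """Class-based exception, the one-line form suggested above."""


class MySpecificError(MyError):
    """Subclass; an `except MyError` clause handles it as well."""


def catches(exc_to_raise, exc_to_catch):
    """Return True if `except exc_to_catch` handles `raise exc_to_raise`."""
    try:
        raise exc_to_raise
    except exc_to_catch:
        return True
    except BaseException:
        return False


def raise_string():
    """Try the old string-exception idiom and report what Python 3 does."""
    try:
        raise "foo"  # not an exception class or instance
    except TypeError as err:
        return type(err).__name__


if __name__ == "__main__":
    print(catches(MySpecificError("boom"), MyError))
    print(catches(MyError("boom"), MySpecificError))
    print(raise_string())
```

The match depends only on the class relationship, so the "usually works"
behavior of string identity does not arise.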

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gstein at lyra.org  Wed Dec 22 19:33:19 1999
From: gstein at lyra.org (Greg Stein)
Date: Wed, 22 Dec 1999 10:33:19 -0800 (PST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib
 getopt.py,1.7,1.8
In-Reply-To: <199912221823.NAA16517@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912221031390.16305-100000@nebula.lyra.org>

On Wed, 22 Dec 1999, Guido van Rossum wrote:
> Vladimir.Marangozov at inrialpes.fr:
> > Yes. Besides, I still think that string-based exceptions are just
> > convenient for quick & dirty, throw-away test scripts.
> 
> They have a hard-to-understand quirk though: the id() of the string is
> used to check rather than its value, so that except "foo" doesn't
> necessarily catch raise "foo"; but due to various optimizations, this
> usually works, and people get bent out of shape when it doesn't.
> Since you have to give your exception a name, how hard is it to say
> 
> class MyError(Exception): pass
> 
> rather than
> 
> MyError = "MyError"
> 
> ?

It is very hard. My fingers do the typing for me, and they fill in
strings. I'm trying to teach them otherwise, but they insist.

You're also assuming that MyError gets defined. Sometimes, my little
fingers like typing:

  try:
    foo
  except:
    raise "foo broke for some reason"


Quick and dirty, indeed! :-)

Happy Holidays,
-g

-- 
Greg Stein, http://www.lyra.org/


From fdrake at acm.org  Wed Dec 22 20:59:55 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 22 Dec 1999 14:59:55 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: <199912212340.SAA13851@eric.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
	<14432.594.33416.600794@weyr.cnri.reston.va.us>
	<199912212305.SAA13722@eric.cnri.reston.va.us>
	<14432.2946.857539.898577@anthem.cnri.reston.va.us>
	<199912212323.SAA13803@eric.cnri.reston.va.us>
	<14432.3299.404561.698836@anthem.cnri.reston.va.us>
	<199912212340.SAA13851@eric.cnri.reston.va.us>
Message-ID: <14433.11707.607533.698901@weyr.cnri.reston.va.us>

Guido van Rossum writes:
 > I wish I had done it right from the start -- then exceptions would
 > have been classes from the start and would have required inheritance
 > from the Exception base class.  Like in Java.  (And in C++?)

  I've seen this said or hinted at in a couple of places (the specific 
requirement that exception derive from Exception), but I've seen
nothing that indicates any reason or derived value for this.  Could
someone please clarify?


  -Fred

--
Fred L. Drake, Jr.	  <fdrake at acm.org>
Corporation for National Research Initiatives


From guido at CNRI.Reston.VA.US  Wed Dec 22 21:05:52 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 22 Dec 1999 15:05:52 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Wed, 22 Dec 1999 14:59:55 EST."
             <14433.11707.607533.698901@weyr.cnri.reston.va.us> 
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> <199912212340.SAA13851@eric.cnri.reston.va.us>  
            <14433.11707.607533.698901@weyr.cnri.reston.va.us> 
Message-ID: <199912222005.PAA17291@eric.cnri.reston.va.us>

> From: "Fred L. Drake, Jr." <fdrake at acm.org>

> Guido van Rossum writes:
>  > I wish I had done it right from the start -- then exceptions would
>  > have been classes from the start and would have required inheritance
>  > from the Exception base class.  Like in Java.  (And in C++?)
> 
>   I've seen this said or hinted at in a couple of places (the specific 
> requirement that exception derive from Exception), but I've seen
> nothing that indicates any reason or derived value for this.  Could
> someone please clarify?

It's simply an extra bit of checking that your program is reasonable
-- if you accidentally raise a non-exception class, there's probably
something wrong with your program, and it gives the reader a hint
about the intended use of the class.

Other languages (e.g. Modula-3) have a specific exception type that
can be used only for that one purpose.  However it's useful to allow
methods and subclassing of exceptions, so they might as well be
classes.  So, all exceptions are classes.  But not all classes are
exceptions.
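
A sketch of what "methods and subclassing" buys you, written for a
present-day Python 3 interpreter (which does enforce the rule that only
exception classes may be raised). ParseError, Widget and is_raisable are
hypothetical names used only for this example.

```python
class Widget:
    """An ordinary class; nothing marks it as meant to be raised."""


class ParseError(Exception):
    """An exception that carries data and offers a helper method."""

    def __init__(self, filename, lineno, message):
        super().__init__(message)
        self.filename = filename
        self.lineno = lineno

    def location(self):
        """Return a 'file:line' string for error reports."""
        return f"{self.filename}:{self.lineno}"


def is_raisable(cls):
    """True if cls is a class that Python 3 will accept in a raise statement."""
    return isinstance(cls, type) and issubclass(cls, BaseException)


if __name__ == "__main__":
    try:
        raise ParseError("setup.cfg", 12, "unexpected token")
    except ParseError as err:
        print(err.location(), err)
    print(is_raisable(ParseError), is_raisable(Widget))
```

The base class acts as the "intended use" hint described above: a reader
can tell from the class statement alone that ParseError is an exception and
Widget is not.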

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gstein at lyra.org  Wed Dec 22 21:11:43 1999
From: gstein at lyra.org (Greg Stein)
Date: Wed, 22 Dec 1999 12:11:43 -0800 (PST)
Subject: [Python-Dev] Please test new dynamic load behavior
Message-ID: <Pine.LNX.4.10.9912221200290.16305-100000@nebula.lyra.org>

Hi all,

I reorganized Python's dynamic load/import code over the past few days.
Guido provided some feedback, I did some more mods, and now it is checked
into CVS. The new loading behavior has been tested on Linux, IRIX, and
Solaris (and probably Windows by now).

For people with CVS access, I'd like to ask that you grab an updated copy
and shake out the new code. There have been updates to the "configure"
process, so you'll need to run configure again. Make sure that you alter
your Modules/Setup to build some shared modules, and then try it out.

Here are some of the platforms that I believe need specific testing:

- NetBSD, FreeBSD, OpenBSD, ...
- AIX
- HP/UX
- BeOS
- NeXT
- Mac
- OS/2
- Win16

I believe it should work for most people, but we may be looking for the
wrong "init<module>" symbol on some platforms. We might even be selecting
the wrong import mechanism (or missing it altogether!) on some platforms.

If you get a chance to test this, then please drop me a note with your
platform and whether it succeeded or failed (and how it failed).

Thanx!
-g

p.s. you can tell if dynamic loading is missing by watching for
DYNLOADFILE in the configure process and seeing if it used dynload_stub.
alternatively, you can import the "imp" module and see if "load_dynamic"
is missing.
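
The second check can be scripted. This is a minimal sketch, assuming a
present-day interpreter: the "imp" module is deprecated and absent from
recent Python 3 releases, so the helper (has_load_dynamic, a name made up
here) returns None when the module itself cannot be imported.

```python
import importlib
import warnings


def has_load_dynamic():
    """Return True/False if `imp` is importable, or None if it is not."""
    try:
        with warnings.catch_warnings():
            # Older Python 3 releases warn that `imp` is deprecated.
            warnings.simplefilter("ignore", DeprecationWarning)
            imp = importlib.import_module("imp")
    except ImportError:
        return None
    return hasattr(imp, "load_dynamic")


if __name__ == "__main__":
    print("load_dynamic available:", has_load_dynamic())
```

A False result corresponds to the dynload_stub case described above.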

-- 
Greg Stein, http://www.lyra.org/


From gvwilson at nevex.com  Thu Dec 23 04:43:40 1999
From: gvwilson at nevex.com (gvwilson at nevex.com)
Date: Wed, 22 Dec 1999 22:43:40 -0500 (EST)
Subject: [Python-Dev] re: Open Source design competition / Python / software tools
Message-ID: <Pine.LNX.4.10.9912222242070.4839-200000@akbar.nevex.com>

Hi, folks.  I hope you don't mind another mail out of the blue, but I got
notice on Saturday that the Department of Energy is giving me $860K over
two years to support development of easier-to-use software engineering
tools.  All of the work will be Open Source, and will be done in Python,
with a strong emphasis on design, testing, and documentation.  The
project's long-term objective is to encourage scientists and engineers to
treat programs in the same way as they do other experiments, i.e. to
calibrate, test, peer review, and so on.

To kick-start things, we're going to be holding a two-round design
competition.  Anyone (individual or team, professional or student) can
submit a short entry for the first round; the judges will pick four
candidates to go forward in each of four categories, and those
individuals or teams will be asked to submit full entries. The four
categories are:

* an issue tracking system to replace Gnats and Bugzilla;

* a build system to replace make;

* a platform inspection and configuration system to replace autoconf;
  and

* a testing framework to replace XUnit, Expect, and DejaGnu.

Would you be interested in participating in any way---judging, entering a
design, critiquing things from the point of view of end users, or
anything else? I realize that you're probably up past your eyeballs with
work, and that the money on offer is nothing special, but I think this
could be a lot of fun, and could help to shift the emphasis of the Open
Source community from hacking to design (both by drawing attention to, and
rewarding, design, and by creating a corpus of examples and commentary for
programmers to refer to).  It could also make life a lot easier for
computational scientists and engineers...

Please let me know if you'd like to be involved, or if you'd like more
information than is contained in the FAQ (attached).  Timescales are a
bit tight---I'd like to be able to make an announcement on January
14---but I'll be reading email at this address several times a day
during the holiday.

I look forward to hearing from you,

Greg Wilson

p.s. please note that the attached FAQ is a first draft; I'd be grateful
if you could show it to anyone you think might be interested, but I'd
also be grateful if you wouldn't broadcast it until it's gone through 
one more editing pass.
-------------- next part --------------
<HTML>
<HEAD>
<TITLE>Software Carpentry FAQ</TITLE>
</HEAD>
<BODY>

<H1 ALIGN="CENTER">Software Carpentry FAQ</H1>


<H2>General information</H2>

<OL>

<LI><EM>What is the Software Carpentry project? </EM>
<BR>
The aim of the Software Carpentry project is to make it easier for
programmers in general, and scientific programmers in particular, to
adopt better software development practices. The project will achieve
this by creating tools that are easier to learn and use, and by
documenting those tools and the practices they embody.
</LI>

<LI><EM>Where does the name come from?</EM>
<BR>
The name is a play on "software engineering", and is meant to indicate
that this project is initially concerned with medium-sized teams (up
to a dozen or two programmers) and medium-term timescales (a year or
two).
</LI>

<LI><EM>How did the project get started?</EM>
<BR>
The project has its origins in a <A
HREF="http://www.acl.lanl.gov/sc/resources/cse/index.html">series of
articles</A> that Greg Wilson organized for the Fall 1996 and Winter
1996 issues of <CITE>IEEE Computational Science and
Engineering</CITE>. These articles outlined what their authors thought
computer scientists should teach to physical scientists and
engineers. Most authors recommended numerical methods or the standard
Unix toolset, but Steve McConnell argued that better programming
practices would have the greatest impact on productivity.

<BR> As a result of that observation, Greg Wilson, Brent Gorda, and
Steve McConnell put together a 3-day course on software engineering
for scientists and engineers, which they taught several times at the
Los Alamos National Laboratory. Feedback on the course was very
positive, but many participants felt that the tools being
taught---Perl, Make, CVS, and so on---were unnecessarily difficult to
install, learn, and use. They were also frustrated by the scarcity of
examples of design documents, testing plans, and all of the other
things the course was trying to teach them.
</LI>

<LI><EM>Why Open Source?</EM>
<BR>
There are three reasons why the Software Carpentry project is
following the Open Source model:
</LI>

	<OL>

	<LI><EM>Leveraging existing knowledge. </EM>
	<BR>
	A closed project can only take advantage of a few minds. As
	Linux and other projects have shown, a well-run Open Source
	project can harness the experience and insight of thousands of
	people.
	</LI>

	<LI><EM>Lowering barriers to adoption. </EM>
	<BR>
	Freely-available tools are more likely to be picked up than
	their commercial equivalents. This is particularly true when
	the tool in question does something novel (at least from the
	point of view of the person adopting it), and in academia (where
	budgets are limited).
	</LI>

	<LI><EM>Encouraging peer review.</EM>
	<BR>
	Dan Gezelter's <a
	href="http://www.openscience.org/talk/bnl/index.html">talk</a>
	at the first Open Source/Open Science conference discussed how
	the scientific tradition of peer review fits with the
	philosophy of the Open Source movement. By designing and
	building these tools in the open, the Software Carpentry
	project will both encourage peer review of the tools
	themselves, and demonstrate how this ought to be done for
	scientific and commercial software.
	</LI>

	</OL>

<LI><EM>Where does the funding come from? </EM>
<BR>
The funding comes from the U.S. Department of Energy, through the
Advanced Computing Laboratory at Los Alamos National Laboratory. The
project is being administered by Code Sourcery. US$480,000 has been
provided for 2000, and US$380,000 for 2001.
</LI>

<LI><EM>Why would the Department of Energy fund something like this?</EM>
<BR>
The funding has been provided partly because the DoE would like
scientists and engineers to be more productive, and partly because it
would like to find out whether the Open Source model and community can
meet the special needs of high-performance computational science. The
last few years have seen most manufacturers of special-purpose
supercomputers disappear or be bought out, and the rise of clusters
based on commercial off-the-shelf (COTS) hardware, Linux, MPI, the GNU
compiler toolset, and so on. There is a growing feeling that these
machines could bring scalable supercomputing into the mainstream, but
this will only happen if good tools and practices are accessible
enough.
</LI>

<LI><EM>I'm not a scientist or engineer---what's in it for me? </EM>
<BR>
The things that make many existing Open Source software development
tools difficult to learn and use---obscure syntax, arbitrary or
hard-to-follow behavior, and poor documentation---affect professional
programmers and computer science students just as much as they do
computational scientists and engineers. If the Open Source movement
can build tools that are simple enough to be learned by people who
have problems of their own to solve, and yet powerful enough to
support distributed development of hundreds of thousands of lines of
complex numerical and visualization code, then those tools will
probably also help people who want to build Internet chat rooms and
order-tracking systems.
<BR>
This project should also be interesting to the general programming
community because it is going to place more emphasis on design and
early feedback than most Open Source projects have to date. Instead of
growing someone's pet project, Software Carpentry is going to
organize---and pay for---a design competition. If this works, it could
be an interesting model for other Open Source projects to adopt.
</LI>

<LI><EM>I think [tool] is good enough already---why are you re-inventing the wheel? </EM>
<BR>
The short answer to this is Alan Cooper's:


	<BLOCKQUOTE>
	The phrase "computer literate user" really means the person
	has been hurt so many times that the scar tissue is thick
	enough so he no longer feels the pain.
	<BR>
	-- Alan Cooper,
	<CITE>The Inmates are Running the Asylum</CITE>
	</BLOCKQUOTE>

The longer answer is that the "accidental complexity" of the standard
Unix command-line toolset is a major barrier to its adoption by people
who are not full-time programmers, or for whom programming is just
something that has to be done in order to do something else. Many
professional programmers---particularly those who enjoy programming
enough to be involved in the Open Source movement---have been using
these tools for so long that they simply don't remember how hard it is
to configure Gnats, or pass variable bindings between recursive calls
to Make.
<BR>
And let's face it: if Make or Autoconf were built from scratch today,
they would be written as extensible, embeddable modules in a
high-level scripting language. This would not only make them easier to
use, it would also make them easier to learn, since they would employ
one syntax for all purposes. Microsoft Visual Basic has shown just how
useful it can be to have a single general-purpose "glue" language
capable of binding disparate tools together; the aim of the first half
of this project is to bring those benefits to the Open Source
community.

</OL>

<H2>Development</H2>

<OL>

<LI><EM>What projects are currently under way? </EM>
<BR>Software Carpentry will start by producing:
</LI>

	<OL>

	<LI>a platform inspection tool similar to Autoconf;</LI>

	<LI>a build management tool similar to Make;</LI>

	<LI>an issue tracking system similar to Gnats or Bugzilla; and</LI>

	<LI>a unit and regression testing harness with the
	functionality of XUnit, Expect, and DejaGnu.</LI>

	</OL>

<LI><EM>Why were those tools chosen? </EM>
<BR>
These four tools were chosen as initial targets for several
reasons. First, the working practices they support are essential to
medium-scale software engineering. Second, the tools they are intended
to replace are generally recognized as being outdated or flawed. This
creates demand, and increases the odds that rational reimplementations
will be adopted. Third, enough people have enough experience with the
tools that are to be replaced to participate in the design competition
described later.
</LI>

<LI><EM>Why isn't [tool] on this list?</EM>
<BR>
There are several other tools that could have been on this list, and
will be added if the first round of work goes well. A cross-platform
version control system that corrects the many deficiencies in CVS, for
example, is an obvious candidate, but is probably too large to be
tackled initially, and any work done by Software Carpentry could well
be superseded by BitKeeper. Similarly, the world needs a good Open
Source project management tool with the functionality of Microsoft
Project, but probably needs the four tools listed above more urgently.
</LI>

<LI><EM>What languages and tools will be used? </EM>
<BR>
All development work will be done in Python.
</LI>

<LI><EM>Why Python? </EM>
<BR>
This is actually three questions:

	<OL>

	<LI><EM>Why mandate a language? </EM>
	<BR>
	Building everything in a single language will encourage
	projects to share code, which will both keep the total volume
	of code manageable and raise the quality of the
	implementations (since the shared code will be exercised, and
	tested, in many different ways). Using a single language will
	also improve the comprehensibility, and hence the
	maintainability and extensibility, of the tools. The varying
	syntax of Make, Autoconf, and other tools is a large practical
	barrier to their adoption by people who have better (or at
	least more pressing) things to do than learn yet another
	syntax. Microsoft's Visual Basic has shown how powerful it
	is to use a single, flexible language everywhere.
	</LI>

	<LI><EM>Why use a scripting language? </EM>
	<BR>
	A lot of anecdotal evidence shows that "relaxed" high-level
	languages (like Python, Perl, and Visual Basic) are more
	productive vehicles for process management, text processing,
	and similar tasks than their "strict" equivalents (like C++
	and Java).
	</LI>

	<LI><EM>Why use Python? </EM>
	<BR>
	The four candidates considered were Visual Basic, Perl, Tcl,
	and Python.

		<OL>

		<LI><EM>Visual Basic </EM>
		<BR>
		Visual Basic is proprietary, and there is no
		indication that a credible Open Source implementation
		will appear any time soon.
		</LI>

		<LI><EM>Perl</EM>
		<BR>
		Perl was a strong contender, primarily because of the
		many libraries that have been developed for it, and
		because of the number of books that document
		it. However, our experience teaching at Los Alamos was
		that Perl's syntax is hard to learn, its behavior
		often arbitrary, and its size intimidating. While
		full-time professional programmers with several other
		languages under their belts might (and often do) say
		that it all makes sense once you know it, we want to
		make the learning curve as gentle as possible.
		</LI>


		<LI><EM>Tcl</EM>
		<BR>
		Tcl is easier to learn and read than Perl, but is not
		as well documented, and doesn't come with as many
		libraries. Had Python not existed, Tcl would probably
		have been chosen for this project.
		</LI>

		<LI><EM>Python</EM>
		<BR>
		Python provides the same functionality as Perl or Tcl,
		but has proved to be easier to learn, read, and
		remember. (For example, words like "except" and
		"unless" appear much less often in Python reference
		material than they do in Perl reference material.)
		Python is not yet as extensively documented as Perl,
		but the number of books is growing, as is the number
		of modules and libraries. Finally, the Python
		community is still small enough for a project like
		this one to attract the attention of a significant
		proportion of it.
		</LI>

		</OL>
	</LI>
	</OL>

</LI>

<LI><EM>How will development be organized and coordinated? </EM>
<BR>
Everything the project produces---designs, critiques of those designs,
test suites, and examples, as well as actual source code---will be
available through the project's Web site at
software-carpentry.codesourcery.com. Each project will have a
coordinator, whose job it will be to moderate discussion, synchronize
releases, track work items, and report on progress. The coordinator
will also be responsible for collating and editing feedback from
judges during the design competition.
</LI>

</OL>

<H2>Design competition</H2>

<OL>

<LI><EM>Why a design competition?</EM>
<BR>
Most Open Source packages have their roots in someone's pet hobby
project, which others have picked up, extended, and modified. This
kind of organic growth has a lot of good features, but a
well-documented design is not one of them. As a result, programmers
often have to rely on folklore and reverse engineering if they want to
add to, or fix, these tools. In addition, there is a dearth of
examples of good design for new programmers to learn from. <BR> The
Software Carpentry project hopes to address both problems by running a
two-stage design competition. The best entries in both rounds will be
published, along with commentary from the competition's
judges. This material will serve both to inform and guide further
development, and to show novices what experienced programmers think
about before they start coding.
</LI>

<LI><EM>Who can enter? </EM>
<BR>
Everyone: individuals and teams, students and professionals, from
anywhere in the world.
</LI>

<LI><EM>What are the rules? </EM>
<BR>The full rules are available at:
<CENTER>
software-carpentry.codesourcery.com/design-competition/rules.html
</CENTER>
Basically, initial submissions must be written in English, and can be
up to 10 pages long. Examples count against this limit, but diagrams
and a Unix-style man page do not. Any person or team may submit only
one entry in any given category, but can submit in as many of the four
categories as desired.
<BR>
The best four entries in each category will be awarded US$2500, and
asked to submit full designs. Participants will be strongly encouraged
to pool their efforts for the second round. The best second-round
submission will be awarded an additional US$7500, while the others
will receive another US$2500 each. The real reward will be seeing the
design implemented, and being in a good position to bid on the
implementation work.
</LI>

<LI><EM>What should first-round submissions contain? </EM>
<BR>
An example of what a submission should contain, and how it should be
formatted, is available at:
<CENTER>
software-carpentry.codesourcery.com/design-competition/example.html
</CENTER>
First-round entries should focus primarily on what the tool will do,
and how it will be used: command-line options, input and output file
formats, sketches of Web and GUI interfaces (where appropriate), and
so on. Second-round submissions will then be expected to describe how
it's all going to be implemented.
</LI>

<LI><EM>Who will the judges be? </EM>
<BR>
<B>Need to firm up the list of judges ASAP.</B>
</LI>

<LI><EM>When are the deadlines? </EM>
<BR>
The deadline for first-round submissions is March 31, 2000. The four
best proposals in each category will be announced on April 30,
2000. Full submissions are due on June 1, 2000, and winners will be
announced on June 30, 2000.
</LI>

<LI><EM>Won't prizes discourage co-operation? </EM>
<BR>
We don't know. On the one hand, people might want to hoard their
best ideas; on the other hand, the best designs in both rounds are
going to be published, along with the judges' commentary, and we
will be encouraging participants to pool their efforts. Most of the
money that will be paid out will go to fund implementation, testing,
and documentation; we hope that people will collaborate in the early
stages, and treat the prizes as recognition for their effort, rather
than treating US$10,000 as their retirement fund.
</LI>

</OL>

<H2>Documentation</H2>

<OL>

<LI><EM>What documentation will be produced?</EM>
<BR>
The Software Carpentry project will produce several different kinds of
documentation:

	<OL>

	<LI><EM>Design documentation. </EM>
	<BR>
	As stated above, the best designs in each category will be
	published, along with the judges' commentary. This material
	ought to play the role that music criticism has played in the
	development of music, by giving newcomers (and experienced
	programmers) better insight into how good designers think.
	</LI>

	<LI><EM>User guides. </EM>
	<BR>
	The project will pay for the development of man pages, user
	guides, online help, and all the other documentation needed to
	turn a program into a product.
	</LI>

	<LI><EM>Test suites. </EM>
	<BR>
	The project will also pay for the development of
	industrial-strength test suites for all four tools. These
	suites will be published, both to serve as a starting point
	for other projects and to demonstrate good practice.
	</LI>

	<LI><EM>Case studies. </EM>
	<BR>
	It is often easier to show someone how to do something than to
	explain it to them. The Software Carpentry project will pay
	for case studies that describe how these tools, and (more
	importantly) the working practices they support, have been
	deployed in practice. Checklists, templates for forms, and
	other supporting materials can be submitted.
	</LI>

	</OL>

</LI>

<LI><EM>What format(s) will be used? </EM>
<BR>
The primary format for all documentation will be HTML. The project
will migrate to XML when and as feasible.
</LI>

<LI><EM>What restrictions are there on using the documentation?</EM>
<BR>
Only those that also apply to the software, under the terms of its
Open Source license. You can copy and distribute the documentation in
any form, but only if its author(s) and origin are clearly shown, and
if you include a description of how readers can access the
originals. In particular, the documentation can be reproduced in
books, but only if the authors, origin, and location of the originals
are printed clearly on each page.
</LI>

</OL>

</BODY>
</HTML>

From jack at oratrix.nl  Thu Dec 23 11:24:26 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Thu, 23 Dec 1999 11:24:26 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib 
 getopt.py,1.7,1.8
In-Reply-To: Message by Guido van Rossum <guido@CNRI.Reston.VA.US> ,
	     Wed, 22 Dec 1999 13:23:45 -0500 , <199912221823.NAA16517@eric.cnri.reston.va.us> 
Message-ID: <19991223102426.CCB75370CF2@snelboot.oratrix.nl>

> Vladimir.Marangozov at inrialpes.fr:
> 
> > Yes. Besides, I still think that string-based exceptions are just
> > convenient for quick & dirty, throw-away test scripts.
> 
> They have a hard-to-understand quirk though: the id() of the string is
> used to check rather than its value, so that except "foo" doesn't
> necessarily catch raise "foo"; but due to various optimizations, this
> usually works, and people get bent out of shape when it doesn't.

I sort-of use this feature when I'm debugging: if I want to know what happens 
in an exception that is usually caught somewhere higher up in the call stack I 
simply put quotes around the exception name and the exception will happen 
uncaught. The same trick works for except: clauses.
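
The identity-versus-value quirk described above can be modeled in a few
lines.  This is only a sketch: string exceptions no longer exist in
current Python, so the matching rule is imitated with an ordinary
function, and the names old_style_match, value_match, FOO, and built are
invented for the illustration.

```python
# Sketch only: models the old string-exception matching rule, which
# compared object identity rather than string value.

def old_style_match(raised, handler):
    """Mimic the old rule: a handler matches only the identical object."""
    return raised is handler

def value_match(raised, handler):
    """What people tended to expect: match on the string's value."""
    return raised == handler

FOO = "foo"                    # one shared object: the reliable idiom
built = "".join(["f", "oo"])   # equal value, but built at run time

print(old_style_match(FOO, FOO))   # same object, so it matches
print(value_match(built, FOO))     # equal values
print(old_style_match(built, FOO)) # a distinct object in CPython
```

Sharing a single module-level string object was the dependable way to
use string exceptions; putting quotes around the name, as in the
debugging trick above, creates a different object and so defeats the
match.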
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From harri.pasanen at trema.com  Thu Dec 23 12:44:04 1999
From: harri.pasanen at trema.com (Harri Pasanen)
Date: Thu, 23 Dec 1999 13:44:04 +0200
Subject: [Python-Dev] Re: [PSA MEMBERS] Please test new dynamic load behavior
References: <Pine.LNX.4.10.9912221200290.16305-100000@nebula.lyra.org>
Message-ID: <38620B04.7CC64485@trema.com>


Greg Stein wrote:
> 
> Hi all,
> 
> I reorganized Python's dynamic load/import code over the past few days.
> Guido provided some feedback, I did some more mods, and now it is checked
> into CVS. The new loading behavior has been tested on Linux, IRIX, and
> Solaris (and probably Windows by now).
> 

...


What was the motivation behind this modification?

Just curious,

-Harri


From marangoz at python.inrialpes.fr  Thu Dec 23 13:12:40 1999
From: marangoz at python.inrialpes.fr (Vladimir Marangozov)
Date: Thu, 23 Dec 1999 13:12:40 +0100 (CET)
Subject: [Python-Dev] Please test new dynamic load behavior
In-Reply-To: <Pine.LNX.4.10.9912221200290.16305-100000@nebula.lyra.org> from "Greg Stein" at Dec 22, 1999 12:11:43 PM
Message-ID: <199912231212.NAA26572@python.inrialpes.fr>

Greg Stein wrote:
> 
> Hi all,
> 
> I reorganized Python's dynamic load/import code over the past few days.
> Guido provided some feedback, I did some more mods, and now it is checked
> into CVS. The new loading behavior has been tested on Linux, IRIX, and
> Solaris (and probably Windows by now).
> 

Great work Greg!

> Here are some of the platforms that I believe need specific testing:
> 
> - NetBSD, FreeBSD, OpenBSD, ...
> - AIX
> - HP/UX
> - BeOS
> - NeXT
> - Mac
> - OS/2
> - Win16

AFAICT, the AIX version works perfectly okay.

-- 
       Vladimir MARANGOZOV          | Vladimir.Marangozov at inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252


From jim at digicool.com  Thu Dec 23 15:41:23 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 09:41:23 -0500
Subject: [Python-Dev] str(1L) -> '1' ?
Message-ID: <38623493.E6BA6D6F@digicool.com>

In November there was an interesting discussion on comp.lang.python 
about the meaning of __str__ and __repr__.  One tidbit that came out
of this discussion was that __str__ for longs should drop the trailing 
'L'. Was there a decision on this? I'd really like this to happen.

We do a lot of work with RDBMS systems and long integers seem to
come up a lot with these systems (as do other fixed-decimal numbers,
but that's another topic ;).  For example, our latest Sybase and
Oracle support in Zope returns long integers for RDBMS types
like NUMBER(10,0).  The trailing 'L' in the string representation
is causing us some headaches.  This seems also to be an issue when
using the current standard ODBC interface with Oracle, as indicated
in a DB-SIG post today.

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission.  Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.


From guido at CNRI.Reston.VA.US  Thu Dec 23 15:46:58 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 23 Dec 1999 09:46:58 -0500
Subject: [Python-Dev] str(1L) -> '1' ?
In-Reply-To: Your message of "Thu, 23 Dec 1999 09:41:23 EST."
             <38623493.E6BA6D6F@digicool.com> 
References: <38623493.E6BA6D6F@digicool.com> 
Message-ID: <199912231446.JAA22086@eric.cnri.reston.va.us>

[Jim F]
> In November there was an interesting discussion on comp.lang.python 
> about the meaning of __str__ and __repr__.  One tidbit that came out
> of this discussion was that __str__ for longs should drop the trailing 
> 'L'. Was there a decision on this? I'd really like this to happen.

Yes, I'd like it to happen.  I'd also like repr() of a float to return
the full precision (using the "%.17g" sprintf format).
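
A short illustration of why 17 significant digits matter for repr() of a
float: with that many digits the printed text converts back to exactly
the same IEEE double.  The 12-digit format used for comparison is an
assumption chosen for the example, not a statement about what any
particular release printed.

```python
# Sketch: "%.17g" is enough to round-trip any IEEE double through text;
# a shorter format can print a different (nearby) number.

y = 0.1 + 0.2            # not exactly 0.3 in binary floating point

short = "%.12g" % y      # shorter format, chosen for illustration
full = "%.17g" % y       # the format proposed above

print(short, float(short) == y)
print(full, float(full) == y)
```

The shorter text is easier on the eye but loses information, which is
the tension behind the str() versus repr() question raised below.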

I haven't done it for lack of time -- feel free to send a patch (don't
forget the disclaimer from http://www.python.org/1.5/bugrelease.html).

We haven't decided yet what to do with the greater topic of that
discussion (or was it a different one?) -- whether the values printed
by typing a bare expression in interactive mode should use str(),
repr(), or str-special-casing-the-snot-out-of-strings().

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim at digicool.com  Thu Dec 23 15:51:14 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 09:51:14 -0500
Subject: [Python-Dev] Fixed-decimal types
Message-ID: <386236E2.F97109D3@digicool.com>

While on the subject of RDBMS systems, a common need is to be able to
work with fixed-decimal data.  I think a standard Python fixed-decimal
type would help to make Python database interfaces a lot more robust.
I even wonder if the Python long type might be hijacked for this purpose
by adding a "scale" that indicates the number of digits to the right
of the decimal point.  For example, an expression like:

  1000000000.2500L

would create a fixed decimal number with a scale of 4.
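
To make the idea of "a long plus a scale" concrete, here is a minimal
sketch of how such a literal could be stored: an unbounded integer
holding all the digits, and a separate count of digits to the right of
the decimal point.  The function names parse_fixed and format_fixed are
hypothetical, and the trailing 'L' of the proposed notation is left out
of the text being parsed.

```python
# Sketch only: one possible representation for a scaled long.

def parse_fixed(text):
    """Split 'ddd.dddd' into (unscaled integer, scale)."""
    whole, _, frac = text.partition(".")
    scale = len(frac)               # digits right of the decimal point
    unscaled = int(whole + frac)    # all digits as one integer
    return unscaled, scale

def format_fixed(unscaled, scale):
    """Inverse of parse_fixed: render the pair back as text."""
    sign = "-" if unscaled < 0 else ""
    digits = str(abs(unscaled)).rjust(scale + 1, "0")
    if scale == 0:
        return sign + digits
    return sign + digits[:-scale] + "." + digits[-scale:]

print(parse_fixed("1000000000.2500"))
print(format_fixed(10000000002500, 4))
```

Because the digits live in an ordinary arbitrary-precision integer, no
binary rounding is involved; only the bookkeeping of the scale is new.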

People have built Python classes for fixed-decimal
types, but when working with RDBMS data, one often deals with
lots of data and efficiency matters.  I also suspect that adding
scale to longs wouldn't be that hard and would be a fairly natural
extension.

In any case, a "standard" (being in the standard library would
be sufficient) fixed-decimal type would probably lead to better
database interfaces that (at least more) properly handled 
fixed-decimal data.

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    



From guido at CNRI.Reston.VA.US  Thu Dec 23 15:56:33 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 23 Dec 1999 09:56:33 -0500
Subject: [Python-Dev] Fixed-decimal types
In-Reply-To: Your message of "Thu, 23 Dec 1999 09:51:14 EST."
             <386236E2.F97109D3@digicool.com> 
References: <386236E2.F97109D3@digicool.com> 
Message-ID: <199912231456.JAA22134@eric.cnri.reston.va.us>

What would be the scale of the product of two fixed-decimal numbers?
E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to 4.00L?  There are
arguments for either.  Same question for division (harder, I think).
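
For comparison, the decimal module that entered the standard library
later (in Python 2.4, well after this thread) gives one answer to the
multiplication question: the product carries the sum of the operands'
scales, and the caller rounds explicitly when a shorter result is
wanted.  It is shown only as a point of reference, not as the design
under discussion here.

```python
from decimal import Decimal

# Sketch: how the later standard-library Decimal type scales a product.
a = Decimal("2.00")
product = a * a                               # scales add: 2 + 2 = 4
rounded = product.quantize(Decimal("0.01"))   # explicit rounding to 2 places

print(product)   # 4.0000
print(rounded)   # 4.00
```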

I like the idea of using the dd.ddL notation for this.

I have no time to implement it but would not be unwilling to accept
patches.  They would have to be accompanied with a wet signature, see
http://www.python.org/1.5/wetsign.html.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim at digicool.com  Thu Dec 23 16:00:25 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 10:00:25 -0500
Subject: [Python-Dev] re: Open Source design competition / Python / software 
 tools
References: <Pine.LNX.4.10.9912222242070.4839-200000@akbar.nevex.com>
Message-ID: <38623909.CDF41014@digicool.com>

gvwilson at nevex.com wrote:
> 
> Hi, folks.  I hope you don't mind another mail out of the blue, but I got
> notice on Saturday that the Department of Energy is giving me $860K over
> two years to support development of easier-to-use software engineering
> tools.  All of the work will be Open Source, and will be done in Python,
> with a strong emphasis on design, testing, and documentation.  The
> project's long-term objective is to encourage scientists and engineers to
> treat programs in the same way as they do other experiments, i.e. to
> calibrate, test, peer review, and so on.
> 
> To kick-start things, we're going to be holding a two-round design
> competition.  Anyone (individual or team, professional or student) can
> submit a short entry for the first round; the judges will pick four
> candidates to go forward in each of four categories, and those
> individuals or teams will be asked to submit full entries. The four
> categories are:
> 
> * an issue tracking system to replace Gnats and Bugzilla;
> 
> * a build system to replace make;
> 
> * a platform inspection and configuration system to replace autoconf;
>   and
> 
> * a testing framework to replace XUnit, Expect, and DejaGnu.
> 
> Would you be interested in participating in any way

Are these categories fixed? I see a very strong need for an 
open-source UML modeling tool. UML is extremely powerful, but current
UML tools largely suck and are very expensive.  We are contemplating
launching an open-source development effort to build UML modeling tools
using Zope or the Zope object database as a repository. A contest
like this could help to kick-start this effort, but tools to automate
requirements and design seem to be missing. This is odd, considering that
up-front activities like requirements and design have the largest impact
on software-engineering project success.

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    



From captainrobbo at yahoo.com  Thu Dec 23 16:13:22 1999
From: captainrobbo at yahoo.com (=?iso-8859-1?q?Andy=20Robinson?=)
Date: Thu, 23 Dec 1999 07:13:22 -0800 (PST)
Subject: [Python-Dev] Fixed-decimal types
Message-ID: <19991223151322.5698.qmail@web604.mail.yahoo.com>

--- Guido van Rossum <guido at CNRI.Reston.VA.US> wrote:
> What would be scale of the product of two
> fixed-decimal numbers?
> E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to
> 4.00L?  There are
> arguments for either.  Same question for division
> (harder, I think).
Most commonly one is trying to avoid rounding errors
when dealing with money - a few cents rounding error
tends to result in a few billable hours with the
accountants at the end of the year!

SQL dialects and type-safe languages would make you
specify the precision of the variable to be assigned,
so the issue does not arise for other languages.  

For the work I do, simply taking the precision of the
most precise input (4.00L) would do the trick, but your
answer (4.0000L) is purer.  We should provide a
rounding function, and in practice anyone using such a
function would round (or floor, or ceiling) to get to
the desired precision immediately.

I'm not sure on division either but I'm sure there are
precedents to look at.

On the subject of adding new types to the standard
library, what are the plans on dates and times?  Would
a cut-down mxDateTime ever be considered?  It is fully
Open Source (unlike mxODBC) and was designed for the
DBAPI.

Regards,

Andy

=====
Andy Robinson
Robinson Analytics Ltd.
------------------
My opinions are the official policy of Robinson Analytics Ltd.
They just vary from day to day.


From guido at CNRI.Reston.VA.US  Thu Dec 23 16:23:43 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 23 Dec 1999 10:23:43 -0500
Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types)
In-Reply-To: Your message of "Thu, 23 Dec 1999 07:13:22 PST."
             <19991223151322.5698.qmail@web604.mail.yahoo.com> 
References: <19991223151322.5698.qmail@web604.mail.yahoo.com> 
Message-ID: <199912231523.KAA22232@eric.cnri.reston.va.us>

> On the subject of adding new types to the standard
> library, what are the plans on dates and times?  Would
> a cut-down mxDateTime ever be considered?  It is fully
> Open Source (unlike mxODBC) and was designed for the
> DBAPI.

I don't know much about date/time types, or about mxDateTime.
My intuition is that there are too many ways to do it, and that being
compatible with commercial databases may not be the right way to do it
for core Python.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake at acm.org  Thu Dec 23 16:27:59 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 23 Dec 1999 10:27:59 -0500 (EST)
Subject: [Python-Dev] str(1L) -> '1' ?
In-Reply-To: <38623493.E6BA6D6F@digicool.com>
References: <38623493.E6BA6D6F@digicool.com>
Message-ID: <14434.16255.58344.646524@weyr.cnri.reston.va.us>

Jim Fulton writes:
 > In November there was an interesting discussion on comp.lang.python 
 > about the meaning of __str__ and __repr__.  One tidbit that came out
 > of this discussion was that __str__ for longs should drop the trailing 
 > 'L'. Was there a decision on this? I'd really like this to happen.

  I liked that result as well, and thought about it just the other
day.  Luckily, you sent a note this morning and made me think about
it again.  I'll have something checked into CVS shortly.  ;)


  -Fred

--
Fred L. Drake, Jr.	  <fdrake at acm.org>
Corporation for National Research Initiatives


From Mike.Da.Silva at uk.fid-intl.com  Thu Dec 23 17:30:07 1999
From: Mike.Da.Silva at uk.fid-intl.com (Da Silva, Mike)
Date: Thu, 23 Dec 1999 16:30:07 -0000
Subject: [Python-Dev] Fixed Decimal types
Message-ID: <DBF3B37F7BF1D111B2A10000F6B14B1FDDAF86@ukhil704nts.hld.uk.fid-intl.com>

	Andy Robinson wrote:
		For the work I do, simply taking the precision of the
		most precise input (4.00L)would do the trick, but your
		answer (4.0000L) is purer.  We should provide a
		rounding function, and in practice anyone using such a
		function would round (or floor, or ceiling) to get to
		the desired precision immediately.

		I'm not sure on division either but I'm sure there are
		precedents to look at.

	The AS400 provides a useful example of the right way to do scaled
decimals.

	In the RPG programming language, all internal calculations (i.e.
multiplication, division) are performed to the maximum precision of the
intermediate result; in the multiplication example above, the intermediate
result would be 4.0000L.  When the intermediate result is assigned to the
target scaled decimal number, the decimal precision is automatically
extended or truncated to fit the target precision.  One extra wrinkle in all
of this is the option to "half-adjust" the intermediate value on assignment;
that is to apply automatic 5/4 rounding to the precision of the target.

	So, if the target field is defined as numeric(4,2), the result will
be 4.00L.

	These are probably the kind of semantics that a scaled decimal type
would require in Python also; i.e. allow unlimited precision in intermediate
calculations, with a sensible set of rules for assignment to a variable of
different scale and precision.

	However, unlike RPG, we should probably ensure that attempts to
overflow or underflow the scale result in NaN or Overflow conditions, rather
than assuming the user is right and losing the significant digits.
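
The assignment semantics described above can be sketched with the
later standard-library decimal module: keep the full-precision
intermediate result, then fit it to a numeric(precision, scale) target
by truncating or by "half-adjusting", and refuse values whose integer
part does not fit.  The function name assign and the choice of
OverflowError are assumptions made for the illustration.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

def assign(value, precision, scale, half_adjust=False):
    """Fit value into a numeric(precision, scale) target.

    Truncates by default; rounds halves up when half_adjust is true.
    Raises OverflowError rather than silently dropping leading digits.
    """
    rounding = ROUND_HALF_UP if half_adjust else ROUND_DOWN
    quantum = Decimal(1).scaleb(-scale)        # scale=2 -> Decimal('0.01')
    fitted = value.quantize(quantum, rounding=rounding)
    limit = Decimal(10) ** (precision - scale) # first value that cannot fit
    if abs(fitted) >= limit:
        raise OverflowError("%s does not fit numeric(%d,%d)"
                            % (value, precision, scale))
    return fitted

print(assign(Decimal("2.00") * Decimal("2.00"), 4, 2))   # 4.00
print(assign(Decimal("1.005"), 4, 2))                    # 1.00 (truncated)
print(assign(Decimal("1.005"), 4, 2, half_adjust=True))  # 1.01
```

With these rules, assign(Decimal("123.45"), 4, 2) raises OverflowError
instead of storing 23.45.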

	Regards,
	Mike da Silva


From jim at digicool.com  Thu Dec 23 17:37:10 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 11:37:10 -0500
Subject: [Python-Dev] Fixed-decimal types
References: <386236E2.F97109D3@digicool.com> <199912231456.JAA22134@eric.cnri.reston.va.us>
Message-ID: <38624FB6.ED903F@digicool.com>

Guido van Rossum wrote:
> 
> What would be scale of the product of two fixed-decimal numbers?
> E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to 4.00L?  There are
> arguments for either.  Same question for division (harder, I think).

I'd be inclined to start by doing some research to see if some standard
(SQL?) defines this somewhere.  It would be nice if someone has already 
done the requirements work for us. :)

> I like the idea of using the dd.ddL notation for this.
> 
> I have no time to implement

Me neither.

> it but would not be unwilling to accept patches. 

Cool.  If no one else volunteers, then I'll try to find a way
to get this done (not necessarily by me). I think it is pretty
important.

> They would have to be accompanied with a wet signature, see
> http://www.python.org/1.5/wetsign.html.

Yup.

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    



From captainrobbo at yahoo.com  Thu Dec 23 17:38:50 1999
From: captainrobbo at yahoo.com (=?iso-8859-1?q?Andy=20Robinson?=)
Date: Thu, 23 Dec 1999 08:38:50 -0800 (PST)
Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types)
Message-ID: <19991223163850.15619.qmail@web604.mail.yahoo.com>

Sorry, should have replied to the list...

--- Andy Robinson <captainrobbo at yahoo.com> wrote:
> Date: Thu, 23 Dec 1999 08:37:18 -0800 (PST)
> From: Andy Robinson <captainrobbo at yahoo.com>
> Reply-to: andy at robanal.demon.co.uk
> Subject: Re: [Python-Dev] Date and timetypes (was:
> Fixed-decimal types)
> To: Guido van Rossum <guido at CNRI.Reston.VA.US>
> 
> --- Guido van Rossum <guido at CNRI.Reston.VA.US> wrote:
> > I don't know much about date/time types, or about mxDateTime.
> > My intuition is that there are too many ways to do it, and that being
> > compatible with commercial databases may not be the right way to do it
> > for core Python.
> > 
> 
> OK.  Let me rephrase it.  Say we form a consensus on
> 'the right way'.  Are you amenable to some solution
> which goes back before 1970 and after 2038 going into
> the standard library?
> 
> And does your answer change if it involves some
> compiled code as well?
> 
> I mention mxDateTime because it was agreed by a Python
> SIG, is mature and stable, and I find it very useful.
> And the core type is pretty small - much of the helper
> stuff in the package now could be kept separate from
> the main Python distribution.
> 
> - Andy
> 
> 
> =====
> Andy Robinson
> Robinson Analytics Ltd.
> ------------------
> My opinions are the official policy of Robinson
> Analytics Ltd.
> They just vary from day to day.
> 


=====
Andy Robinson
Robinson Analytics Ltd.
------------------
My opinions are the official policy of Robinson Analytics Ltd.
They just vary from day to day.


From guido at CNRI.Reston.VA.US  Thu Dec 23 17:42:33 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 23 Dec 1999 11:42:33 -0500
Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types)
In-Reply-To: Your message of "Thu, 23 Dec 1999 08:38:50 PST."
             <19991223163850.15619.qmail@web604.mail.yahoo.com> 
References: <19991223163850.15619.qmail@web604.mail.yahoo.com> 
Message-ID: <199912231642.LAA22598@eric.cnri.reston.va.us>

> > OK.  Let me rephrase it.  Say we form a consensus on 'the right
> > way'.  Are you amenable to some solution which goes back before
> > 1970 and after 2038 going into the standard library?

No problem.

> > And does your answer change if it involves some
> > compiled code as well?

I'd rather not.
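
The 1970 and 2038 limits mentioned above come from counting seconds
since the Unix epoch in a signed 32-bit integer.  The sketch below works
the arithmetic out using the datetime module, which was added to the
standard library later (Python 2.3) and is not limited to that range; it
is used here purely as a calculator.

```python
from datetime import datetime, timedelta

# Sketch: where a signed 32-bit count of seconds since 1970 runs out.
EPOCH = datetime(1970, 1, 1)
last_32bit = EPOCH + timedelta(seconds=2**31 - 1)

print(last_32bit)                    # 2038-01-19 03:14:07

# A type not tied to a 32-bit counter can represent dates on both sides.
print(datetime(1900, 1, 1) < EPOCH < datetime(2100, 1, 1))
```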

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip at mojam.com  Thu Dec 23 18:05:52 1999
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 23 Dec 1999 11:05:52 -0600 (CST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: <199912212305.SAA13722@eric.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
	<14432.594.33416.600794@weyr.cnri.reston.va.us>
	<199912212305.SAA13722@eric.cnri.reston.va.us>
Message-ID: <14434.22128.639699.738932@dolphin.mojam.com>

    Guido> (The next step would be to outlaw raise with a string argument; I
    Guido> think I can't make that for 1.6.  But it would be a good idea to
    Guido> scan the standard library for string exceptions and convert all
    Guido> of them.)

Agreed.  I know Zope uses (at least, my Zope-using code uses) stuff like 

    raise 'Redirect', url

to map names onto HTTP response codes.  Makes it easier on people to
remember names instead of numeric codes.  I suspect it will take the Zopers
awhile to convert to using class-based exceptions if they haven't already.
(For all I know I may be using a deprecated feature.)
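
A class-based equivalent of the string exception above might look like
the following sketch, written in current Python syntax.  The class name
Redirect mirrors the string used above, but the status attribute, its
value of 302, and the handle helper are assumptions for illustration and
are not taken from Zope.

```python
# Sketch only: a class-based replacement for  raise 'Redirect', url

class Redirect(Exception):
    """Carries the target URL; the name still reads like the old string."""
    status = 302          # assumed HTTP status for the illustration

    def __init__(self, url):
        super().__init__(url)
        self.url = url

def handle(request_fn):
    """Run a request function and map a Redirect onto (status, url)."""
    try:
        request_fn()
    except Redirect as exc:
        return exc.status, exc.url
    return 200, None

def moved():
    raise Redirect("http://www.python.org/")

print(handle(moved))
```

Matching is by class rather than by object identity, so the handler
catches the exception wherever it was raised, and the name-to-code
mapping lives on the class instead of in a lookup keyed by string.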

Skip


From gvwilson at nevex.com  Thu Dec 23 18:24:05 1999
From: gvwilson at nevex.com (gvwilson at nevex.com)
Date: Thu, 23 Dec 1999 12:24:05 -0500 (EST)
Subject: [Python-Dev] re: Open Source design competition / Python /
 software  tools
In-Reply-To: <38623909.CDF41014@digicool.com>
Message-ID: <Pine.LNX.4.10.9912231219380.12516-100000@akbar.nevex.com>

Hi, everyone.  I'm sending my reply to Jim's message to the whole
python-dev list; I'll send follow-ups to individuals if people would
prefer.

> > * an issue tracking system to replace Gnats and Bugzilla;
> > 
> > * a build system to replace make;
> > 
> > * a platform inspection and configuration system to replace autoconf;
> >   and
> > 
> > * a testing framework to replace XUnit, Expect, and DejaGnu.

> Jim Fulton asked:
> Are these categories fixed?

For the first round, yes --- I have to prove that this model can solve
small problems before I'll be given the funding to tackle larger ones, and
I think that a UML modeling tool is definitely "large" :-).  I also have
to demonstrate uptake, and I think more people will adopt a sane
replacement for Autoconf in the next 18 months than would adopt a UML
modeler.  However, decent Open Source CASE tools are very (very) high on
my personal list --- if this works, I'd like to tackle them (along with
providing support for DDD, and a few other things like that).

Greg


From gstein at lyra.org  Thu Dec 23 19:26:44 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 23 Dec 1999 10:26:44 -0800 (PST)
Subject: [Python-Dev] Re: Please test new dynamic load behavior
In-Reply-To: <38620B04.7CC64485@trema.com>
Message-ID: <Pine.LNX.4.10.9912231022280.16305-100000@nebula.lyra.org>

On Thu, 23 Dec 1999, Harri Pasanen wrote:
> Greg Stein wrote:
> > Hi all,
> > 
> > I reorganized Python's dynamic load/import code over the past few days.
> > Guido provided some feedback, I did some more mods, and now it is checked
> > into CVS. The new loading behavior has been tested on Linux, IRIX, and
> > Solaris (and probably Windows by now).
> 
> ...
> 
> What was the motivation behind this modification?

Harri -

With the new code structure, it is much easier to maintain Python's
loading code.

Each platform has its own file (e.g. dynload_aix.c) rather than being all
jammed together into importdl.c. This isn't a huge win by itself, but does
increase readability/maintainability. The big improvement, however, is
when you are adding support for new platforms or loading mechanisms. A new
dynload_*.c can be written and one line added to configure.in, and you're
done. No need to make importdl.c even uglier.  (actually, importdl.c no
longer contains *any* platform specific code; it has all been moved to the
dynload_*.c files)
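
The shape of that reorganization can be sketched in Python as a run-time
analogue.  In the real C code the choice of dynload_*.c file is made at
build time by configure; the registry, decorator, and loader functions
below are invented names that only illustrate the "one small unit per
platform, one generic caller" structure.

```python
# Sketch only: per-platform loaders behind one generic entry point.

LOADERS = {}   # platform name -> loader function

def register(platform):
    """Decorator: adding a platform means adding one registered function."""
    def decorator(fn):
        LOADERS[platform] = fn
        return fn
    return decorator

@register("aix")
def load_aix(path):
    return ("aix loader", path)       # stand-in for platform-specific work

@register("beos")
def load_beos(path):
    return ("beos loader", path)

def load_dynamic(platform, path):
    """Generic code: contains no platform-specific branches."""
    try:
        loader = LOADERS[platform]
    except KeyError:
        raise ImportError("no dynamic loader for %r" % platform)
    return loader(path)

print(load_dynamic("aix", "spam.so"))
```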

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From jim at digicool.com  Thu Dec 23 20:39:37 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 14:39:37 -0500
Subject: [Python-Dev] Fixed-decimal types
References: <386236E2.F97109D3@digicool.com> <199912231456.JAA22134@eric.cnri.reston.va.us> <38624FB6.ED903F@digicool.com>
Message-ID: <38627A79.BF379672@digicool.com>

Jim Fulton wrote:
> 
> Guido van Rossum wrote:
> >
> > What would be scale of the product of two fixed-decimal numbers?
> > E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to 4.00L?  There are
> > arguments for either.  Same question for division (harder, I think).
> 
> I'd be inclined to start by doing some research to see if some standard
> (SQL?) defines this somewhere.  It would be nice if someone has already
> done the requirements work for us. :)

Here is what the book "SQL-99 Complete, Really" says that the SQL
standard says:

  - for addition and subtraction of two "exact" (fixed-decimal)
    numbers, the result has the maximum of the scales.

  - for multiplication of two "exact" (fixed-decimal)
    numbers, the result has the sum of the scales.

  - punts on division

  - for addition, subtraction, multiplication or division
    between "exact" (fixed point) and "approximate" (floating point)
    yields an approximate result.  This means that fixed-decimal
    coerces to float.
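
The rules listed above are simple enough to sketch directly.  The class
below is a hypothetical illustration (the name Fixed and its attributes
are not from any existing library): addition and subtraction use the
maximum of the two scales, multiplication uses their sum, and conversion
to float is explicit.  Division is left out, as in the summary above.

```python
# Sketch only: the scale rules for "exact" numbers as summarized above.

class Fixed:
    def __init__(self, unscaled, scale):
        self.unscaled = unscaled   # all digits as an integer
        self.scale = scale         # digits right of the decimal point

    def _rescaled(self, scale):
        """Unscaled value re-expressed at a larger-or-equal scale."""
        return self.unscaled * 10 ** (scale - self.scale)

    def __add__(self, other):
        scale = max(self.scale, other.scale)
        return Fixed(self._rescaled(scale) + other._rescaled(scale), scale)

    def __sub__(self, other):
        scale = max(self.scale, other.scale)
        return Fixed(self._rescaled(scale) - other._rescaled(scale), scale)

    def __mul__(self, other):
        return Fixed(self.unscaled * other.unscaled, self.scale + other.scale)

    def __float__(self):
        return self.unscaled / 10 ** self.scale

    def __repr__(self):
        return "Fixed(%d, %d)" % (self.unscaled, self.scale)

two = Fixed(200, 2)                       # 2.00
print(two * two)                          # Fixed(40000, 4), i.e. 4.0000
print(Fixed(150, 2) + Fixed(25, 1))       # Fixed(400, 2),   i.e. 4.00
print(float(Fixed(150, 2)))               # 1.5
```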

I'm curious to see who else chips in with examples from other systems.

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    



From jim at digicool.com  Thu Dec 23 20:43:41 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 14:43:41 -0500
Subject: [Python-Dev] Fixed Decimal types
References: <DBF3B37F7BF1D111B2A10000F6B14B1FDDAF86@ukhil704nts.hld.uk.fid-intl.com>
Message-ID: <38627B6D.447A9553@digicool.com>

"Da Silva, Mike" wrote:
> 
>         Andy Robinson wrote:
>                 For the work I do, simply taking the precision of the
>                 most precise input (4.00L)would do the trick, but your
>                 answer (4.0000L) is purer.  We should provide a
>                 rounding function, and in practice anyone using such a
>                 function would round (or floor, or ceiling) to get to
>                 the desired precision immediately.
> 
>                 I'm not sure on division either but I'm sure there are
>                 precedents to look at.
> 
>         The AS400 provides a useful example of the right way to do scaled
> decimals.
> 
>         In the RPG programming language, all internal calculations (i.e.
> multiplication, division) are performed to the maximum precision of the
> intermediate result (in the multiplication example below, the intermediate
> result would be 4.0000L).  When the intermediate result is assigned to the
> target scaled decimal number, the decimal precision is automatically
> extended or truncated to fit the target precision.  One extra wrinkle in all
> of this is the option to "half-adjust" the intermediate value on assignment;
> that is to apply automatic 5/4 rounding to the precision of the target.

Yee ha! This is great input. Anyone have any other examples of what
any other systems do? Anyone got a PL/I manual handy? ;)

>         So, if the target field is defined as numeric(4,2), the result will
> be 4.00L.

Since Python doesn't have typed variables, this is not an issue
internally, but would be an issue when binding to external databases.
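
The RPG-style assignment step described above (compute at full
precision, then truncate or "half-adjust" to the target scale) might
look roughly like this with the later decimal module. The function
assign() and its half_adjust flag are hypothetical names, the
2.00 * 2.00 operands are an assumption about the multiplication
example, and only the scale of numeric(4,2) is modelled, not its total
precision.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

def assign(value, scale, half_adjust=False):
    """Fit value to `scale` decimals, truncating unless half_adjust is set."""
    target = Decimal(1).scaleb(-scale)      # scale=2 gives Decimal("0.01")
    rounding = ROUND_HALF_UP if half_adjust else ROUND_DOWN
    return value.quantize(target, rounding=rounding)

intermediate = Decimal("2.00") * Decimal("2.00")   # kept at full precision
print(intermediate)
print(assign(intermediate, 2))

price = Decimal("1.005") * Decimal("3")            # 3.015 before assignment
print(assign(price, 2))                            # truncated
print(assign(price, 2, half_adjust=True))          # 5/4 rounded
```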

>         These are probably the kind of semantics that a scaled decimal type
> would require in Python also; i.e. allow unlimited precision in intermediate
> calculations, with a sensible set of rules for assignment to a variable of
> different scale and precision.
> 
>         However, unlike RPG, we should probably ensure that attempts to
> overflow or underflow the scale result in NaN or Overflow conditions, rather
> than assuming the user is right and losing the significant digits.

Since this would be based on infinite-precision numbers, I don't
think that this would be an issue.

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    


From guido at CNRI.Reston.VA.US  Thu Dec 23 20:44:36 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 23 Dec 1999 14:44:36 -0500
Subject: [Python-Dev] Fixed-decimal types
In-Reply-To: Your message of "Thu, 23 Dec 1999 14:39:37 EST."
             <38627A79.BF379672@digicool.com> 
References: <386236E2.F97109D3@digicool.com> <199912231456.JAA22134@eric.cnri.reston.va.us> <38624FB6.ED903F@digicool.com>  
            <38627A79.BF379672@digicool.com> 
Message-ID: <199912231944.OAA23337@eric.cnri.reston.va.us>

Jim Fulton wrote:

>   - for addition and subtraction of two "exact" (fixed-decimal)
>     numbers, the result has the maximum of the scales.

One could argue that this is incorrect: if "3.1" means that I know the
value to one decimal of precision, and "2.01" means that I know that
value to two decimals of precision, stating the result of their sum as
"5.11" suggests that I know the result to two decimals of precision,
which is of course false: because I only knew one decimal of precision
for one of the operands, I only know (at most!) one decimal of
precision for the result.

Not arguing for this interpretation, just indicating that doing fixed
precision arithmetic right is hard.  I'm waiting for Tim Peters'
contribution, but he's on vacation so it may be a while.
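
For comparison, the "known digits" reading sketched above can be
written down next to the SQL-style rule. This uses the later decimal
module purely as a calculator; add_known_digits() is a made-up name and
is not something proposed in the thread.

```python
from decimal import Decimal

def add_known_digits(a, b):
    """Add, then keep only as many decimals as the less precise operand."""
    exponent = max(a.as_tuple().exponent, b.as_tuple().exponent)
    return (a + b).quantize(Decimal(1).scaleb(exponent))

sql_style = Decimal("3.1") + Decimal("2.01")               # max of the scales
cautious = add_known_digits(Decimal("3.1"), Decimal("2.01"))  # min of the scales

print(sql_style)
print(cautious)
```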

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm at hypernet.com  Thu Dec 23 21:48:56 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Thu, 23 Dec 1999 15:48:56 -0500
Subject: [Python-Dev] Fixed Decimal types
In-Reply-To: <38627B6D.447A9553@digicool.com>
Message-ID: <1266141247-31971518@hypernet.com>

Jim Fulton wrote:
> "Da Silva, Mike" wrote:

[AS400 RPG rules...]

> Yee ha! This is great input. Anyone have any other examples of
> what any other systems do? Anyone got a PL/I manual handy? ;)


From jim at digicool.com  Thu Dec 23 23:18:37 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 17:18:37 -0500
Subject: [Python-Dev] re: Open Source design competition / Python /software  
 tools
References: <Pine.LNX.4.10.9912231219380.12516-100000@akbar.nevex.com>
Message-ID: <38629FBD.3B8F47D4@digicool.com>

gvwilson at nevex.com wrote:
> 
> Hi, everyone.  I'm sending my reply to Jim's message to the whole
> python-dev list; I'll send follow-ups to individuals if people would
> prefer.
> 
> > > * an issue tracking system to replace Gnats and Bugzilla;
> > >
> > > * a build system to replace make;
> > >
> > > * a platform inspection and configuration system to replace autoconf;
> > >   and
> > >
> > > * a testing framework to replace XUnit, Expect, and DejaGnu.
> 
> > Jim Fulton asked:
> > Are these categories fixed?
> 
> For the first round, yes 

OK.

>--- I have to prove that this model can solve
> small problems before I'll be given the funding to tackle larger ones, and
> I think that a UML modeling tool is definitely "large" :-). 

Well, since you gave rational ..... :)

<speech>
Isn't the Open Source community especially good at large problems?
Note that I'm thinking more in terms of an open source UML community
of tools, based around an existing repository rather than on a single 
monolithic tool.  I envision a community of diagramming and other small
tools orbiting Zope or ZODB. The hardest part of a UML tool is the
repository, and I think we've mostly got that.

I think that what the Open Source community desperately needs 
are tools for managing and sharing the most important artifacts
in the development process.
</speech>

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    


From gstein at lyra.org  Fri Dec 24 01:09:29 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 23 Dec 1999 16:09:29 -0800 (PST)
Subject: [Python-Dev] re: Open Source design competition / Python /software
   tools
In-Reply-To: <38629FBD.3B8F47D4@digicool.com>
Message-ID: <Pine.LNX.4.10.9912231605030.412-100000@nebula.lyra.org>

On Thu, 23 Dec 1999, Jim Fulton wrote:
> gvwilson at nevex.com wrote:
>...
> >--- I have to prove that this model can solve
> > small problems before I'll be given the funding to tackle larger ones, and
> > I think that a UML modeling tool is definitely "large" :-). 
> 
> Well, since you gave rational ..... :)
> 
> <speech>
> Isn't the Open Source community especially good at large problems?

Very true, I agree, but part of Greg's problem is "proving" that to the
DoE. Somebody has said those four problems are sufficient to do so,
(probably) because they are constrained enough to allow completion
within a specified timeframe.

> Note that I'm thinking more in terms of an open source UML community
> of tools, based around an existing repository rather than on a single 
> monolithic tool.  I envision a community of diagramming and other small
> tools orbiting Zope or ZODB. The hardest part of a UML tool is the
> repository, and I think we've mostly got that.

Greg's proposal is quite specific. "A community" isn't, so it might not
help to create a proof to the DoE (otherwise, they could look at the Zope
community, or other communities!).

Jim: there isn't anything stopping or impeding the creation of an Open
Source community for UML modeling. This DoE competition won't affect
that...

Happy Holidays,
-g

-- 
Greg Stein, http://www.lyra.org/


From jim at digicool.com  Fri Dec 24 01:27:53 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 19:27:53 -0500
Subject: [Python-Dev] re: Open Source design competition / Python 
 /softwaretools
References: <Pine.LNX.4.10.9912231605030.412-100000@nebula.lyra.org>
Message-ID: <3862BE09.9AF62090@digicool.com>

Greg Stein wrote:
> 
(snip)
> Jim: there isn't anything stopping or impeding the creation of an Open
> Source community for UML modeling.

Of course not.

> This DoE competition won't affect that...

Perhaps it could help it.
 
> Happy Holidays,

You too.

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    


From ping at lfw.org  Fri Dec 24 09:55:28 1999
From: ping at lfw.org (Ka-Ping Yee)
Date: Fri, 24 Dec 1999 00:55:28 -0800 (PST)
Subject: [Python-Dev] re: Open Source design competition / Python /
 software tools
In-Reply-To: <Pine.LNX.4.10.9912222242070.4839-200000@akbar.nevex.com>
Message-ID: <Pine.LNX.4.10.9912240049360.655-100000@skuld.lfw.org>

On Wed, 22 Dec 1999 gvwilson at nevex.com wrote:
> To kick-start things, we're going to be holding a two-round design
> competition.  Anyone (individual or team, professional or student) can
> submit a short entry for the first round; the judges will pick four
> candidates to go forward in each of four categories, and those
> individuals or teams will be asked to submit full entries. The four
> categories are:
> 
> * an issue tracking system to replace Gnats and Bugzilla;

Hi there.

At ILM we've been using a system that i hacked up quickly in Python
called "Roundup".  It has a number of interesting properties that
have made it really useful to us, and arguably better than any of
the existing open-source bug-tracking things out there that i know
of.  It is not just a Web app; it lives between the Web and e-mail,
because we do so much of our communication that way.

For example, each request item gets its own virtual mailing list,
updated on the fly without the need for explicit subscription (if
you cc: somebody while discussing the bug, they get subscribed).
Empirically i've discovered that unsubscription is actually
unnecessary (!) because conversation will stop on a topic when it
gets resolved or when it ceases to be interesting.  These are
fine-grained discussion lists on a per-topic level.
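
As a rough model of the implicit per-item mailing list described above
(this is not Roundup's code; the class, method and names are all
hypothetical, and subscribing the sender is an assumption):

```python
class Issue:
    """One tracked item with its own implicit mailing list."""

    def __init__(self, title):
        self.title = title
        self.subscribers = set()
        self.messages = []

    def post(self, sender, body, cc=()):
        """Record a message; the sender and anyone cc'd become subscribers."""
        self.subscribers.add(sender)
        self.subscribers.update(cc)
        self.messages.append((sender, body))
        # Everyone on the list except the sender gets a copy.
        return sorted(self.subscribers - {sender})

issue = Issue("queue stalls overnight")
first = issue.post("ping", "Seen on three nodes.", cc=["alice"])
second = issue.post("alice", "Restart fixed mine.", cc=["bob"])
print(first)
print(second)
```

There is deliberately no unsubscribe method, matching the observation
that traffic on an item simply stops once it is resolved.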

This is just to let you know i'm interested.  I'm currently asking
for permission to open-source Roundup; if it can't be done, or
doesn't happen quickly enough, i'll just have to take a weekend and
rewrite the thing.  There were a few things i wanted to fix anyway.


-- ?!ng

"You should either succeed gloriously or fail miserably.  Just getting
by is the worst thing you can do."
    -- Larry Smith


From marangoz at python.inrialpes.fr  Fri Dec 24 13:07:05 1999
From: marangoz at python.inrialpes.fr (Vladimir Marangozov)
Date: Fri, 24 Dec 1999 13:07:05 +0100 (CET)
Subject: [Python-Dev] Exceptions
In-Reply-To: <199912221823.NAA16517@eric.cnri.reston.va.us> from "Guido van Rossum" at Dec 22, 1999 01:23:45 PM
Message-ID: <199912241207.NAA18783@python.inrialpes.fr>

Guido van Rossum wrote:
> 
> Vladimir.Marangozov at inrialpes.fr:
> 
> > Yes. Besides, I still think that string-based exceptions are just
> > convenient for quick & dirty, throw-away test scripts.
> 
> They have a hard-to-understand quirk though: the id() of the string is
> used to check rather than its value, so that except "foo" doesn't
> necessarily catch raise "foo"; but due to various optimizations, this
> usually works, and people get bent out of shape when it doesn't.

Which brings 2 important questions:

1. In the long run, which one is better -- compare and check exceptions by
   reference (by name) or by value?

   (currently, this is done by reference on predefined object types:
    strings, classes or instances)

   I'd say, exceptions have to be compared (caught) by value, i.e. use
   "e1 == e2" instead of "e1 is e2".

2. Should we limit the exception "types"?

   I'd say, no. My Pythonic view of things says that we raise "objects",
   be they classes, instances, strings or, why not, ints.

   However, if one wants to put some order in the "unordered set" of exceptions
   s/he uses, then classes is the way to do it, because classes were given some
   nice properties, like inheritance, that allow us to group and logically organize
   the objects we throw and catch as exceptions (+ other bonus properties coming
   from classes).

   Note that conceptually, when we say "strings and ints", we have in mind
   "string instances and int instances", whose "classes" are written in C.
   When there will be String and Int classes of some sort as first class objects,
   then we'll fall back to the terminology: Exceptions can be classes or instances.

If point 1 and (optionally) point 2 are implemented, the hard-to-understand quirk
wouldn't be an issue and string-based exceptions would have a legal reason to stay
and live.
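
The two questions above can be made concrete with a small sketch. The
two match functions only model the comparison being discussed (later
Python versions no longer allow raising strings at all), and the
exception classes are hypothetical names chosen for illustration.

```python
def matches_by_identity(raised, handler):
    """The id()-based check: only the very same object matches."""
    return raised is handler

def matches_by_value(raised, handler):
    """The proposed check: any equal object matches."""
    return raised == handler

name = "foo"
rebuilt = "".join(["f", "oo"])      # equal text, built at run time

print(matches_by_identity(name, name))
print(matches_by_value(name, rebuilt))

class AppError(Exception):
    """Root of a hypothetical application hierarchy."""

class DatabaseError(AppError):
    pass

class ServiceError(AppError):
    pass

def classify(exc):
    """A handler for a base class also catches its subclasses."""
    try:
        raise exc
    except DatabaseError:
        return "database"
    except AppError:
        return "application"

print(classify(DatabaseError("database 4")))
print(classify(ServiceError("service 0")))
```

Whether two equal strings also happen to be the same object depends on
the interpreter's optimizations, which is exactly the quirk under
discussion, so the sketch makes no claim about that case.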

> Since you have to give your exception a name, how hard is it to say
> 
> class MyError(Exception): pass
> 
> rather than
> 
> MyError = "MyError"
> 
> ?

You know what I think about "names"...  I may have defined my exception conventions
and be interested in catching an exception named 404, implying that "a 404 bobo"
occured deeply in my code ("deeply in my code" meaning for example: database 4,
service 0, customer group 4, or just a standard HTTP "Code 404 - Not Found".)

I'm pushing this to the extreme to catapult your thoughts into the next millennium :)
and to emphasize the importance of discussing and answering objectively the above
questions 1) and 2).
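
One way to get the "exception named 404" effect with class-based
exceptions is to carry the number as data and compare it by value in
the handler. This is only a sketch of that idea, not something proposed
in the thread; CodedError, handle() and lookup() are hypothetical names.

```python
class CodedError(Exception):
    """Exception identified by a numeric code."""

    def __init__(self, code, detail=""):
        super().__init__(code, detail)
        self.code = code
        self.detail = detail

def handle(action):
    """Run action; translate a 404 and let every other code propagate."""
    try:
        return action()
    except CodedError as err:
        if err.code == 404:      # compared by value, not by object identity
            return "not found"
        raise

def lookup():
    raise CodedError(404, "no such customer group")

print(handle(lookup))
```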

-- 
       Vladimir MARANGOZOV          | Vladimir.Marangozov at inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252


From mal at lemburg.com  Fri Dec 24 12:03:37 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 24 Dec 1999 12:03:37 +0100
Subject: [Python-Dev] str(1L) -> '1' ?
References: <38623493.E6BA6D6F@digicool.com> <199912231446.JAA22086@eric.cnri.reston.va.us>
Message-ID: <38635309.2AEFF18D@lemburg.com>

Guido van Rossum wrote:
> 
> [Jim F]
> > In November there was an interesting discussion on comp.lang.python
> > about the meaning of __str__ and __repr__.  One tidbit that came out
> > of this discussion was that __str__ for longs should drop the trailing
> > 'L'. Was there a decision on this? I'd really like this to happen.
> 
> Yes, I'd like it to happen.  I'd also like repr() of a float to return
> the full precision (using the "%.17g" sprintf format).

While we're at it: how about adding a PyLong_AsString() API
to the C interface ? I currently use PyObject_Str() in mxODBC
and then slice off the 'L' -- not very elegant. A PyLong_AsString()
API would much better suit the task.
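
On the "%.17g" remark quoted above: 17 significant digits are enough
for an IEEE double to survive a round trip through text, at the cost of
exposing binary rounding noise. The 12-digit format below is just a
shorter format picked for contrast.

```python
x = 0.1

short = "%.12g" % x      # fewer digits hide the binary rounding error
full = "%.17g" % x       # 17 significant digits round-trip a double

print(short)
print(full)
print(float(full) == x)
```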

Merry Christmas,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                     7 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal at lemburg.com  Fri Dec 24 12:11:29 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 24 Dec 1999 12:11:29 +0100
Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types)
References: <19991223163850.15619.qmail@web604.mail.yahoo.com> <199912231642.LAA22598@eric.cnri.reston.va.us>
Message-ID: <386354E1.DA560F42@lemburg.com>

Guido van Rossum wrote:
> 
> > > OK.  Let me rephrase it.  Say we form a consensus on 'the right
> > > way'.  Are you amenable to some solution which goes back before
> > > 1970 and after 2038 going into the standard library?
> 
> No problem.
> 
> > > And does your answer change if it involves some
> > > compiled code as well?
> 
> I'd rather not.

As far as mxDateTime goes, I'd rather not see it in the core
distribution. Including the mx stuff in a separate PythonPowerTools
distribution would be cool though. For a start in this direction
see e.g.:

     http://startship.skyport.net/~lemburg/PPowerTools-0.2.zip

Note that I'll wrap all my mx extensions into a new mx package
which will come in several flavours next year. There will no
longer be separate packages due to the various naming
collisions and to enable intra-mx-package dependencies.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                     7 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From captainrobbo at yahoo.com  Fri Dec 24 13:22:29 1999
From: captainrobbo at yahoo.com (=?iso-8859-1?q?Andy=20Robinson?=)
Date: Fri, 24 Dec 1999 04:22:29 -0800 (PST)
Subject: [Python-Dev] Fixed Decimal types
Message-ID: <19991224122229.23506.qmail@web606.mail.yahoo.com>

> >> However, unlike RPG, we should probably ensure 
> >> that attempts to overflow or underflow the scale 
> >> result in NaN or Overflow conditions, rather
> >> than assuming the user is right and losing 
> >> the significant digits.
>  
> > Since this would be based on infinite-precision numbers, I don't
> > think that this would be an issue.


Three very general observations before I disappear for
Christmas:

(1) I think there is great mileage in combining the
fixed-decimal concept with Martin Fowler's Quantity
pattern, so that a variable could be defined as not
just two decimal places but also (say) "GBP" or "USD",
and it would be an error to add the two.  Same applies
for adding metres, kilograms and other quantities. 
There has also been discussion that the 'type' of a
quantity should determine what math should apply.

(2) If Python is going to be used increasingly in
eCommerce, it should be good at dealing with money -
maybe not in the core language, but we should aim for
one standard package.  

(3) We have a python-finance list
(python-finance at egroups.com), recently generalized to
cover business systems, which is a good place to
discuss this if anyone wants to.  There are people
there who have time, would love to prototype something
(indeed some work started in this area 3 months back),
and would use it at work too.  This would be an ideal
first target for that group - or indeed for a
finance-sig.  I'll pursue this in the New Year.
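
A minimal sketch of point (1), assuming a later Python with the decimal
module standing in for the fixed-decimal type. The Money class is
hypothetical and far simpler than a full Quantity pattern; it only
shows that mixing currencies can be made an error.

```python
from decimal import Decimal

class Money:
    """Fixed-decimal amount tagged with a currency."""

    def __init__(self, amount, currency):
        self.amount = Decimal(amount)
        self.currency = currency

    def __add__(self, other):
        if not isinstance(other, Money):
            return NotImplemented
        if other.currency != self.currency:
            raise ValueError("cannot add %s to %s"
                             % (other.currency, self.currency))
        return Money(self.amount + other.amount, self.currency)

    def __repr__(self):
        return "Money(%r, %r)" % (str(self.amount), self.currency)

print(Money("10.50", "GBP") + Money("0.25", "GBP"))

try:
    Money("10.50", "GBP") + Money("1.00", "USD")
except ValueError as err:
    print("rejected:", err)
```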

Merry Christmas,

Andy

=====
Andy Robinson
Robinson Analytics Ltd.
------------------
My opinions are the official policy of Robinson Analytics Ltd.
They just vary from day to day.


From jack at oratrix.nl  Fri Dec 24 13:34:28 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Fri, 24 Dec 1999 13:34:28 +0100
Subject: [Python-Dev] Fixed Decimal types 
In-Reply-To: Message by =?iso-8859-1?q?Andy=20Robinson?= 
 <captainrobbo@yahoo.com> ,
	     Fri, 24 Dec 1999 04:22:29 -0800 (PST) , <19991224122229.23506.qmail@web606.mail.yahoo.com> 
Message-ID: <19991224123428.5BA9F370CF2@snelboot.oratrix.nl>

> (1) I think there is great mileage in combining the
> fixed-decimal concept with Martin Fowler's Quantity
> pattern, so that a variable could be defined as not
> just two decimal places but also (say) "GBP" or "USD",
> and it would be an error to add the two.  Same applies
> for adding metres, kilograms and other quantities. 
> There has also been discussion that the 'type' of a
> quantity should determine what math should apply.

Isn't this something that is ideally suited for implementation in a Python 
module, based on a core implementation of fixed decimal numbers?
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From gstein at lyra.org  Fri Dec 24 21:05:22 1999
From: gstein at lyra.org (Greg Stein)
Date: Fri, 24 Dec 1999 12:05:22 -0800 (PST)
Subject: [Python-Dev] Fixed Decimal types 
In-Reply-To: <19991224123428.5BA9F370CF2@snelboot.oratrix.nl>
Message-ID: <Pine.LNX.4.10.9912241204290.412-100000@nebula.lyra.org>

On Fri, 24 Dec 1999, Jack Jansen wrote:
> > (1) I think there is great mileage in combining the
> > fixed-decimal concept with Martin Fowler's Quantity
> > pattern, so that a variable could be defined as not
> > just two decimal places but also (say) "GBP" or "USD",
> > and it would be an error to add the two.  Same applies
> > for adding metres, kilograms and other quantities. 
> > There has also been discussion that the 'type' of a
> > quantity should determine what math should apply.
> 
> Isn't this something that is ideally suited for implementation in a Python 
> module, based on a core implementation of fixed decimal numbers?

I'd agree with Jack here.

The "simple" change of a scale for the Long values is nice. Starting to
lump in features like this begins to get a little messier...
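
To show how small the "long plus a scale" core can be, here is a rough
sketch. The class name Fixed and its layout are assumptions made for
illustration, not the design under discussion; it uses the max-of-scales
and sum-of-scales rules quoted earlier in the thread.

```python
class Fixed:
    """Fixed-decimal value stored as an integer plus a scale."""

    def __init__(self, unscaled, scale):
        self.unscaled = unscaled    # 201 with scale 2 means 2.01
        self.scale = scale

    def __add__(self, other):
        scale = max(self.scale, other.scale)
        a = self.unscaled * 10 ** (scale - self.scale)
        b = other.unscaled * 10 ** (scale - other.scale)
        return Fixed(a + b, scale)

    def __mul__(self, other):
        return Fixed(self.unscaled * other.unscaled, self.scale + other.scale)

    def __str__(self):
        sign = "-" if self.unscaled < 0 else ""
        digits = str(abs(self.unscaled)).rjust(self.scale + 1, "0")
        if self.scale == 0:
            return sign + digits
        return sign + digits[:-self.scale] + "." + digits[-self.scale:]

print(Fixed(31, 1) + Fixed(201, 2))    # 3.1 + 2.01
print(Fixed(200, 2) * Fixed(200, 2))   # 2.00 * 2.00
```

Units, currencies and rounding policy can then live in a pure Python
module layered on top of something like this.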

Happy Holidays,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein at lyra.org  Fri Dec 24 21:13:50 1999
From: gstein at lyra.org (Greg Stein)
Date: Fri, 24 Dec 1999 12:13:50 -0800 (PST)
Subject: [Python-Dev] str(1L) -> '1' ?
In-Reply-To: <38635309.2AEFF18D@lemburg.com>
Message-ID: <Pine.LNX.4.10.9912241211460.412-100000@nebula.lyra.org>

On Fri, 24 Dec 1999, M.-A. Lemburg wrote:
> Guido van Rossum wrote:
> > [Jim F]
> > > In November there was an interesting discussion on comp.lang.python
> > > about the meaning of __str__ and __repr__.  One tidbit that came out
> > > of this discussion was that __str__ for longs should drop the trailing
> > > 'L'. Was there a decision on this? I'd really like this to happen.
> > 
> > Yes, I'd like it to happen.  I'd also like repr() of a float to return
> > the full precision (using the "%.17g" sprintf format).
> 
> While we're at it: how about adding a PyLong_AsString() API
> to the C interface ? I currently use PyObject_Str() in mxODBC
> and then slice off the 'L' -- not very elegant. A PyLong_AsString()
> API would much better suit the task.

Fred just checked in a change yesterday. PyObject_Str() on a Long no
longer includes the 'L'.

You're going to need to update your code :-)
[ I've got some here and there to fix, too, with the idiom:
     if type(v) is type(1L): return str(v)[:-1]
  ]

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From mal at lemburg.com  Sun Dec 26 23:29:28 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sun, 26 Dec 1999 23:29:28 +0100
Subject: [Python-Dev] str(1L) -> '1' ?
References: <Pine.LNX.4.10.9912241211460.412-100000@nebula.lyra.org>
Message-ID: <386696C8.6EBBF428@lemburg.com>

Greg Stein wrote:
> 
> On Fri, 24 Dec 1999, M.-A. Lemburg wrote:
> > While we're at it: how about adding a PyLong_AsString() API
> > to the C interface ? I currently use PyObject_Str() in mxODBC
> > and then slice off the 'L' -- not very elegant. A PyLong_AsString()
> > API would much better suit the task.
> 
> Fred just checked in a change yesterday. PyObject_Str() on a Long no
> longer includes the 'L'.

Ah, ok... scanning the patches: they don't provide an externed
C interface... I would like to have such a beast if possible
(basically, the new long_format() as PyLong_AsString()).

> You're going to need to update your code :-)
> [ I've got some here and there to fix, too, with the idiom:
>      if type(v) is type(1L): return str(v)[:-1]
>   ]

Your above example will effectively divide the long value by 10
which will probably break things in very subtle ways... hmm, this
change ought to be made *very* visible to people upgrading to
1.6, IMHO.

I'll fix mxODBC to only truncate the string value iff
the 'L' is present.
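
A sketch of that defensive fix, written against plain strings so it
runs on any Python version (later versions no longer have the 1L
literal). The function name long_digits() is made up.

```python
def long_digits(text):
    """Return the digits of str(long), dropping a trailing 'L' only if present."""
    if text.endswith("L"):
        return text[:-1]
    return text

print(long_digits("120L"))   # old-style output
print(long_digits("120"))    # new-style output, left untouched
print("120"[:-1])            # the unconditional slice loses a digit
```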

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                     5 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From andy at robanal.demon.co.uk  Mon Dec 27 11:43:17 1999
From: andy at robanal.demon.co.uk (Andy Robinson)
Date: Mon, 27 Dec 1999 10:43:17 GMT
Subject: [Python-Dev] Fixed Decimal types 
In-Reply-To: <Pine.LNX.4.10.9912241204290.412-100000@nebula.lyra.org>
References: <Pine.LNX.4.10.9912241204290.412-100000@nebula.lyra.org>
Message-ID: <38674259.5377973@post.demon.co.uk>

On Fri, 24 Dec 1999 12:05:22 -0800 (PST), you wrote:

>On Fri, 24 Dec 1999, Jack Jansen wrote:
>> > (1) I think there is great mileage in combining the
>> > fixed-decimal concept with Martin Fowler's Quantity
>> > pattern, so that a variable could be defined as not
>> > just two decimal places but also (say) "GBP" or "USD",
>> > and it would be an error to add the two.  Same applies
>> > for adding metres, kilograms and other quantities. 
>> > There has also been discussion that the 'type' of a
>> > quantity should determine what math should apply.
>> 
>> Isn't this something that is ideally suited for implementation in a Python 
>> module, based on a core implementation of fixed decimal numbers?
>
>I'd agree with Jack here.
>
Me too - I thought I said that in point 2, but in retrospect I didn't
say it clearly enough :-)


- Andy


From gstein at lyra.org  Mon Dec 27 12:31:29 1999
From: gstein at lyra.org (Greg Stein)
Date: Mon, 27 Dec 1999 03:31:29 -0800 (PST)
Subject: [Python-Dev] str(1L) -> '1' ?
In-Reply-To: <386696C8.6EBBF428@lemburg.com>
Message-ID: <Pine.LNX.4.10.9912270330180.412-100000@nebula.lyra.org>

On Sun, 26 Dec 1999, M.-A. Lemburg wrote:
> Greg Stein wrote:
> > On Fri, 24 Dec 1999, M.-A. Lemburg wrote:
> > > While we're at it: how about adding a PyLong_AsString() API
> > > to the C interface ? I currently use PyObject_Str() in mxODBC
> > > and then slice off the 'L' -- not very elegant. A PyLong_AsString()
> > > API would much better suit the task.
> > 
> > Fred just checked in a change yesterday. PyObject_Str() on a Long no
> > longer includes the 'L'.
> 
> Ah, ok... scanning the patches: they don't provide an externed
> C interface... I would like to have such a beast if possible
> (basically, the new long_format() as PyLong_AsString()).

What's wrong with PyObject_Str()? I don't see a need for Yet Another Entry
Point.

> > You're going to need to update your code :-)
> > [ I've got some here and there to fix, too, with the idiom:
> >      if type(v) is type(1L): return str(v)[:-1]
> >   ]
> 
> Your above example will effectively divide the long value by 10
> which will probably break things in very subtle ways... hmm, this

Yah :-(  Not a lot of fun, but I think for the best.

> change ought to be made *very* visible to people upgrading to
> 1.6, IMHO.

Yes.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From mal at lemburg.com  Mon Dec 27 13:51:36 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 27 Dec 1999 13:51:36 +0100
Subject: [Python-Dev] str(1L) -> '1' ?
References: <Pine.LNX.4.10.9912270330180.412-100000@nebula.lyra.org>
Message-ID: <386760D8.E897FADF@lemburg.com>

Greg Stein wrote:
> 
> On Sun, 26 Dec 1999, M.-A. Lemburg wrote:
> > Greg Stein wrote:
> > > On Fri, 24 Dec 1999, M.-A. Lemburg wrote:
> > > > While we're at it: how about adding a PyLong_AsString() API
> > > > to the C interface ? I currently use PyObject_Str() in mxODBC
> > > > and then slice off the 'L' -- not very elegant. A PyLong_AsString()
> > > > API would much better suit the task.
> > >
> > > Fred just checked in a change yesterday. PyObject_Str() on a Long no
> > > longer includes the 'L'.
> >
> > Ah, ok... scanning the patches: they don't provide an externed
> > C interface... I would like to have such a beast if possible
> > (basically, the new long_format() as PyLong_AsString()).
> 
> What's wrong with PyObject_Str()? I don't see a need for Yet Another Entry
> Point.

What's wrong with a rich C API :-) ?

The long_format function would be very useful for programs
interacting with other software at C level. Making it
external would give the programmer the ability to pass
long string representations in any base to other programs,
which is very useful for e.g. database interaction or
crypto software.
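
For readers unfamiliar with what "long string representations in any
base" involves, the conversion looks roughly like this in Python. This
is not the C long_format() code, only a sketch of the same idea;
format_int() and DIGITS are hypothetical names.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def format_int(value, base=10):
    """Render an integer of any size in the given base (2 through 36)."""
    if not 2 <= base <= len(DIGITS):
        raise ValueError("base must be between 2 and 36")
    if value == 0:
        return "0"
    sign = "-" if value < 0 else ""
    value = abs(value)
    out = []
    while value:
        value, digit = divmod(value, base)   # peel off the lowest digit
        out.append(DIGITS[digit])
    return sign + "".join(reversed(out))

print(format_int(2 ** 64, 16))
print(format_int(-10, 2))
```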

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                     4 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From bkc at murkworks.com  Mon Dec 27 23:04:25 1999
From: bkc at murkworks.com (Brad Clements)
Date: Mon, 27 Dec 1999 17:04:25 -0500
Subject: [Python-Dev] Re: [PSA MEMBERS] Re: Please test new dynamic load behavior
In-Reply-To: <Pine.LNX.4.10.9912231022280.16305-100000@nebula.lyra.org>
References: <38620B04.7CC64485@trema.com>
Message-ID: <199912272204.RAA26173@anvil.murkworks.com>

On 23 Dec 99, at 10:26, Greg Stein wrote:

> > > I reorganized Python's dynamic load/import code over the past few days.
> > > Guido provided some feedback, I did some more mods, and now it is checked
> > > into CVS. The new loading behavior has been tested on Linux, IRIX, and
> > > Solaris (and probably Windows by now).


FYI, I downloaded the import stuff from CVS and used it in my port of 
Python to NetWare. Good timing, as I was just tackling dynamic 
loading on NetWare when I saw your message.

The new scheme is much better, and works for me.

Though I do need to add some special "un-import" code similar to what 
BEOS does. 


Brad Clements,                bkc at murkworks.com   (315)268-1000
http://www.murkworks.com                          (315)268-9812 Fax
netmeeting: ils://ils.murkworks.com               AOL-IM: BKClements


From skip at mojam.com  Tue Dec 28 22:41:33 1999
From: skip at mojam.com (Skip Montanaro)
Date: Tue, 28 Dec 1999 15:41:33 -0600
Subject: [Python-Dev] Better text processing support in py2k?
Message-ID: <199912282141.PAA31426@dolphin.mojam.com>

It just occurred to me as I was replying to a request on the main list, that
Python's text handling capabilities could be a bit better than they are.
This will probably not come as a revelation to many of you, but I finally
put it together with the standard argument against beefing things up:

    One fix would be to add regular expressions to the language core and
    have special syntax for them, as Perl has done. However, I don't like
    this solution because Python is a general-purpose language, and regular
    expressions are used for the single application domain of text
    processing. For other application domains, regular expressions may be of
    no interest, and you might want to remove them to save memory and code
    size.

and the observation that Python does support some builtin objects and syntax
that are fairly specific to some much more restricted application domains
than text processing.

I stole the above quote from Andrew Kuchling's Python Warts page, which I
also happened to read earlier today.

What AMK says makes perfect sense until you examine some of the other things
that are in the language, like the Ellipsis object and complex numbers.  If
I recall correctly both were added as a result of the NumPy package
development.

I have nothing against ellipses or complex numbers.  They are fine first
class objects that should remain in the language. But I have never used
either one in my day-to-day work.  On the other hand, I read files and
manipulate them with regular expressions all the time.  I rather suspect
that more people use Python for some sort of text processing than any other
single application domain.  Python should be good at it.

While I don't want to turn Python into Perl, I would like to see it do a
better job of what most people probably use the language for.  Here is a
very short list of things I think need attention:

    1. When using something like the simple file i/o idiom

       for line in f.readlines():
           dofunstuff(line)

       the programmer should not have to care how big the file is.  It
       should just work in a reasonably efficient manner without gobbling up
       all of memory.  I realize this may require some change to the syntax
       of the common idiom.

    2. The re module needs to be sped up, if not to catch up with Perl, then
       to catch up with the deprecated regex module.  Depending how far
       people want to go with things, adding some language syntax to support
       regular expressions might be in order.  I don't see that as
       compelling as adding complex numbers however.  Another possibility,
       now that Barry Warsaw has opened the floodgates, is to add regular
       expression methods to strings.

    3. I've not yet used it, but I am told the pattern matching in
       Marc-Andre Lemburg's mxTextTools
       (http://starship.python.net/crew/lemburg/) is both powerful and
       efficient (though it certainly appears complex).  Perhaps it deserves
       consideration for incorporation into the core Python distribution.

I'm sure other people will come up with other suggestions.

Skip Montanaro | http://www.mojam.com/
skip at mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...


From akuchlin at mems-exchange.org  Tue Dec 28 23:00:11 1999
From: akuchlin at mems-exchange.org (Andrew M. Kuchling)
Date: Tue, 28 Dec 1999 17:00:11 -0500 (EST)
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <199912282141.PAA31426@dolphin.mojam.com>
References: <199912282141.PAA31426@dolphin.mojam.com>
Message-ID: <14441.13035.802146.730160@amarok.cnri.reston.va.us>

Skip Montanaro writes:
>What AMK says makes perfect sense until you examine some of the other things
>that are in the language, like the Ellipsis object and complex numbers.  If
>I recall correctly both were added as a result of the NumPy package
>development.

True, but note that you can compile Python with WITHOUT_COMPLEX
defined to remove complex numbers.

>    1. When using something like the simple file i/o idiom
>       for line in f.readlines():
>	   dofunstuff(line)
>       the programmer should not have to care how big the file is.

What about 'for line in fileinput.input()', which already exists?
(Hmmm... if you have an already open file object, I don't think you
can pass it to fileinput.input(); maybe that should be fixed.)

On a vaguely related note, since there are many things like parser
generators and XML stuff and mxTextTools, I've been speculating about
a text processing topic guide.  If you know of Python packages related
to text processing, please send me a private e-mail with a link.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
Constraints often boost creativity.
    -- Jim Hugunin, 11 Feb 1999


From skip at mojam.com  Tue Dec 28 23:26:53 1999
From: skip at mojam.com (Skip Montanaro)
Date: Tue, 28 Dec 1999 16:26:53 -0600 (CST)
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <14441.13035.802146.730160@amarok.cnri.reston.va.us>
References: <199912282141.PAA31426@dolphin.mojam.com>
	<14441.13035.802146.730160@amarok.cnri.reston.va.us>
Message-ID: <14441.14637.682862.999776@dolphin.mojam.com>

    Andrew> True, but note that you can compile Python with WITHOUT_COMPLEX
    Andrew> defined to remove complex numbers.

That's true, but that wasn't my point.  I'm not arguing for or against space
efficiency, just that the rather timeworn argument about not doing
anything special to support text processing because Python is a general
purpose language is a red herring.

    >> 1. When using something like the simple file i/o idiom
    >> for line in f.readlines():
    >>   dofunstuff(line)
    >> the programmer should not have to care how big the file is.

    Andrew> What about 'for line in fileinput.input()', which already
    Andrew> exists?  (Hmmm... if you have an already open file object, I
    Andrew> don't think you can pass it to fileinput.input(); maybe that
    Andrew> should be fixed.)

Well, a couple reasons jump to mind:

   1. fileinput.FileInput isn't particularly efficient.  At its heart, its
      __getitem__ method makes a simple readline() call instead of buffering
      some amount of readlines(sizehint) bytes.  This can be fixed, but I'm
      not sure what would happen to its semantics.

   2. As you pointed out, it's not all that general.

My point, not at all well stated, is that the programmer shouldn't have to
worry (much?) about the conditions under which he does file i/o.   Right
now, if I know the file is small(ish), I can do

    for line in f.readlines():
        dofunstuff(line)

but I have to know that the file won't be big, because readlines() will
behave badly (perhaps even generate a MemoryError exception) if the file is
large.  In that case, I have to fall back to the safer (and slower)

    line = f.readline()
    while line:
        dofunstuff(line)
        line = f.readline()

or the more efficient, but more cumbersome

    lines = f.readlines(sizehint)
    while lines:
        for line in lines:
            dofunstuff(line)
        lines = f.readlines(sizehint)

That's three separate idioms the programmer has to be aware of when writing
code to read a text file based upon the perceived need for speed, memory
usage and desired clarity:

    fast/memory-intensive/clear
    slow/memory-conserving/not-as-clear
    fast/memory-conserving/fairly-muddy

Any particular reason that the readline method can't return an iterator that
supports __getitem__ and buffers input?  (Again, remember this is for py2k,
so the potential breakage such a change might cause is a consideration, but
not a showstopper.)
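
A minimal sketch of such a buffering line iterator, written in
present-day Python with a generator.  The name iter_lines and the
default sizehint are assumptions made up for this illustration; only
the readlines(sizehint) call comes from the idioms above.

```python
import io

def iter_lines(f, sizehint=64 * 1024):
    """Yield lines from f one at a time, reading roughly sizehint bytes per batch."""
    while True:
        lines = f.readlines(sizehint)   # a batch of whole lines, not the whole file
        if not lines:                   # an empty list means end of file
            break
        yield from lines

# The caller's loop looks the same whether the file is small or huge.
sample = io.StringIO("spam\neggs\nham\n")
for line in iter_lines(sample, sizehint=8):
    print(line.rstrip("\n"))
```

The point of the sketch is that the batching lives in one place, so the
calling code keeps the clarity of the plain "for line in ..." idiom.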

    Andrew> On a vaguely related note, since there are many things like
    Andrew> parser generators and XML stuff and mxTextTools, I've been
    Andrew> speculating about a text processing topic guide.  If you know of
    Andrew> Python packages related to text processing, please send me a
    Andrew> private e-mail with a link.

This sounds like a good idea to me.

Skip Montanaro | http://www.mojam.com/
skip at mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...


From captainrobbo at yahoo.com  Wed Dec 29 09:34:43 1999
From: captainrobbo at yahoo.com (=?iso-8859-1?q?Andy=20Robinson?=)
Date: Wed, 29 Dec 1999 00:34:43 -0800 (PST)
Subject: [Python-Dev] Better text processing support in py2k?
Message-ID: <19991229083443.27817.qmail@web6005.mail.yahoo.com>

--- Skip Montanaro <skip at mojam.com> wrote:
>     fast/memory-intensive/clear
>     slow/memory-conserving/not-as-clear
>     fast/memory-conserving/fairly-muddy
> 
> Any particular reason that the readline method can't
> return an iterator that
> supports __getitem__ and buffers input?  (Again,
> remember this is for py2k,
> so the potential breakage such a change might cause
> is a consideration, but
> not a showstopper.)

Why not generalize fileinput to do buffering instead?

More generally, Java has the notion of 'stackable
streams' - e.g. construct a 'BufferedFile' around a
'File', maybe construct a 'Line-oriented file' around
that etc.  Each one takes a file-like object as an
argument to the constructor.  Things you might want to
do:
- buffering
- international encoding conversions
- line delimiters other than CR/LF/CRLF
- read/write Python objects (i.e. use pickle/marshal)
- easy interfaces to parsers

This took me a couple of hours to get used to (and at
the time I thought 'Yuk!' when I first saw four
nested constructors), but gives you very precise
control and a lot of versatility when handling files. 
It's an idiom Python does not use much but maybe it
should.

I'd argue that maybe some enhancements to fileinput.py
- adding some streams to provide building blocks for
these operations - would get us the power you want and
a lot more versatility besides.
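
A rough sketch of the stacking idea in present-day Python.  The io
classes used for the buffering and decoding layers are standard library
classes; RecordReader, its parameters and the sample data are
hypothetical and exist only for this illustration.

```python
import io

class RecordReader:
    """Hypothetical layer: split any text stream into records on a custom delimiter."""

    def __init__(self, stream, delimiter, chunk_size=4096):
        self.stream = stream          # any object whose read(n) returns str
        self.delimiter = delimiter
        self.chunk_size = chunk_size

    def __iter__(self):
        pending = ""
        while True:
            chunk = self.stream.read(self.chunk_size)
            if not chunk:
                break
            pending += chunk
            # Everything before the last delimiter is complete; keep the tail.
            *records, pending = pending.split(self.delimiter)
            yield from records
        if pending:                   # trailing record without a final delimiter
            yield pending

raw = io.BytesIO("caf\u00e9;th\u00e9;eau".encode("latin-1"))   # stands in for a file
buffered = io.BufferedReader(raw)                       # layer 1: buffering
text = io.TextIOWrapper(buffered, encoding="latin-1")   # layer 2: decoding
records = RecordReader(text, ";", chunk_size=4)         # layer 3: custom delimiter
print(list(records))
```

Each layer only needs the object beneath it to be file-like, which is
what lets the layers be combined freely.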


=====
Andy Robinson
Robinson Analytics Ltd.
------------------
My opinions are the official policy of Robinson Analytics Ltd.
They just vary from day to day.



From mal at lemburg.com  Wed Dec 29 17:55:21 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 29 Dec 1999 17:55:21 +0100
Subject: [Python-Dev] Better text processing support in py2k?
References: <19991229083443.27817.qmail@web6005.mail.yahoo.com>
Message-ID: <386A3CF9.8AF0EA60@lemburg.com>

Andy Robinson wrote:
> 
> --- Skip Montanaro <skip at mojam.com> wrote:
> >     fast/memory-intensive/clear
> >     slow/memory-conserving/not-as-clear
> >     fast/memory-conserving/fairly-muddy
> >
> > Any particular reason that the readline method can't
> > return an iterator that
> > supports __getitem__ and buffers input?  (Again,
> > remember this is for py2k,
> > so the potential breakage such a change might cause
> > is a consideration, but
> > not a showstopper.)
> 
> Why not generalize fileinput to do buffering instead?
> 
> More generally, Java has the notion of 'stackable
> streams' - e.g. construct a 'BufferedFile' around a
> 'File', maybe construct a 'Line-oriented file' around
> that etc.  Each one takes a file-like object as an
> argument to the constructor.  Things you might want to
> do:
> - buffering
> - international encoding conversions
> - line delimiters other than CR/LF/CRLF
> - read/write Python objects (i.e. use pickle/marshal)
> - easy interfaces to parsers

If all goes well we'll have something like this
in Python 1.6, at least for the encoding/decoding
part of file reading and writing. You basically take
a file object and then wrap some StreamCodecs around
it to get the functionality you need. Very simple
and very intuitive.
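
A sketch of what such wrapping can look like, using the
codecs.getreader and codecs.getwriter factories of the codecs module
as found in current Python.  That the 1.6 interface looks exactly like
this is an assumption; the sample data and variable names are made up.

```python
import codecs
import io

raw = io.BytesIO("gr\u00fc\u00dfe\n".encode("latin-1"))  # bytes as they sit on disk

# Wrap the byte stream in a decoding reader; reads now return text.
reader = codecs.getreader("latin-1")(raw)
line = reader.readline()

# Wrap an output byte stream in an encoding writer.
out = io.BytesIO()
writer = codecs.getwriter("utf-8")(out)
writer.write(line)

print(repr(line), out.getvalue())
```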

> This took me a couple of hours to get used to (and at
> the time I thought 'Yuk!' when I first saw four
> nested constructors), but gives you very precise
> control and a lot of versatility when handling files.
> It's an idiom Python does not use much but maybe it
> should.
> 
> I'd argue that maybe some enhancements to fileinput.py
> - adding some streams to provide building blocks for
> these operations - would get us the power you want and
> a lot more versatility besides.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                            Get ready to party !
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From bckfnn at pipmail.dknet.dk  Wed Dec 29 19:51:52 1999
From: bckfnn at pipmail.dknet.dk (Finn Bock)
Date: Wed, 29 Dec 1999 18:51:52 GMT
Subject: [Python-Dev] zipfile.py
In-Reply-To: <3857B97E.3684224F@interet.com>
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com>
Message-ID: <386a582d.6762574@pipmail.dknet.dk>

James C. Ahlstrom wrote:

>  ftp://ftp.interet.com/pub/pylib.html

I feel that it smells a bit too much like a tool and too little like a general
programming api.

- It can only add disk files. The ability to write data to a zip entry through 
  a file-like object or from a string would make it more like an API, IMHO
-  Some kind of access to the TOC entry fields (date, size, compressed
  size etc) also seems like a nice feature.
- The data for an entry must be available in memory. Could be a problem 
  for huge files, but most likely not in practical use.

I admit that I am fond of the api from java.util.zip.ZipFile and
java.util.zip.ZipOutputStream.
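
To make the wish list concrete, here is a sketch using the zipfile
module as it exists in current Python (an assumption for illustration;
it is not the pylib code from the URL above).  It writes an entry from
a string into a file-like object and then reads the TOC fields back.

```python
import io
import zipfile

archive = io.BytesIO()   # an in-memory file-like object instead of a disk file

with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("hello.txt", "hello, zip\n" * 100)   # entry written from a string

with zipfile.ZipFile(archive) as zf:
    for info in zf.infolist():                        # table-of-contents entries
        print(info.filename, info.date_time, info.file_size, info.compress_size)
    data = zf.read("hello.txt")
```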

Regards,
Finn Bock


From tim_one at email.msn.com  Thu Dec 30 07:08:58 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 30 Dec 1999 01:08:58 -0500
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <199912282141.PAA31426@dolphin.mojam.com>
Message-ID: <000001bf528c$5cbdb9a0$a02d153f@tim>

[Skip Montanaro, wants nicer text facilities]
> ...
> I rather suspect that more people use Python for some sort of
> text processing than any other single application domain.

Hmm.  You're probably right, but I'm an exception.

> Python should be good at it.

And I guess I'm an exception mostly *because* Perl is better at easy text
crunching and Icon is better at hard text-crunching -- that is, I use the
right tool for the job <wink>.

> While I don't want to turn Python into Perl, I would like to see
> it do a better job of what most people probably use the language
> for.  Here is a very short list of things I think need attention:
>
>     1. [*A* clear way to do memory- and time-efficient textfile
>         input]

I agree, but unsure how to fix it.  The best way to write this now is

    # f is some open file object.
    while 1:
        lines = f.readlines(BUFSIZE)
        if not lines:
            break
        for line in lines:
            process(line)

and it's not something anyone figures out on their own -- or enjoys typing
or explaining afterwards.

Perl gets its line-at-a-time speed by peeking and poking C FILE structs
directly in compiler- and platform-specific ways -- ways that vendors
*should* have done in their own fgets implementations, but almost never do.
I have no idea whether it works well with Perl's nascent notions of
threading, but in the absence of that "the system" doesn't know Perl is
cheating (i.e., as far as libc+friends are concerned, Perl *is* reading one
line at a time -- even mixing in C-level ungetc calls works (well, sometimes
<0.1 wink -- they don't always peek and poke enough fields>)).

The Python QIO extension module is much easier to port but less compatible
(it doesn't use stdio, so QIO-opened files don't play well with others) and
slower (although that's likely repairable -- he's got two passes over the
buffer where one hairier pass should suffice).

>     2. The re module needs to be sped up, if not to catch up with
>        Perl, then to catch up with the deprecated regex module.

The irony here is that the re engine is very often unboundedly faster than
the regex engine -- provided you're chewing over large strings.  Some tests
/F ran showed that the length-independent *overhead* of invoking re is about
10x higher than for regex.  Presumably the bulk of that is due to re.py,
i.e. that you get to the re engine via going thru Python layers on your way
in and out, while regex was pure C.
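
One common habit for keeping that per-call overhead down when chewing
over many short strings is to compile the pattern once and reuse the
bound method.  The sketch below is present-day Python; count_words and
WORD are made-up names, and nothing here is a measurement from /F's
tests.

```python
import re

WORD = re.compile(r"\w+")      # compiled once, outside the loop

def count_words(lines):
    """Count word-like tokens, reusing the precompiled pattern for every short line."""
    findall = WORD.findall     # bind the method once to skip repeated lookups
    total = 0
    for line in lines:
        total += len(findall(line))
    return total

print(count_words(["a few short", "lines of text"]))
```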

In any case, /F is working on a new engine (for Unicode), and I believe he
has this all well in hand.

> Depending how far people want to go with things, adding some
> language syntax to support regular expressions might be in order.
> ...
>     3. I've not yet used it, but I am told the pattern matching in
>        Marc-Andre Lemburg's mxTextTools
>       (http://starship.python.net/crew/lemburg/)
>        is both powerful and efficient (though it certainly appears
>        complex).  Perhaps it deserves consideration for
>        incorporation into the core Python distribution.

It's not complex, it's complicated -- and *that's* what makes it un-Pythonic
<wink>.  Tony Ibbs has written a friendly wrapper around mxTextTools that
suppresses much of the non-essential complication.  OTOH, if you go into
this with a regexp mindset, it will run much slower than a real regexp
package, because the bulk of the latter is devoted to doing optimization;
mxTextTools is WYSIWYG (it screams if you code to its strengths, but crawls
if you e.g. try to implement naive backtracking).

You should go to the REBOL site and look at the description of REBOL's PARSE
verb in the FAQ ... mumble, mumble ... at

    http://www.rebol.com/faq.html#11550948

Here's an example pulled from that page (this is a REBOL code fragment):

    digit: charset "0123456789"
    expr: [term ["+" | "-"] expr | term]
    term: [factor ["*" | "/"] term | factor]
    factor: [primary "**" factor | primary]
    primary: [value | "(" expr ")"]
    value: [digit value | digit]

    parse "1 + 2 ** 9" expr

There hasn't been a pattern scheme this clean, convenient or powerful since
SNOBOL4.  It exploits REBOL's Forth-like (lack of!) syntax, and
Smalltalk-like penchant for passing around thunks (anonymous closures --
"[...]" in REBOL builds a lexically-scoped entity called "a block", which
can be treated as code (executed) or data (manipulated like a Python list)
at will).

Now the example doesn't show this, but you can freely mix computations into
the middle of the patterns; only *some* of the words in the blocks have
special meaning to PARSE.  The fragment above is already way beyond what can
be accomplished with regexps, but that's just the start of it.  Perl too is
slamming in more & more ways to get user code to interact with its regexp
engine.

So REBOL has a *very* nice approach to this; I believe it's unreasonably
clumsy to mimic in Python primarily because of forward references (note e.g.
that the block attached to "expr" above refers to "term" before the latter
has been bound -- but the stuff inside [...] is just a closure so that
doesn't matter -- it only matters that term gets bound before expr is
*executed*).  I hit a similar snag years ago when trying to mimic SNOBOL4's
approach in Python.
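
A sketch of one way around the forward-reference snag in present-day
Python: rules are looked up by name only when a parser actually runs.
Everything here (rules, ref, lit, seq, alt, recognizes) is hypothetical
scaffolding for this illustration.  It is a bare recognizer using
ordered choice, not a rendering of REBOL's PARSE, and it strips spaces
before matching.

```python
rules = {}   # rule name -> parser; filled in below, in any order

def ref(name):
    """Refer to a rule by name; the lookup happens only when the parser runs."""
    return lambda s, i: rules[name](s, i)

def lit(text):
    """Match a literal string at position i."""
    return lambda s, i: i + len(text) if s.startswith(text, i) else None

def digit(s, i):
    """Match one decimal digit."""
    return i + 1 if i < len(s) and s[i] in "0123456789" else None

def seq(*parts):
    """Match all parts one after another."""
    def parse(s, i):
        for part in parts:
            i = part(s, i)
            if i is None:
                return None
        return i
    return parse

def alt(*choices):
    """Try each choice in order and keep the first that matches."""
    def parse(s, i):
        for choice in choices:
            j = choice(s, i)
            if j is not None:
                return j
        return None
    return parse

# "expr" mentions "term" before "term" exists; ref() makes that harmless.
rules["expr"] = alt(seq(ref("term"), alt(lit("+"), lit("-")), ref("expr")), ref("term"))
rules["term"] = alt(seq(ref("factor"), alt(lit("*"), lit("/")), ref("term")), ref("factor"))
rules["factor"] = alt(seq(ref("primary"), lit("**"), ref("factor")), ref("primary"))
rules["primary"] = alt(ref("value"), seq(lit("("), ref("expr"), lit(")")))
rules["value"] = alt(seq(digit, ref("value")), digit)

def recognizes(text):
    """True if the whole string (spaces removed) is an expr."""
    text = text.replace(" ", "")
    return rules["expr"](text, 0) == len(text)

print(recognizes("1 + 2 ** 9"), recognizes("1 +"))
```

The extra ref("...") wrapper around every rule name is exactly the kind
of noise the REBOL blocks avoid.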

Perl's endless abuse of regexps is making that language more absurd by the
month.

The other major approach to mixing patterns with computation is due to Icon,
another language where a regexp mindset is fatal.  On a whim, I whipped up
the attached, which illustrates a bit of the Icon approach in Pythonic terms
(but without language support for generators, the *heart* of it can't really
be captured).  Here's an example of how this could be used to implement (the
simplest form of) string.split:

def mysplit(str):
    s = Searcher(str)
    white = CharSet(" \t\n")
    result = []
    s.many(white)            # consume initial whitespace
    while s.notmany(white):  # consume non-whitespace
        result.append(s.get_match())
        s.many(white)
    return result

>>> mysplit("   \t Hey,   that's\tpretty\n\n neat!  ")
['Hey,', "that's", 'pretty', 'neat!']
>>>

The primary thing to note is that there's no seam between analyzing the
string and doing computation on the partial results -- "the program is the
pattern".  This is what Icon does to perfection, Perl is moving toward, and
REBOL is arriving at from a different direction.  It's The Future <0.9
wink>.

Without generators it's difficult to work backtracking into the Searcher
class, but, as above, in my experience the backtracking feature of regexps
is rarely *needed*!  For example, at various points "split" wants to suck up
all the whitespace characters, and that's *it* -- the backtracking
possibility in the regexp \s+ is often a bug just waiting for unexpected
*context* to trigger it.  A hairy regexp is pure hell; but what simpler
regexps can do don't require all that funky regexp machinery.
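
For the cases where backtracking is wanted, here is a sketch of how
language-level generators (available in present-day Python) make it
fall out almost for free.  The matcher names many, lit and seq are
made up for this illustration and are not methods of the Searcher
class attached to this message.

```python
def many(charset):
    """Yield every end position for one-or-more chars in charset, longest first."""
    def gen(s, i):
        j = i
        while j < len(s) and s[j] in charset:
            j += 1
        while j > i:          # offer shorter matches only if the caller asks again
            yield j
            j -= 1
    return gen

def lit(text):
    """Yield the end position if text matches at i, otherwise yield nothing."""
    def gen(s, i):
        if s.startswith(text, i):
            yield i + len(text)
    return gen

def seq(*matchers):
    """Match the matchers in order, trying every alternative of each one."""
    def gen(s, i):
        if not matchers:
            yield i
            return
        rest = seq(*matchers[1:])
        for j in matchers[0](s, i):   # each j is one way the first matcher succeeds
            yield from rest(s, j)     # resuming this loop is the backtracking step
    return gen

# Roughly the regexp [ab]+b: the greedy run must give back a character.
pattern = seq(many("ab"), lit("b"))
print(next(pattern("aabb", 0), None))
```

Asking the generator for only its first result gives the usual greedy
answer; iterating further enumerates the alternatives on demand.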

BTW, the mxTextTools engine could be used to get blazing implementations of
the primary Searcher methods (it excels at simple analysis).  OTOH, making
lots of calls to analyze short strings is slow.  The only clean solutions to
that are Perl's and Icon's (build everything into one language so the
compiler can optimize stuff away), and REBOL's (make no distinction between
code and data, so that code can be analyzed & optimized at runtime -- and
build the entire implementation around making closures and calls
supernaturally fast).

the-less-you-use-regexps-the-less-you-miss-'em<wink>-ly y'rs  - tim

class CharSet:
    def __init__(self, seq):
        self.seq = seq
        d = {}
        for ch in seq:
            d[ch] = 1
        self.haskey = d.has_key

    def __call__(self, ch):
        return self.haskey(ch)

    def __add__(self, other):
        if isinstance(other, CharSet):
            other = other.seq
        return CharSet(self.seq + other)

def _normalize_index(i, n):
    assert n >= 0
    if i >= 0:
        return min(i, n)
    elif n == 0:
        return 0
    # want smallest q s.t. i + q*n >= 0
    # <->  q*n >= -i
    # <->  q >= -i/n
    # so q = ceiling(-i/n) = -floor(i/n)
    return i - (i/n)*n

class Searcher:
    def __init__(self, str, lo=0, hi=None):
        """Create object to search in str[lo:hi].

        lo defaults to 0.
        hi defaults to len(str).
        len(str) is repeatedly added to negative lo or hi until
        reaching a number >= 0.
        If lo > hi, a uselessly empty slice will be searched.
        The search cursor is initialized to lo.
        """

        self.s = str
        self.lo = _normalize_index(lo, len(str))
        if hi is None:
            self.hi = len(str)
        else:
            self.hi = _normalize_index(hi, len(str))
        if self.lo > self.hi:
            self.hi = self.lo
        self.i = self.lo
        self.lastmatch = None, None

    def any(self, charset, consume=1):
        """Try to match single character in charset.

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i = self.i
        if i < self.hi and charset(self.s[i]):
            if consume:
                self.__consume(i+1)
            return 1
        return 0

    def notany(self, charset, consume=1):
        """Try to match single character not in charset.

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i = self.i
        if i < self.hi and not charset(self.s[i]):
            if consume:
                self.__consume(i+1)
            return 1
        return 0

    def many(self, charset, consume=1):
        """Try to match one or more characters in charset.

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i, n, s = self.i, self.hi, self.s
        j = i
        while j < n and charset(s[j]):
            j = j+1
        if i < j:
            if consume:
                self.__consume(j)
            return 1
        return 0

    def notmany(self, charset, consume=1):
        """Try to match one or more characters not in charset.

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i, n, s = self.i, self.hi, self.s
        j = i
        while j < n and not charset(s[j]):
            j = j+1
        if i < j:
            if consume:
                self.__consume(j)
            return 1
        return 0

    def match(self, str, consume=1):
        """Try to match string "str".

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i = self.i
        j = i + len(str)
        if j <= self.hi and self.s[i:j] == str:
            if consume:
                self.__consume(j)
            return 1
        return 0

    def get_str(self):
        """Return subject string."""
        return self.s

    def get_lo(self):
        """Return low slice bound."""
        return self.lo

    def get_hi(self):
        """Return high slice bound."""
        return self.hi

    def get_pos(self):
        """Return current value of search cursor."""
        return self.i

    def get_match_indices(self):
        """Return slice indices of last "consumed" match."""
        return self.lastmatch

    def get_match(self):
        """Return last "consumed" matching substring."""
        i, j = self.lastmatch
        if i is None:
            raise ValueError("no match to return!")
        return self.s[i:j]

    def set_pos(self, pos, consume=1):
        """Set search cursor to new value.  No return value.

        If optional arg "consume" is true, the last match is set to
        the slice between pos and the current cursor position.
        """

        p = _normalize_index(pos, len(self.s))
        if not self.lo <= p <= self.hi:
            raise ValueError("pos out of bounds: " + `pos`)
        if consume:
            self.__consume(p)
        else:
            self.i = p

    def move_pos(self, incr, consume=1):
        """Move the cursor by incr characters.  No return value.

        If the new value is outside the slice bounds, it's clipped.
        If optional arg "consume" is true, the last match is set to
        the slice between the old and new cursor positions.
        """

        newi = self.i + incr
        if newi < self.lo:
            newi = self.lo
        elif newi > self.hi:
            newi = self.hi
        if consume:
            self.__consume(newi)
        else:
            self.i = newi

    def __consume(self, newi):
        i, j = self.i, newi
        if i > j:
            i, j = j, i
        self.lastmatch = i, j
        self.i = newi


From tim_one at email.msn.com  Thu Dec 30 07:09:14 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 30 Dec 1999 01:09:14 -0500
Subject: [Python-Dev] Fixed-decimal types
In-Reply-To: <199912231944.OAA23337@eric.cnri.reston.va.us>
Message-ID: <000201bf528c$657c3080$a02d153f@tim>

[Guido]
> ...
> Not arguing for this interpretation, just indicating that doing
> fixed precision arithmetic right is hard.

It's not so much hard as it is arbitrary.  The floating-point world is
standardized now, but the fixed-point world remains a mish-mash of
incompatible legacy schemes carried across generations of products for no
reason other than product-specific compatibility.  So despite that
fixed-point has a specialty audience, whatever rules Python chooses will
leave it incompatible with much of that audience's (mixed!) expectations.

If fixed-point is needed, and my FixedPoint.py isn't good enough (all other
fixed point pkgs I've seen for Python were braindead), then it should be
implemented such that developers can control both rounding and precision
propagation.  I'll attach suitable kernels; they haven't been tested but any
bugs discovered will be trivial to fix (there are no difficulties here, but
typos are likely); the kernels supply the bulk of what's required, whether
implemented in Python or C; various packages can wrap them to supply
whatever policies they like; see FixedPoint.py for exact string<->FixedPoint
and exact float->FixedPoint conversions; and that's the end of my
involvement in fixed-point <wink>.

Python should certainly *not* add a "scale factor" to its current long
implementation; fixed-point should be a distinct type, as scale-factor
fiddling is clumsy and pervasive (long arithmetic is challenging enough to
get correct and quick without this obfuscating distraction; and by leaving
scale factors out of it, it's much easier to plug in alternative bigint
implementations (like GMP)).

One other point:  some people are going to want BCD (binary-coded decimal),
which suffers the same mish-mash of legacy policies, but with a different
data representation.  The point is that many commercial applications spend
much more time doing I/O conversions than arithmetic, and BCD accepts slow
arithmetic (in the absence of special HW support) in return for fast scaling
& I/O conversion.

Forgetting the database-heads for a moment, decimal *floating*-point is what
calculators do, so that's what "real people" are most comfortable with.  The
IEEE-854 std (IEEE-754's younger and friendlier brother) specifies that
completely.  Add a means to boost "global" precision (a la REXX), and it's a
powerful tool even for experts (benefits approximating those of unbounded
rational arithmetic but with bounded & user-controllable expense).
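
A small illustration of the "boost the precision" idea, using the
decimal module from the current Python standard library as a stand-in
for decimal floating point.  That module is assumed here purely for
illustration, and one_seventh is a made-up helper.

```python
from decimal import Decimal, localcontext

def one_seventh(digits):
    """Compute 1/7 in decimal floating point with the given number of significant digits."""
    with localcontext() as ctx:
        ctx.prec = digits          # the precision knob, scoped to this block
        return Decimal(1) / Decimal(7)

print(one_seventh(10))
print(one_seventh(40))
```

The same computation is simply rerun with a larger precision; no code
other than the precision setting has to change.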

can-never-have-too-many-numeric-types-but-always-have-
    too-few-literal-notations-ly y'rs  - tim


# Kernels for fixed-point decimal arithmetic.

# _add, _sub, _mul, _div all have arglist
#     n1, p1, n2, p2, p, round=DEFAULT_ROUND
# n1 and n2 are longs; p1, p2 and p ints >= 0.
# The inputs are exactly n1/10**p1 and n2/10**p2.
#
# The return value is the integer n such that n/10**p is the best
# approximation to the infinite-precision result.  In other words, p1
# and p2 are the input precisions and p is the desired output
# precision, where precision is the # of digits *after* the decimal
# point.
#
# What "best approximation" means is determined by the round function.
# In many cases rounding isn't required, but when it is
#     round(top, bot)
# is returned.  top and bot are longs, with bot > 0 guaranteed.  The
# infinite-precision result is top/bot.  round must return an integer
# (long) approximation to top/bot, using whichever rounding discipline
# you want.  By default, IEEE round-to-nearest/even is used; see the
# _roundXXX functions for examples of suitable rounding functions.
#
# Note:  The only code here that knows we're working in decimal is
# function _tento; simply change the "10L" in that to do fixed-point
# arithmetic in some other base.
#
# Example:
#
# >>> r7 = _div(1L, 0, 7L, 0, 20)  # 1/7
# >>> r7
# 14285714285714285714L
# >>> r5 = _div(1L, 0, 5L, 0, 20)  # 1/5
# >>> r5
# 20000000000000000000L
# >>> sum = _add(r7, 20, r5, 20, 20)  # 1/7 + 1/5 = 12/35
# >>> sum
# 34285714285714285714L
# >>> _mul(sum, 20, 35L, 0, 20)
# 1199999999999999999990L
# >>> _mul(sum, 20, 35L, 0, 18)
# 12000000000000000000L
# >>> _mul(sum, 20, 35L, 0, 0)
# 12L
# >>>

###################################################################
# Sample rounding functions.
###################################################################

# Round to minus infinity.

def _roundminf(top, bot):
    assert bot > 0
    return top / bot

# Round to plus infinity.

def _roundpinf(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    if r:
        q = q + 1
    return q

# IEEE nearest/even rounding (closest integer; in case of tie closest
# even integer).

def _roundne(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    c = cmp(r << 1, bot)
    # c < 0 <-> r < bot/2, etc
    if c > 0 or (c == 0 and (q & 1) == 1):
        q = q + 1
    return q

# "Add a half and chop" rounding (remainder < 1/2 toward 0; remainder
# >= half away from 0).

def _roundhalf(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    c = cmp(r << 1, bot)
    # c < 0 <-> r < bot/2, etc
    if c > 0 or (c == 0 and q >= 0):
        q = q + 1
    return q

# Round toward 0 (throw away remainder).

def _roundchop(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    if r and q < 0:
        q = q + 1
    return q

###################################################################
# Kernels for + - * /.
###################################################################

DEFAULT_ROUND = _roundne

def _add(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    # (n1/10**p1 + n2/10**p2) * 10**p ==
    # (n1*10**(max-p1) + n2*10**(max-p2))/10**max * 10**p
    max = p1    # until proven otherwise
    if p1 < p2:
        n1 = n1 * _tento(p2 - p1)
        max = p2
    elif p2 < p1:
        n2 = n2 * _tento(p1 - p2)
    n3 = n1 + n2
    p3 = p - max
    if p3 > 0:
        n3 = n3 * _tento(p3)
    elif p3 < 0:
        n3 = round(n3, _tento(-p3))
    return n3

def _sub(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    return _add(n1, p1, -n2, p2, p, round)

def _mul(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    # (n1/10**p1 * n2/10**p2) * 10**p ==
    # (n1*n2)/10**(p1+p2) * 10**p
    n3 = n1 * n2
    p3 = p - p1 - p2
    if p3 > 0:
        n3 = n3 * _tento(p3)
    elif p3 < 0:
        n3 = round(n3, _tento(-p3))
    return n3

def _div(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    if n2 == 0:
        raise ZeroDivisionError("scaled integer")
    # (n1/10**p1 / n2/10**p2) * 10**p ==
    # (n1/n2) * 10**(p2-p1+p)
    p3 = p2 - p1 + p
    if p3 > 0:
        n1 = n1 * _tento(p3)
    elif p3 < 0:
        n2 = n2 * _tento(-p3)
    if n2 < 0:
        n1 = -n1
        n2 = -n2
    return round(n1, n2)

def _tento(i, _cache={}):
    assert i >= 0
    try:
        return _cache[i]
    except KeyError:
        answer = _cache[i] = 10L ** i
        return answer
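
For readers without a Python 1.5 interpreter at hand, here is a rough
Python 3 sketch of the three rounding rules and the addition kernel above.
It assumes only that cmp() is replaced by direct comparisons and that the
_tento() cache is replaced by plain 10 ** n.  The names round_half_even,
round_half_away, round_chop and scaled_add are stand-ins invented for this
illustration; they are not part of the module as posted.

```python
def round_half_even(top, bot):
    """Nearest integer to top/bot; ties go to the even neighbour."""
    q, r = divmod(top, bot)          # exact value is q + r/bot, 0 <= r < bot
    twice = 2 * r
    if twice > bot or (twice == bot and q % 2 == 1):
        q += 1
    return q


def round_half_away(top, bot):
    """Nearest integer to top/bot; ties go away from zero."""
    q, r = divmod(top, bot)
    twice = 2 * r
    if twice > bot or (twice == bot and q >= 0):
        q += 1
    return q


def round_chop(top, bot):
    """Truncate top/bot toward zero."""
    q, r = divmod(top, bot)
    if r and q < 0:                  # divmod floors, so undo that for negatives
        q += 1
    return q


def scaled_add(n1, p1, n2, p2, p, rounder=round_half_even):
    """Add n1/10**p1 and n2/10**p2, returning the sum scaled by 10**p."""
    common = max(p1, p2)
    total = n1 * 10 ** (common - p1) + n2 * 10 ** (common - p2)
    shift = p - common
    if shift >= 0:
        return total * 10 ** shift
    return rounder(total, 10 ** -shift)


# 1.25 + 2.50 = 3.75, kept to one decimal place
print(scaled_add(125, 2, 25, 1, 1))                    # 38, i.e. 3.8 (tie -> even)
print(scaled_add(125, 2, 25, 1, 1, round_chop))        # 37, i.e. 3.7
print(round_half_even(-5, 2), round_half_away(-5, 2))  # -2 -3
```

The tie case is where the three rules differ, which is why the example
uses 3.75 and -2.5.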


From fredrik at pythonware.com  Thu Dec 30 12:05:45 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 30 Dec 1999 12:05:45 +0100
Subject: [Python-Dev] Better text processing support in py2k?
References: <000001bf528c$5cbdb9a0$a02d153f@tim>
Message-ID: <012d01bf52b5$d6a3cb00$f29b12c2@secret.pythonware.com>

Tim Peters is back from his vacation:
> > While I don't want to turn Python into Perl, I would like to see
> > it do a better job of what most people probably use the language
> > for.  Here is a very short list of things I think need attention:
> >
> >     1. [*A* clear way to do memory- and time-efficient textfile
> >         input]
> 
> I agree, but unsure how to fix it.  The best way to write this now is
> 
>     # f is some open file object.
>     while 1:
>         lines = f.readlines(BUFSIZE)
>         if not lines:
>             break
>         for line in lines:
>             process(line)
> 
> and it's not something anyone figures out on their own -- or enjoys typing
> or explaining afterwards.
> 
> Perl gets its line-at-a-time speed by peeking and poking C FILE structs
> directly in compiler- and platform-specific ways -- ways that vendors
> *should* have done in their own fgets implementations, but almost never do.
> I have no idea whether it works well with Perl's nascent notions of
> threading, but in the absence of that "the system" doesn't know Perl is
> cheating (i.e., as far as libc+friends are concerned, Perl *is* reading one
> line at a time -- even mixing in C-level ungetc calls works (well, sometimes
> <0.1 wink -- they don't always peek and poke enough fields>)).
> 
> The Python QIO extension module is much easier to port but less compatible
> (it doesn't use stdio, so QIO-opened files don't play well with others) and
> slower (although that's likely repairable -- he's got two passes over the
> buffer where one hairier pass should suffice).

we have something called SIO which uses memory mapping
where possible, and just a more aggressive read-ahead for
other cases.  on a windows box, a traditional while/readline
loop runs 3-5 times faster than before.  with SRE instead of
re, a while/readline/match loop runs up to 10 times faster
than before.

note that this is without *any* changes to the Python
source code...
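
SIO itself is not shown in this thread, so the following is only a sketch
of the general idea (line-at-a-time reading over a memory-mapped file),
written against the mmap module in today's Python 3 standard library.  The
function name iter_lines_mmap is made up for the example, and nothing here
should be read as how SIO actually works.

```python
import mmap
import os


def iter_lines_mmap(path):
    """Yield the lines of a file (as bytes) by reading from a memory map."""
    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size == 0:
            return                   # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # mmap.readline() returns b"" once the end of the map is reached
            for line in iter(m.readline, b""):
                yield line
```

A caller would simply write "for line in iter_lines_mmap(name): process(line)";
the operating system's paging does the read-ahead.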

> >     2. The re module needs to be sped up, if not to catch up with
> >        Perl, then to catch up with the deprecated regex module.
> 
> The irony here is that the re engine is very often unboundedly faster than
> the regex engine -- provided you're chewing over large strings.  Some tests
> /F ran showed that the length-independent *overhead* of invoking re is about
> 10x higher than for regex.  Presumably the bulk of that is due to re.py,
> i.e. that you get to the re engine via going thru Python layers on your way
> in and out, while regex was pure C.

I've attached some old benchmarks.  I think the current code
base is a bit faster, but you get the idea.

> In any case, /F is working on a new engine (for Unicode), and I believe he
> has this all well in hand.

with a little luck, the new module will replace both pcre
and regex...

not to mention that it's fairly easy to write your own front-
end to the matching engine -- the expression parser and the
compiler are both written in good old python.

</F>

$ python sre_bench.py
          0     5    50   250  1000  5000 25000
----- ----- ----- ----- ----- ----- ----- -----
search for Python|Perl in Perl ->
sre8  0.007 0.008 0.010 0.010 0.020 0.073 0.349
sre16 0.007 0.007 0.008 0.010 0.020 0.075 0.353
re    0.097 0.097 0.101 0.103 0.118 0.175 0.480
regex 0.007 0.007 0.009 0.020 0.059 0.271 1.320

search for (Python|Perl) in Perl ->
sre8  0.007 0.007 0.007 0.010 0.020 0.074 0.344
sre16 0.007 0.007 0.008 0.010 0.020 0.074 0.347
re    0.110 0.104 0.111 0.115 0.125 0.184 0.559
regex 0.006 0.006 0.009 0.019 0.057 0.285 1.432

search for Python in Python ->
sre8  0.007 0.007 0.007 0.011 0.021 0.072 0.387
sre16 0.007 0.007 0.008 0.010 0.022 0.082 0.365
re    0.107 0.097 0.105 0.102 0.118 0.175 0.511
regex 0.009 0.008 0.010 0.018 0.036 0.139 0.708

search for .*Python in Python ->
sre8  0.008 0.007 0.008 0.011 0.021 0.079 0.379
sre16 0.008 0.008 0.008 0.011 0.022 0.075 0.402
re    0.102 0.108 0.119 0.183 0.400 1.545 7.284
regex 0.013 0.019 0.072 0.318 1.231 8.035 45.366

search for .*Python.* in Python ->
sre8  0.008 0.008 0.008 0.011 0.021 0.080 0.383
sre16 0.008 0.008 0.008 0.011 0.021 0.079 0.395
re    0.103 0.108 0.119 0.184 0.418 1.685 8.378
regex 0.013 0.020 0.073 0.326 1.264 9.961 46.511

search for .*(Python) in Python ->
sre8  0.007 0.008 0.008 0.011 0.021 0.077 0.378
sre16 0.007 0.008 0.008 0.011 0.021 0.077 0.444
re    0.108 0.107 0.134 0.240 0.637 2.765 13.395
regex 0.026 0.112 3.820 87.322 (skipped)

search for .*P.*y.*t.*h.*o.*n.* in Python ->
sre8  0.010 0.010 0.014 0.031 0.093 0.419 2.212
sre16 0.010 0.011 0.014 0.030 0.093 0.419 2.292
re    0.112 0.121 0.195 0.521 1.747 8.298 40.877
regex 0.026 0.048 0.248 1.148 4.550 24.720 ...

(searching for patterns in padded strings; sre8
is the sre engine compiled for 8-bit characters,
sre16 is the same engine compiled for 16-bit
characters)


From mal at lemburg.com  Thu Dec 30 12:52:50 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 30 Dec 1999 12:52:50 +0100
Subject: [Python-Dev] Better text processing support in py2k?
References: <000001bf528c$5cbdb9a0$a02d153f@tim>
Message-ID: <386B4792.A551022A@lemburg.com>

Tim Peters wrote:
> 
> [Skip Montanaro, wants nicer text facilities]
> > While I don't want to turn Python into Perl, I would like to see
> > it do a better job of what most people probably use the language
> > for.  Here is a very short list of things I think need attention:
> >
> >     1. [*A* clear way to do memory- and time-efficient textfile
> >         input]
>
> ...
> 
> The Python QIO extension module is much easier to port but less compatible
> (it doesn't use stdio, so QIO-opened files don't play well with others) and
> slower (although that's likely repairable -- he's got two passes over the
> buffer where one hairier pass should suffice).

What is QIO ?
 
> > Depending how far people want to go with things, adding some
> > language syntax to support regular expressions might be in order.
> > ...
> >     3. I've not yet used it, but I am told the pattern matching in
> >        Marc-Andre Lemburg's mxTextTools
> >       (http://starship.python.net/crew/lemburg/)
> >        is both powerful and efficient (though it certainly appears
> >        complex).  Perhaps it deserves consideration for
> >        incorporation into the core Python distribution.
> 
> It's not complex, it's complicated -- and *that's* what makes it un-Pythonic
> <wink>.  Tony Ibbs has written a friendly wrapper around mxTextTools that
> suppresses much of the non-essential complication.  OTOH, if you go into
> this with a regexp mindset, it will run much slower than a real regexp
> package, because the bulk of the latter is devoted to doing optimization;
> mxTextTools is WYSIWYG (it screams if you code to its strengths, but crawls
> if you e.g. try to implement naive backtracking).

All true. mxTextTools provides the tools, not the magic. But this
is also its strength: you can optimize the hell out of your particular
parsing requirement without having to think about how the RE optimizer
works.

> You should go to the REBOL site and look at the description of REBOL's PARSE
> verb in the FAQ ... mumble, mumble ... at
> 
>     http://www.rebol.com/faq.html#11550948
> 
> Here's an example pulled from that page (this is a REBOL code fragment):
> 
>     digit: charset "0123456789"
>     expr: [term ["+" | "-"] expr | term]
>     term: [factor ["*" | "/"] term | factor]
>     factor: [primary "**" factor | primary]
>     primary: [value | "(" expr ")"]
>     value: [digit value | digit]
> 
>     parse "1 + 2 ** 9" expr
> 
> There hasn't been a pattern scheme this clean, convenient or powerful since
> SNOBOL4.  It exploits REBOL's Forth-like (lack of!) syntax, and
> Smalltalk-like penchant for passing around thunks (anonymous closures --
> "[...]" in REBOL builds a lexically-scoped entity called "a block", which
> can be treated as code (executed) or data (manipulated like a Python list)
> at will).

Looks nice indeed, but how does executable code fit into
that definition ? (mxTextTools allows you to write your own
parsing elements in Python, BTW; it should be possible to
use those mechanisms to achieve a similar integration.)
 
> ...
>
> BTW, the mxTextTools engine could be used to get blazing implementations of
> the primary Searcher methods (it excels at simple analysis).  OTOH, making
> lots of calls to analyze short strings is slow.

That's why mxTextTools converts these search idioms into byte codes
which it executes at C level. Some future version will even "precompile"
the tuple input and then omit the type checks during the search...
that should give another noticeable speedup. Note that recursion
etc. can be done at C level too -- Python function calls are not
needed.

> The only clean solutions to
> that are Perl's and Icon's (build everything into one language so the
> compiler can optimize stuff away), and REBOL's (make no distinction between
> code and data, so that code can be analyzed & optimized at runtime -- and
> build the entire implementation around making closures and calls
> supernaturally fast).

Just for kicks, here is the mysplit() function using mxTextTools:

from mx.TextTools import *

table = (
    # Match all whitespace
    (None,AllInSet,whitespace_set,+1),
    # Match and tag all non-whitespace
    ('text',AllInSet + AppendMatch,nonwhitespace_set,+1),
    # Loop until EOF
    (None,EOF,Here,-2),
    )

def mysplit(text):

    return tag(text,table)[1]

The timings:
 mysplit: 5.84 sec.
 string.split: 3.62 sec.

Note that you can customize the above to split text at any
character set you like, not just whitespace... without
compiling or writing C code. The function mx.TextTools.setsplit()
provides this functionality as pure C function.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                            Get ready to party !
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From jim at interet.com  Thu Dec 30 15:21:36 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 30 Dec 1999 09:21:36 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <386a582d.6762574@pipmail.dknet.dk>
Message-ID: <386B6A70.3C9A0042@interet.com>

Finn Bock wrote:
> 
> James C. Ahlstrom wrote:
> 
> >  ftp://ftp.interet.com/pub/pylib.html
> 
> I feel that it smells a bit too much like a tool and too little like a general
> programming API.

It was meant to be an API except for writepy(), which is clearly a tool.
 
> - It can only add disk files. The ability to write data to a zip entry through
>   a file-like object or from a string would make it more like an API, IMHO

I could add a method
     writestr(self, string, year, month, day, hour, minute, second, ...)
There are a lot of fields required which usually come from the file.

> -  Some kind of access to the TOC entry fields (date, size, compressed
>   size etc) also seems like a nice feature.

This access is provided directly by self.TOC, and the fields are
documented.

> - The data for an entry must be available in memory. Could be a problem
>   for huge files, but most likely not in practical use.

I agree, but adding loops will make it slower.  What do others think?
 
> I admit that I am fond of the api from java.util.zip.ZipFile and
> java.util.zip.ZipOutputStream.

I don't know this API.  If writestr() is not sufficient, what
API would you like?
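
For comparison, the zipfile module in today's Python 3 standard library
does have a writestr(), and the timestamp fields travel in a ZipInfo object
rather than as separate arguments.  The sketch below uses that modern API;
it is not the 1999 zipfile.py under discussion, and the entry name and text
are invented for the example.

```python
import io
import zipfile

buf = io.BytesIO()                   # in-memory archive, no disk file needed

with zipfile.ZipFile(buf, "w") as zf:
    # (year, month, day, hour, minute, second) for the entry's TOC record
    info = zipfile.ZipInfo("hello.txt", date_time=(1999, 12, 30, 9, 21, 36))
    info.compress_type = zipfile.ZIP_DEFLATED
    zf.writestr(info, "written from a string, not a disk file\n")

with zipfile.ZipFile(buf) as zf:
    entry = zf.getinfo("hello.txt")
    data = zf.read("hello.txt")

print(entry.date_time, entry.file_size, len(data))
```

Bundling the date and similar fields into one object keeps the call short
while still letting the caller set everything a disk file would supply.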

JimA


From bckfnn at pipmail.dknet.dk  Thu Dec 30 20:14:14 1999
From: bckfnn at pipmail.dknet.dk (Finn Bock)
Date: Thu, 30 Dec 1999 19:14:14 GMT
Subject: [Python-Dev] zipfile.py
In-Reply-To: <386B6A70.3C9A0042@interet.com>
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <386a582d.6762574@pipmail.dknet.dk> <386B6A70.3C9A0042@interet.com>
Message-ID: <386baec9.2867733@pipmail.dknet.dk>

[I wrote]

> - It can only add disk files. The ability to write data to a zip entry through
>   a file-like object or from a string would make it more like an API, IMHO

[JimA wrote]

>I could add a method
>     writestr(self, string, year, month, day, hour, minute, second, ...)
>There are a lot of fields required which usually come from the file.

Something like that seems fine to me. 

[I wrote]

> -  Some kind of access to the TOC entry fields (date, size, compressed
>   size etc) also seems like a nice feature.

[JimA answers]

>This access is provided directly by self.TOC, and the fields are
>documented.

Good enough. My bad, I was looking for getter methods. (me being a java dude)

[I wrote]

> I admit that I am fond of the api from java.util.zip.ZipFile and
> java.util.zip.ZipOutputStream.

[JimA asks]

>I don't know this API.  If writestr() is not sufficient, what
>API would you like?

This is only meant as a source of inspiration, certainly not as a request for
change. writestr would answer my complaint nicely. Below, only one ZipEntry can
be actively read or written to at a time. All the small details of performance
and implementation complexity are ignored. 

class ZipFile:
    def getEntry(self, name):
        ...
        self.activeentry = ZipEntry(name)
        return self.activeentry

class ZipEntry:
    # enough methods and fields to fake file-ness to casual users like me.
    def write(self, str): ...
    def writelines(self, list): ...
    def read(self, size=None): ...
    def readlines(self, sizehint=-1): ...

    def seek(self, offset): ...
    def flush(self): ...
    def close(self): ...

    def getSize(self): ...
    def getCompressedSize(self): ...
    def getFlags(self): ...
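
As a point of reference, a file-like view of a single entry is roughly what
ZipFile.open() provides in today's Python 3 zipfile module (writing through
it needs Python 3.6 or later).  This sketch uses that modern API rather than
the classes outlined above; the entry name and contents are invented.

```python
import io
import zipfile

buf = io.BytesIO()

with zipfile.ZipFile(buf, "w") as zf:
    # The object returned by open() behaves like a binary file.
    with zf.open("notes.txt", "w") as entry:
        entry.write(b"first line\n")
        entry.write(b"second line\n")

with zipfile.ZipFile(buf) as zf:
    with zf.open("notes.txt") as entry:
        lines = entry.readlines()
    info = zf.getinfo("notes.txt")

print(lines)
print(info.file_size, info.compress_size)
```

Sizes and flags are plain attributes on the ZipInfo record instead of
getter methods.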


regards,
finn


From tim_one at email.msn.com  Fri Dec 31 04:35:18 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 30 Dec 1999 22:35:18 -0500
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <386B4792.A551022A@lemburg.com>
Message-ID: <000001bf5340$0fb20300$e12d153f@tim>

[M.-A. Lemburg]
> What is QIO ?

See DejaNews (I don't save URLs).  "Quick" line-oriented text input adapted
from INN.  Someone rewrote that as a Python extension module.

>>     http://www.rebol.com/faq.html#11550948

> Looks nice indeed, but how does executable code fit into
> that definition ?

See the URL above I didn't save <wink>.  PARSE's "pattern" argument is a
block.  Blocks can be (& often are) nested.  Whether any given block is code
or data is all the same to REBOL, so passing nested code blocks in PARSE's
pattern argument is easy.  Because blocks are lexically scoped, assignments
(etc) inside a block are (well, can be) visible to its context; etc.  It's a
very Lispish approach.  REBOL is essentially Scheme under the covers, but
with syntax much more like Forth's (whitespace-separated strings of
arbitrary non-whitespace characters, with few pre-assigned meanings or
restrictions -- in fact, it's impossible for a compiler to determine where a
REBOL function call begins or ends!  can't be known until runtime).

> (mxTextTools allows you to write your own parsing elements
> in Python, BTW; it should be possible to use those mechanisms
> to achieve a similar integration.)

It can't capture the flavor -- although I don't know that it needs to
<wink>.  There's no distinction between "the pattern language" and "the
computational language" in REBOL or Icon, and it's hard to explain what a
maddening distinction that can be once you've lived without it.  mxTextTools
embedding would feel more like Icon, where the matching engine is fully
exposed to the programmer (REBOL hides it, allowing only "approved"
interactions).

>> OTOH, making lots of calls to analyze short strings is slow.

> That's why mxTextTools converts these search idioms into byte
> codes which it executes at C level. Some future version will
> even "precompile" the tuple input and then omit the type checks
> during the search...that should give another noticeable speedup.
> Note that recursion etc. can be done at C level too -- Python
> function calls are not needed.

That's also the curse of having distinct languages; e.g., Python already had
recursion, but you needed to reimplement it in a different way with
different syntax and different rules in your pattern language.  In Icon etc,
there's no difference between a recursive pattern and a recursive function,
except in *what* it computes.  The machinery is all the same, and both more
powerful and easier to learn because of that.

> ...
> Just for kicks, here is the mysplit() function using mxTextTools:
>
> from mx.TextTools import *
>
> table = (
>     # Match all whitespace
>     (None,AllInSet,whitespace_set,+1),
>     # Match and tag all non-whitespace
>     ('text',AllInSet + AppendMatch,nonwhitespace_set,+1),
>     # Loop until EOF
>     (None,EOF,Here,-2),
>     )
>
> def mysplit(text):
>
>     return tag(text,table)[1]
>
> The timings:
>  mysplit: 5.84 sec.
>  string.split: 3.62 sec.
>
> Note that you can customize the above to split text at any
> character set you like, not just whitespace... without
> compiling or writing C code.

That's equally true of the example I posted <wink>.  Now what if I wanted to
stop splitting right after I find a keyword, recognized as such because it's
a key in some passed-in dictionary?  In my example, I make an obvious local
code change, from

    while s.notmany(white):  # consume non-whitespace
        result.append(s.get_match())
        s.many(white)

to

    while s.notmany(white):  # consume non-whitespace
        word = s.get_match()
        result.append(word)
        if dictionary.has_key(word):
            break
        s.many(white)

What does it do to your example?  Or what if the target string isn't "a
string" (the code I posted only assumes the "str" object responds to
indexing and slicing -- any buffer object is fine -- so my example doesn't
change at all)?  Or what if you need to pass the tokens on as they're found,
pipeline style?  Etc.  This is why I do complex string processing in Icon
<0.9 wink>.
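
The scanner object "s" used in the two loops above is not defined anywhere
in this message.  A minimal Python 3 stand-in, assuming only the three
methods the loops call (many, notmany, get_match), might look like the
sketch below.  The class name Scanner, its attributes and the helper
split_until_keyword are guesses made for illustration, not the code that
was originally posted.

```python
class Scanner:
    """Tiny cursor over an indexable, sliceable text object."""

    def __init__(self, text):
        self.text = text
        self.pos = 0        # current position
        self.start = 0      # where the most recent match began

    def _run(self, wanted):
        start = self.pos
        while self.pos < len(self.text) and wanted(self.text[self.pos]):
            self.pos += 1
        self.start = start
        return self.pos > start      # true only if something was consumed

    def many(self, charset):
        """Consume a run of characters that are in charset."""
        return self._run(lambda ch: ch in charset)

    def notmany(self, charset):
        """Consume a run of characters that are not in charset."""
        return self._run(lambda ch: ch not in charset)

    def get_match(self):
        return self.text[self.start:self.pos]


def split_until_keyword(text, dictionary, white=" \t\n"):
    s = Scanner(text)
    result = []
    s.many(white)                    # skip leading whitespace
    while s.notmany(white):          # consume non-whitespace
        word = s.get_match()
        result.append(word)
        if word in dictionary:       # ordinary control flow, no special hooks
            break
        s.many(white)
    return result


print(split_until_keyword("  alpha beta stop gamma", {"stop": 1}))
```

Because the scanner only indexes and slices its input, any object that
supports those two operations can be passed in place of a string.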

OTOH, at what it does well, mxTextTools runs quicker than Icon.  Its biggest
problem has always been that e.g. nobody knows what the hell

     (None,EOF,Here,-2),

*means* at first glance -- or third <wink>.

an-extreme-on-the-transparency-vs-speed-curve-ly y'rs  - tim


From mal at lemburg.com  Fri Dec 31 12:18:57 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 31 Dec 1999 12:18:57 +0100
Subject: [Python-Dev] Better text processing support in py2k?
References: <000001bf5340$0fb20300$e12d153f@tim>
Message-ID: <386C9121.E9D9DC01@lemburg.com>

Tim Peters wrote:
> 
> [M.-A. Lemburg]
> > What is QIO ?
> 
> See DejaNews (I don't save URLs).  "Quick" line-oriented text input adapted
> from INN.  Someone rewrote that as a Python extension module.

Ok, thanks.
 
> >>     http://www.rebol.com/faq.html#11550948
> 
> > Looks nice indeed, but how does executable code fit into
> > that definition ?
> 
> See the URL above I didn't save <wink>.  PARSE's "pattern" argument is a
> block.  Blocks can be (& often are) nested.  Whether any given block is code
> or data is all the same to REBOL, so passing nested code blocks in PARSE's
> pattern argument is easy.  Because blocks are lexically scoped, assignments
> (etc) inside a block are (well, can be) visible to its context; etc.  It's a
> very Lispish approach.  REBOL is essentially Scheme under the covers, but
> with syntax much more like Forth's (whitespace-separated strings of
> arbitrary non-whitespace characters, with few pre-assigned meanings or
> restrictions -- in fact, it's impossible for a compiler to determine where a
> REBOL function call begins or ends!  can't be known until runtime).

If I understand the concept correctly, I think Python could do
pretty much the same thing. The bummer is of course the need
for new keywords and byte codes (although these could be
split out into a separate text scanning engine). Using Python
function calls would slow down things to an extent that would
render the added functionality useless, well IMHO anyways ;-)

> > (mxTextTools allows you to write your own parsing elements
> > in Python, BTW; it should be possible to use those mechanisms
> > to achieve a similar integration.)
> 
> It can't capture the flavor -- although I don't know that it needs to
> <wink>.  There's no distinction between "the pattern language" and "the
> computational language" in REBOL or Icon, and it's hard to explain what a
> maddening distinction that can be once you've lived without it.  mxTextTools
> embedding would feel more like Icon, where the matching engine is fully
> exposed to the programmer (REBOL hides it, allowing only "approved"
> interactions).

Of course it's hard for a Turing Machine to capture the flavor
of any high level language :-) When you're programming
the mxTextTools Tagging Engine directly you feel like writing
assembler... but things are moving in the right direction:
Tony Ibbs has a nice meta-language and M.C. Fletcher his
SimpleParse to cover up these insufficiencies.
 
> >> OTOH, making lots of calls to analyze short strings is slow.
> 
> > That's why mxTextTools converts these search idioms into byte
> > codes which it executes at C level. Some future version will
> > even "precompile" the tuple input and then omit the type checks
> > during the search...that should give another noticeable speedup.
> > Note that recursion etc. can be done at C level too -- Python
> > function calls are not needed.
> 
> That's also the curse of having distinct languages; e.g., Python already had
> recursion, but you needed to reimplement it in a different way with
> different syntax and different rules in your pattern language.  In Icon etc,
> there's no difference between a recursive pattern and a recursive function,
> except in *what* it computes.  The machinery is all the same, and both more
> powerful and easier to learn because of that.

Agreed.
 
> > ...
> > Just for kicks, here is the mysplit() function using mxTextTools:
> >
> > from mx.TextTools import *
> >
> > table = (
> >     # Match all whitespace
> >     (None,AllInSet,whitespace_set,+1),
> >     # Match and tag all non-whitespace
> >     ('text',AllInSet + AppendMatch,nonwhitespace_set,+1),
> >     # Loop until EOF
> >     (None,EOF,Here,-2),
> >     )
> >
> > def mysplit(text):
> >
> >     return tag(text,table)[1]
> >
> > The timings:
> >  mysplit: 5.84 sec.
> >  string.split: 3.62 sec.
> >
> > Note that you can customize the above to split text at any
> > character set you like, not just whitespace... without
> > compiling or writing C code.
> 
> That's equally true of the example I posted <wink>.  Now what if I wanted to
> stop splitting right after I find a keyword, recognized as such because it's
> a key in some passed-in dictionary?  In my example, I make an obvious local
> code change, from
> 
>     while s.notmany(white):  # consume non-whitespace
>         result.append(s.get_match())
>         s.many(white)
> 
> to
> 
>     while s.notmany(white):  # consume non-whitespace
>         word = s.get_match()
>         result.append(word)
>         if dictionary.has_key(word):
>             break
>         s.many(white)
> 
> What does it do to your example? 

You'd replace the 'text' tagobj with a callable object and
write AllInSet + CallTag as command. The Tagging Engine will
then call the object with arguments (taglist,text,l,r,subtags)
and let it decide what to do.

In your example it would check the dictionary and raise an
exception in case a keyword is found to stop any further
scanning. If it's not a keyword, it would simply append
the found string to the taglist and return None.

Here's the code:

from mx.TextTools import *

import exceptions

stoplist = {'abc':1, 'def':1}

class KeywordFound(exceptions.StandardError):
    def __init__(self, taglist):
        self.taglist = taglist

def callable(taglist,text,l,r,subtags):

    taglist.append(text[l:r])
    if stoplist.has_key(text[l:r]):
        raise KeywordFound(taglist)

table = (
    # Match all whitespace
    (None,AllInSet,whitespace_set,+1),
    # Match and tag all non-whitespace
    (callable,AllInSet + CallTag,nonwhitespace_set,+1),
    # Loop until EOF
    (None,EOF,Here,-2),
    )

def mysplitex(text):

    try:
        return tag(text,table)[1]
    except KeywordFound,data:
        return data.taglist

> Or what if the target string isn't "a
> string" (the code I posted only assumes the "str" object responds to
> indexing and slicing -- any buffer object is fine -- so my example doesn't
> change at all)? 

The current version only handles string objects, but I am
already beginning to convert all the APIs in mxTextTools to
"s#" or "t#" style (can't decide which to use... "s#" is great
for processing raw data, while "t#" more closely refers to
text processing).

> Or what if you need to pass the tokens on as they're found,
> pipeline style?  Etc.  This is why I do complex string processing in Icon
> <0.9 wink>.

You can have all that extra magic via callable tag objects
or callable matching functions. It's not exactly nice to
write, but I'm sure that a meta-language could do the 
conversions for you.
 
> OTOH, at what it does well, mxTextTools runs quicker than Icon.  Its biggest
> problem has always been that e.g. nobody knows what the hell
> 
>      (None,EOF,Here,-2),
> 
> *means* at first glance -- or third <wink>.

The structure of those tag tables is very simple:

(tagobject, command, argument[, jump offset in case of failure
                             [, jump offset in case of success]])
                               
Please remember that this is byte code, not some higher level
abstraction. The design is very much inverted from what you'd
usually do: design a nice language and then try to find a suitable
set of byte codes to make it work as intended.

Anyway, I'll keep focussing on the speed aspect of mxTextTools;
others can focus on abstractions, so that eventually everybody
will be happy :-)
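
To make the tuple layout described above concrete, here is a toy
interpreter in plain Python 3 that walks a table of that shape.  It is
emphatically not how mxTextTools is implemented (the real engine runs in C
and has many more commands); the functions all_in_set, all_not_in_set,
at_eof and run_table are invented for this sketch.  One assumption is
labelled in the code: a missing success offset is treated as +1, which is
what the "Loop until EOF" entry relies on to fall off the end of the table.

```python
import string

WHITESPACE = set(string.whitespace)


def all_in_set(text, pos, charset):
    """Match a non-empty run of characters from charset; return its end or None."""
    end = pos
    while end < len(text) and text[end] in charset:
        end += 1
    return end if end > pos else None


def all_not_in_set(text, pos, charset):
    """Match a non-empty run of characters outside charset; return its end or None."""
    end = pos
    while end < len(text) and text[end] not in charset:
        end += 1
    return end if end > pos else None


def at_eof(text, pos, _arg):
    """Match only at the end of the text."""
    return pos if pos >= len(text) else None


def run_table(text, table):
    pos, i, taglist = 0, 0, []
    while 0 <= i < len(table):
        entry = table[i]
        tagobj, command, arg, on_failure = entry[:4]
        on_success = entry[4] if len(entry) > 4 else 1   # assumed default
        end = command(text, pos, arg)
        if end is None:
            i += on_failure          # no match: take the failure jump
        else:
            if tagobj is not None:
                taglist.append(text[pos:end])
            pos = end
            i += on_success          # match: take the success jump
    return taglist


table = (
    (None, all_in_set, WHITESPACE, +1),        # skip whitespace
    ("text", all_not_in_set, WHITESPACE, +1),  # collect one word
    (None, at_eof, None, -2),                  # not at EOF: go back two entries
)

print(run_table("  a bc  d ", table))
```

Read this way, (None,EOF,Here,-2) says: tag nothing, test for end of input,
and jump back two entries if the test fails.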

Happy New Year,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                            Get ready to party !
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From tim_one at email.msn.com  Fri Dec 31 23:53:49 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Fri, 31 Dec 1999 17:53:49 -0500
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <012d01bf52b5$d6a3cb00$f29b12c2@secret.pythonware.com>
Message-ID: <000701bf53e1$e7119760$472d153f@tim>

[Fredrik Lundh, whose very nice eMatter book is on sale until
  the end of the 20th century (as real people think of it),
  although the eMatter distribution scheme has lots of problems
  [just an editorial note from a bot who has to-- for unknown
   reasons Fatbrain "is working on" --delete the Fatbrain
   registry tree and reregister the book almost every time he
   tries to open it <wink>
  ]
]

> we have something called SIO which uses memory mapping
> where possible, and just a more aggressive read-ahead for
> other cases.  on a windows box, a traditional while/readline
> loop runs 3-5 times faster than before.  with SRE instead of
> re, a while/readline/match loop runs up to 10 times faster
> than before.
>
> note that this is without *any* changes to the Python
> source code...

If so, there's potential for significantly more speed.  Python does its
line-at-a-time input with a character-at-a-time macro-in-a-loop, the same
way naive vendors (read "almost all vendors") implement fgets.  It's
replacing that inner loop with direct peeking into the FILE buffer that gets
Perl its dramatic speed -- despite that Perl has fancier input functionality
(the oft-requested automagical "input record separator").  So it sounds like
the Perl trick is orthogonal to SIO's tricks; Perl isn't doing mmaps or
read-aheads or anything else fancy under the covers -- it only optimizes the
inner loop!

> ...
> with a little luck, the new module will replace both pcre
> and regex...

If something more tangible than luck would help to make this come true, feel
free to mention it <wink>.

> not to mention that it's fairly easy to write your own front-
> end to the matching engine -- the expression parser and the
> compiler are both written in good old python.

Ah, good news / bad news.  Perl refugees aren't accustomed to "precompiling"
regexp objects, so write code that will cause regexps to get recompiled over
& over.  Even if you cache the results under the covers, the overhead of the
Python call to the regexp compiler will likely take as long as the engine
takes to search.

Personally, in such cases, I think they should learn how to use the language
<0.5 wink>.
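
A small Python 3 illustration of the two habits being contrasted, using the
re module as it exists today (which does cache compiled patterns
internally).  The function names and sample lines are made up, and no
timings are claimed here.

```python
import re

LANGUAGE = re.compile(r"Python|Perl")    # compiled once, at import time


def count_hits_compiled(lines):
    """Reuse the precompiled pattern object for every line."""
    return sum(1 for line in lines if LANGUAGE.search(line))


def count_hits_inline(lines):
    """Pass the pattern text every time; re must look it up on each call."""
    return sum(1 for line in lines if re.search(r"Python|Perl", line))


sample = ["I like Python", "nothing here", "Perl refugee"]
print(count_hits_compiled(sample), count_hits_inline(sample))
```

Both give the same answer; the difference is only in how much work is
repeated per line before the matching engine is reached.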


From tim_one at email.msn.com  Fri Dec 31 23:53:56 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Fri, 31 Dec 1999 17:53:56 -0500
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <386C9121.E9D9DC01@lemburg.com>
Message-ID: <000901bf53e1$eb4248c0$472d153f@tim>

>> This is why I do complex string processing in Icon <0.9 wink>.

[MAL]
> You can have all that extra magic via callable tag objects
> or callable matching functions. It's not exactly nice to
> write, but I'm sure that a meta-language could do the
> conversions for you.

That wasn't my point:  I do it in Icon because it *is* "exactly nice to
write", and doesn't require any yet-another meta-language.  It's all
straightforward, in a way that separate schemes pasted together can never be
(simply because they *are* "separate schemes pasted together" <wink>).

The point of my Python examples wasn't that they could do something
mxTextTools can't do, but that they were *Python* examples:  every variation
I mentioned (or that you're likely to think of) was easy to handle for any
Python programmer because the "control flow" and "data type" etc aspects
could be handled exactly the way they always are in *non* pattern-matching
Python code too, rather than recoded in pattern-scheme-specific different
ways (e.g., where I had a vanilla "if/break", you set up a special
exception to tickle the matching engine).

I'm not attacking mxTextTools, so don't feel compelled to defend it --
people using regexps in those examples are dead in the water.  mxTextTools
is very good at what it does; if we have a real disagreement, it's probably
that I'm less optimistic about the prospects for higher-level wrappers
(e.g., MikeF's SimpleParse is much slower than "a real" BNF parsing system
(ARBNFPS), in part because he isn't doing all the optimizations ARBNFPS
does, but also in part because ARBNFPS uses an underlying engine more
optimized to its specific task than mxTextTool's more-general engine *can*
be).  So I don't see mxTextTools as being the answer to everything -- and if
you hadn't written it, you would agree with that on first glance <wink>.

> Anyway, I'll keep focussing on the speed aspect of mxTextTools;
> others can focus on abstractions, so that eventually everybody
> will be happy :-)

You and I will be, anyway <wink>.


From guido at CNRI.Reston.VA.US  Wed Dec  1 18:32:08 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 12:32:08 -0500
Subject: [Python-Dev] Another 1.6 wish
In-Reply-To: Your message of "Fri, 19 Nov 1999 14:59:11 CST."
             <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com> 
References: <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com> 
Message-ID: <199912011732.MAA10419@eric.cnri.reston.va.us>

> My first Python-Dev post.  :-)

Welcome!

> >We had some discussion a while back about enabling thread support by
> >default, if the underlying OS supports it obviously.  

I agree with this.  MacOS seems to be the only OS without threads
these days.

> What's the consensus about Python microthreads -- a likely candidate
> for incorporation in 1.6 (or later)?

What are microthreads?  If you think about threads implemented in the
Python VM instead of in the OS, forget it.

> Also, we have a couple minor convenience functions for Python in an 
> MSDEV environment, an exposure of OutputDebugString for writing to 
> the DevStudio log window and a means of tripping DevStudio C/C++ layer
> breakpoints from Python code (currently experimental).  The msvcrt 
> module seems like a likely candidate for these, would these be 
> welcome additions?

Sure -- send patches.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From petrilli at amber.org  Wed Dec  1 18:39:00 1999
From: petrilli at amber.org (Christopher Petrilli)
Date: Wed, 1 Dec 1999 12:39:00 -0500
Subject: [Python-Dev] Another 1.6 wish
In-Reply-To: <199912011732.MAA10419@eric.cnri.reston.va.us>; from guido@CNRI.Reston.VA.US on Wed, Dec 01, 1999 at 12:32:08PM -0500
References: <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com> <199912011732.MAA10419@eric.cnri.reston.va.us>
Message-ID: <19991201123900.A7419@trump.amber.org>

Guido van Rossum [guido at CNRI.Reston.VA.US] wrote:
> > >We had some discussion a while back about enabling thread support by
> > >default, if the underlying OS supports it obviously.  
> 
> I agree with this.  MacOS seems to be the only OS without threads
> these days.

I believe the new GUISI package has pthread-API compatible threads
implemented, which talk to the underlying ThreadManager.  With MacOSX
being impending before 1.6 (i.e. early 2000), I'd say this is a good
way to go.  Threads are VERY useful for a lot of problem domains.

Chris
-- 
| Christopher Petrilli
| petrilli at amber.org


From guido at CNRI.Reston.VA.US  Wed Dec  1 18:54:53 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 12:54:53 -0500
Subject: [Python-Dev] Another 1.6 wish
In-Reply-To: Your message of "Wed, 01 Dec 1999 12:39:00 EST."
             <19991201123900.A7419@trump.amber.org> 
References: <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com> <199912011732.MAA10419@eric.cnri.reston.va.us>  
            <19991201123900.A7419@trump.amber.org> 
Message-ID: <199912011754.MAA10465@eric.cnri.reston.va.us>

> > I agree with this.  MacOS seems to be the only OS without threads
> > these days.
> 
> I believe the new GUISI package has pthread-API compatible threads
> implemented, which talk to the underlying ThreadManager.  With MacOSX
> being impending before 1.6 (i.e. early 2000), I'd say this is a good
> way to go.  Threads are VERY useful for a lot of problem domains.

What's GUISI?  The son of GUSI?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido at CNRI.Reston.VA.US  Wed Dec  1 18:55:19 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 12:55:19 -0500
Subject: [Python-Dev] Another 1.6 wish
In-Reply-To: Your message of "Wed, 01 Dec 1999 12:32:08 EST."
             <199912011732.MAA10419@eric.cnri.reston.va.us> 
References: <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com>  
            <199912011732.MAA10419@eric.cnri.reston.va.us> 
Message-ID: <199912011755.MAA10476@eric.cnri.reston.va.us>

> > Also, we have a couple minor convenience functions for Python in an 
> > MSDEV environment, an exposure of OutputDebugString for writing to 
> > the DevStudio log window and a means of tripping DevStudio C/C++ layer
> > breakpoints from Python code (currently experimental).  The msvcrt 
> > module seems like a likely candidate for these, would these be 
> > welcome additions?
> 
> Sure -- send patches.

I hadn't seen Mark Hammond's response -- I take it back.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido at CNRI.Reston.VA.US  Wed Dec  1 19:15:26 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 13:15:26 -0500
Subject: [Python-Dev] Another 1.6 wish
In-Reply-To: Your message of "Sat, 20 Nov 1999 11:04:28 +1100."
             <005f01bf32ea$d0b82b90$0501a8c0@bobcat> 
References: <005f01bf32ea$d0b82b90$0501a8c0@bobcat> 
Message-ID: <199912011815.NAA10506@eric.cnri.reston.va.us>

> This is really a pointer to the fact that some or all of the win32api
> should be moved into the core - registry access is the thing people
> most want, but there are plenty of other useful things that people
> regularly use...
> 
> Guido objects to the coding style, but hopefully that won't be a big
> issue.  IMO, the coding style isn't "bad" - it is just more an "MS"
> flavour than a "Python" flavour - presumably people reading the code
> will have some experience with Windows, so it won't look completely
> foreign to them.  The good thing about taking it "as-is" is that it
> has been fairly well bashed on over a few years, so is really quite
> stable.  The final "coding style" issue is that there are no "doc
> strings" - all documentation is embedded in C comments, and extracted
> using a tool called "autoduck" (similar to "autodoc").  However, I'm
> sure we can arrange something there, too.

That's a good summary of the status quo.  I would appreciate it if
win32all could become part of the core.  However the coding style
issues need to be addressed (I also believe that it needs to be
compiled in C++ mode).  One concern that Mark doesn't mention is that
there are some safety issues -- you can abuse some of the calls to
cause segfaults, whether intentional or by mistake, and that's not a
good thing.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido at CNRI.Reston.VA.US  Wed Dec  1 19:55:40 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 13:55:40 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Your message of "Wed, 24 Nov 1999 09:43:57 EST."
             <383BF9AD.E183FB98@interet.com> 
References: <Pine.LNX.4.10.9911231549120.10639-100000@nebula.lyra.org>  
            <383BF9AD.E183FB98@interet.com> 
Message-ID: <199912011855.NAA10662@eric.cnri.reston.va.us>

> I would like to argue that on Windows, import of dynamic libraries is
> broken.  If a file something.pyd is imported, then sys.path is searched
> to find the module.  If a file something.dll is imported, the same thing
> happens.  But Windows defines its own search order for *.dll files which
> Python ignores.  I would suggest that this is wrong for files named
> *.dll, but OK for files named *.pyd.

I think you misunderstand some of the issues.

Python cannot import every .dll file.  Only .dll files that conform to
the convention for Python extension modules can be imported.  (The
convention is that it must export an init<module> function.)

On most other platforms, shared libraries must have a specific
extension (e.g. .so on most Unix).  Python allows you to drop such a
file into any directory where it looks for modules, and it will then
direct the dynamic load support to load that specific file.

This seems logical -- Python extensions must live in directories that
Python searches (Python must do its own search because the search
order is significant).

On Windows, Python uses the same strategy.  The only modification is
that it is allowed to give the file a different extension, namely
.pyd, to indicate that this really is a Python extension and not a
regular DLL.  This was mostly introduced because it is apparently
common to have an existing DLL "foo.dll" and write a Python wrapper
for it that is also called "foo".  Clearly, two files foo.dll are too
confusing, so we let you name the wrapper foo.pyd.  But because the
file format is essentially that of a DLL, we don't *require* this
renaming; some ways of creating DLLs in the first place may make it
difficult to do.

> A SysAdmin should be able to install and maintain *.dll as she has
> been trained to do.  This makes maintaining Python installations
> simpler and more un-surprising.

I don't see that a SysAdmin needs to do much DLL management.  This is
up to installer scripts.  Anyway how hard can it be for a SysAdmin to
leave DLLs in specific directories alone?

> I have no solution to the backward compatibility problem.  But the
> code is only a couple lines.  A LoadLibrary() call does its own
> path searching.

But at what point should this LoadLibrary() call be called?  The
import statement contains no clue that a DLL is requested -- the
sys.path search reveals that.

I claim that there is nothing wrong with the current strategy.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bwarsaw at cnri.reston.va.us  Wed Dec  1 20:01:12 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Wed, 1 Dec 1999 14:01:12 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS log messages with diffs
References: <199911161700.MAA02716@eric.cnri.reston.va.us>
	<14389.31511.706588.20840@anthem.cnri.reston.va.us>
Message-ID: <14405.28792.184298.298597@anthem.cnri.reston.va.us>

>>>>> "BAW" == Barry A Warsaw <bwarsaw at cnri.reston.va.us> writes:

    BAW> There was a suggestion to start augmenting the checkin emails
    BAW> to include the diffs of the checkin.  This would let you keep
    BAW> a current snapshot of the tree without having to do a direct
    BAW> `cvs update'.

The voting has stopped, with the "yeah" vote slightly ahead of the
"nay" vote.  We'll go with context diffs, and we'll be implementing
Greg Stein's approach with the xml-checkins list: truncating diffs to
H number of lines at the top and T number of lines at the bottom, so
as not to overwhelm incoming email.
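
The head/tail truncation described above can be sketched roughly as
follows.  This is only an illustration in present-day Python syntax:
the function name, the default values for H and T, and the wording of
the marker line are all assumptions, not the script that will actually
be installed.

```python
def truncate_diff(lines, head=3, tail=2):
    """Keep the first `head` and last `tail` lines of a diff.

    Everything in between is replaced by one marker line that says how
    many lines were dropped.  Diffs that already fit are returned as-is.
    (`head`, `tail` and the marker text are illustrative choices.)
    """
    lines = list(lines)
    if len(lines) <= head + tail:
        return lines
    omitted = len(lines) - head - tail
    marker = "[...%d lines suppressed...]" % omitted
    return lines[:head] + [marker] + lines[len(lines) - tail:]
```

Called on a ten-line diff with head=3 and tail=2, this keeps five
original lines and inserts one marker line between them.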

I'll try to get this going sometime today (no promises).  You'll
likely see a number of tests coming through python-checkins in the
meantime.  I'll send a message out when it's done.

-Barry


From da at ski.org  Wed Dec  1 20:34:56 1999
From: da at ski.org (David Ascher)
Date: Wed, 1 Dec 1999 11:34:56 -0800 (Pacific Standard Time)
Subject: [Python-Dev] Re: [C++-SIG] Python calling C++ issues
In-Reply-To: <14405.25141.297349.76968@gargle.gargle.HOWL>
Message-ID: <Pine.WNT.4.04.9912011132130.222-100000@rigoletto.ski.org>

On Wed, 1 Dec 1999, Geoffrey Furnish wrote:

[...]

> Well, like I said above, I haven't analyzed your posts for technical
> details, so I can't say whether you made avoidable mistakes.  But I
> definitely do agree with you that it is roughly 100 times harder than
> it needs to be, to use Python from C++.  The charter of this sig is to 
> fix that, by developing the additional software that would allow
> Python's compiled interface to be exploited from C++ "with ease".
> 
> The first and most basic issue, is compiling Python so it initializes
> C++ global objects correctly.  There is a patch on the sig's www site
> to help with that.

Any opinions from this esteemed body re: integrating said patch in the
main tree?

--david


From jim at interet.com  Wed Dec  1 20:47:14 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Wed, 01 Dec 1999 14:47:14 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9911231549120.10639-100000@nebula.lyra.org>  
	            <383BF9AD.E183FB98@interet.com> <199912011855.NAA10662@eric.cnri.reston.va.us>
Message-ID: <38457B42.85552AC@interet.com>

Guido van Rossum wrote:
> 
> > I would like to argue that on Windows, import of dynamic libraries is
> > broken.  If a file something.pyd is imported, then sys.path is searched
> > to find the module.  If a file something.dll is imported, the same thing
> > happens.  But Windows defines its own search order for *.dll files which
> > Python ignores.  I would suggest that this is wrong for files named
> > *.dll, but OK for files named *.pyd.
> 
> I think you misunderstand some of the issues.
> 
> Python cannot import every .dll file.  Only .dll files that conform to
> the convention for Python extension modules can be imported.  (The
> convention is that it must export an init<module> function.)

Of course I meant that the test is LoadLibrary(module) followed
by GetProcAddress(h, "init" + module).  Both must succeed.

> This seems logical -- Python extensions must live in directories that
> Python searches (Python must do its own search because the search
> order is significant).

The PYTHONPATH search path is what I am trying to get away
from.  If I eliminate PYTHONPATH I still can not use the
Windows DLL search path (which is superior) because DLLs
are searched on PYTHONPATH too; thus my post.  I don't believe
it is important for Python module.dll to be located on PYTHONPATH.

> > A SysAdmin should be able to install and maintain *.dll as she has
> > been trained to do.  This makes maintaining Python installations
> > simpler and more un-surprising.
> 
> I don't see that a SysAdmin needs to do much DLL management.  This is
> up to installer scripts.  Anyway how hard can it be for a SysAdmin to
> leave DLLs in specific directories alone?

The problem is maintaining PYTHONPATH plus having DLL's on a
non-standard search path.  Yes, PythonDev[:] and professional
SysAdmins can do it.  But it is not as simple as it could be.
Someone has to write the install scripts.  And what if something
doesn't work?  Think of Python being used as a teaching language
for the 8th grade.  Think of the 8th grade teacher trying to get
all this right.  The only thing that works is simplicity.

> But at what point should this LoadLibrary() call be called?  The
> import statement contains no clue that a DLL is requested -- the
> sys.path search reveals that.

Just after built-in and frozen modules.

> I claim that there is nothing wrong with the current strategy.

Thank you for thoughtfully considering and commenting at length
on this issue.  Let's ignore it for the moment.  The other
problems with PYTHONPATH are more pressing.  But if those
issues are solved, this one will stick out.

JimA


From da at ski.org  Wed Dec  1 20:59:44 1999
From: da at ski.org (David Ascher)
Date: Wed, 1 Dec 1999 11:59:44 -0800 (Pacific Standard Time)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <38457B42.85552AC@interet.com>
Message-ID: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org>

On Wed, 1 Dec 1999, James C. Ahlstrom wrote:

> > This seems logical -- Python extensions must live in directories that
> > Python searches (Python must do its own search because the search
> > order is significant).
> 
> The PYTHONPATH search path is what I am trying to get away
> from.  If I eliminate PYTHONPATH I still can not use the
> Windows DLL search path (which is superior) because DLLs
> are searched on PYTHONPATH too; thus my post.  I don't believe
> it is important for Python module.dll to be located on PYTHONPATH.

Why is the DLL search path superior?  

In my experience, the DLL search path (PATH for short) is problematic
because it requires either using the System control panel or modifying
autoexec.bat, both of which can have massive systemic effects completely
unrelated to Python if a mistake is made during the modification.

On UNIX, the equivalent to Windows' PATH is typically LD_LIBRARY_PATH,
although I think there are significant variations in how that works across
platforms.  Most beginning unix users have no idea how to modify their
LD_LIBRARY_PATH, as they typically don't understand the configuration
mechanisms on Unix (system vs. user-specific, login vs. shell-specific,
different shell configuration languages, etc.).

I know it's not what you had in mind, but have you tried doing something
like:

  import sys, os, string
  sys.path.extend(string.split(os.environ['PATH'], ';'))

--david


From gmcm at hypernet.com  Wed Dec  1 21:19:13 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Wed, 1 Dec 1999 15:19:13 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org>
References: <38457B42.85552AC@interet.com>
Message-ID: <1268042932-41354568@hypernet.com>

David Ascher wrote:
> On Wed, 1 Dec 1999, James C. Ahlstrom wrote:
> 
> > > This seems logical -- Python extensions must live in
> > > directories that Python searches (Python must do its own
> > > search because the search order is significant).
> > 
> > The PYTHONPATH search path is what I am trying to get away
> > from.  If I eliminate PYTHONPATH I still can not use the
> > Windows DLL search path (which is superior) because DLLs are
> > searched on PYTHONPATH too; thus my post.  I don't believe it
> > is important for Python module.dll to be located on PYTHONPATH.
> 
> Why is the DLL search path superior?  
> 
> In my experience, the DLL search path (PATH for short) 

Make that:
 [ os.path.dirname(sys.executable),
   os.getcwd(),
   win32api.GetSystemDirectory(),
   os.path.join(win32api.GetSystemDirectory(), '../SYSTEM'), 
   win32api.GetWindowsDirectory()
 ] + string.split(os.environ['PATH'], ';')

> is
> problematic because it requires either using the System control
> panel or modifying autoexec.bat, both of which can have massive
> systemic effects completely unrelated to Python if a mistake is
> made during the modification.

Hear, hear!

[snip]


- Gordon


From jim at interet.com  Wed Dec  1 21:36:04 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Wed, 01 Dec 1999 15:36:04 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org>
Message-ID: <384586B4.48905B32@interet.com>

David Ascher wrote:

> Why is the DLL search path superior?
> 
> In my experience, the DLL search path (PATH for short) is problematic
> because it requires either using the System control panel or modifying
> autoexec.bat, both of which can have massive systemic effects completely
> unrelated to Python if a mistake is made during the modification.

I agree that altering PATH is problematic.  So is altering PYTHONPATH
and for exactly the same reason.  That is why I think PYTHONPATH is
a bad idea.

The reason the DLL search path is superior is that it is not just PATH.
It defines a path which includes the install directory of the
application plus the system directories, and this path is discovered at
runtime.  So it is not necessary to set a global PYTHONPATH, nor make
registry entries, nor do anything at all.  It Just Works.

The Windows DLL search path is:

1) The directory of the executable program.  That means you can just
   throw all your DLL's in with the *.exe's, and it all Just Works.

2) The current directory.  Also useful.

3) The Windows system directory (call GetSystemDirectory() to get this).
4) The Windows directory (call GetWindowsDirectory() to get this).

   These two directories are used for system files.  Think of /sbin,
   /bin.  Windows apps usually throw some of their DLL's here,
   especially if they are of general interest.

5) The directories in PATH.  This is relatively useless, and AFAIK it
   is seldom used in a real installation.  It is a left-over from DOS.
   That is also why it appears last.
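
The five steps above can be written down as a small sketch.  It does
not call any Windows API: the system and Windows directories are
passed in as plain strings, and the function names and the `exists`
callback are made up for illustration.  It only mirrors the order as
listed in this message, in present-day Python syntax.

```python
import ntpath  # Windows path rules, so the sketch runs on any platform


def dll_search_order(exe_path, cwd, system_dir, windows_dir, path_var):
    """Return the directories in the order listed above (steps 1 to 5)."""
    order = [
        ntpath.dirname(exe_path),  # 1) directory of the executable
        cwd,                       # 2) current directory
        system_dir,                # 3) Windows system directory
        windows_dir,               # 4) Windows directory
    ]
    # 5) the PATH entries, skipping empty ones such as "a;;b"
    order.extend(entry for entry in path_var.split(";") if entry)
    return order


def find_dll(name, order, exists):
    """Return the first candidate for which `exists` is true, else None."""
    for directory in order:
        candidate = ntpath.join(directory, name)
        if exists(candidate):
            return candidate
    return None
```

Because the first hit wins, a DLL sitting next to the executable
shadows one of the same name in a system directory or on PATH.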

> On UNIX, the equivalent to Windows' PATH is typically LD_LIBRARY_PATH,
> although I think there are significant variations in how that works across
> platforms.  Most beginning unix users have no idea how to modify their
> LD_LIBRARY_PATH, as they typically don't understand the configuration
> mechanisms on Unix (system vs. user-specific, login vs. shell-specific,
> different shell configuration languages, etc.).

I agree.

> 
> I know it's not what you had in mind, but have you tried doing something
> like:
> 
>   import sys, os, string
>   sys.path.extend(string.split(os.environ['PATH'], ';'))

Adding PATH (or anything else) to PYTHONPATH is making it worse.  Have
you tried "import sys; print sys.path" on Windows?  It is junk.

JimA


From jim at interet.com  Wed Dec  1 21:44:00 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Wed, 01 Dec 1999 15:44:00 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <38457B42.85552AC@interet.com> <1268042932-41354568@hypernet.com>
Message-ID: <38458890.BCB36FE2@interet.com>

Gordon McMillan wrote:

> Make that:
>  [ os.path.dirname(sys.executable),
>    os.getcwd(),
>    win32api.GetSystemDirectory(),
>    os.path.join(win32api.GetSystemDirectory(), '../SYSTEM'),
>    win32api.GetWindowsDirectory()
>  ] + string.split(os.environ['PATH'], ';')

Very nice!  "../SYSTEM" needed on NT I guess.

JimA


From fredrik at pythonware.com  Wed Dec  1 21:56:16 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 1 Dec 1999 21:56:16 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org> <384586B4.48905B32@interet.com>
Message-ID: <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com>

James C. Ahlstrom <jim at interet.com> wrote:
> Adding PATH (or anything else) to PYTHONPATH is making it worse.  Have
> you tried "import sys; print sys.path" on Windows?  It is junk.

not on my machine.

it would help if you stopped assuming that every-
one have the same problems as you have.  we've
distributed several python apps on windows, and
frankly, I don't understand what you're talking
about.

</F>


From jim at interet.com  Wed Dec  1 22:26:37 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Wed, 01 Dec 1999 16:26:37 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org> <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com>
Message-ID: <3845928D.C0462322@interet.com>

Fredrik Lundh wrote:

> > you tried "import sys; print sys.path" on Windows?  It is junk.
> 
> not on my machine.

On my Windows machine I get:

['', '.', 'N:/prd/winlease/vest', '.\\DLLs', '.\\lib',
  '.\\lib\\plat-win', '.\\lib\\lib-tk', 'f:\\bin']

PYTHONPATH is N:/prd/winlease/vest.
os.path.dirname(sys.executable) is F:/bin.
The others are junk.  What do you get?  Did
you change sys.path from the default?

> it would help if you stopped assuming that every-
> one have the same problems as you have.  we've
> distributed several python apps on windows, and
> frankly, I don't understand what you're talking
> about.

We distribute our app by freezing all *.py files
into a DLL, and we don't set PYTHONPATH on the
target machine.  The files are located with the
executable file and are found there.  This works
fine and we don't have a problem with it.

It would help me a lot if you could describe how you
distribute your app.  Do you set PYTHONPATH on the
target machine?

JimA


From da at ski.org  Wed Dec  1 22:41:31 1999
From: da at ski.org (David Ascher)
Date: Wed, 1 Dec 1999 13:41:31 -0800 (Pacific Standard Time)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <384586B4.48905B32@interet.com>
Message-ID: <Pine.WNT.4.04.9912011251250.254-100000@rigoletto.ski.org>

On Wed, 1 Dec 1999, James C. Ahlstrom wrote:

> > In my experience, the DLL search path (PATH for short) is problematic
> > because it requires either using the System control panel or modifying
> > autoexec.bat, both of which can have massive systemic effects completely
> > unrelated to Python if a mistake is made during the modification.
> 
> I agree that altering PATH is problematic.  So is altering PYTHONPATH
> and for exactly the same reason.  That is why I think PYTHONPATH is
> a bad idea.

I see.  Thanks for the explanation. I didn't know the complete story of
the "Windows DLL search path".  BTW, I think a huge difference b/w
PYTHONPATH and PATH is the system-wide nature of PATH, vs. the
Python-restricted nature of PYTHONPATH.

--david


From mhammond at skippinet.com.au  Wed Dec  1 23:29:38 1999
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Thu, 2 Dec 1999 09:29:38 +1100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <Pine.WNT.4.04.9912011251250.254-100000@rigoletto.ski.org>
Message-ID: <009c01bf3c4b$8f119090$0501a8c0@bobcat>

> I see.  Thanks for the explanation. I didn't know the
> complete story of
> the "Windows DLL search path".  BTW, I think a huge difference b/w
> PYTHONPATH and PATH is the system-wide nature of PATH, vs. the
> Python-restricted nature of PYTHONPATH.

And more to the point - and the critical distinction - is that
PYTHONPATH is actually specific to the Python _app_, not just Python
on the machine.

Sure - the standard Python installation puts a "default" PYTHONPATH
suitable for general purpose development - but any distributed
application _can_ define their own PYTHONPATH that is independent of
any other Python systems or applications.  People have been doing this
for years, including MS :-)

Sorry Jim, but count this as another vote against it - which isn't to
argue that the current system is perfect, simply (IMO) better than the
Windows path and DLL search order.

Mark.


From guido at CNRI.Reston.VA.US  Thu Dec  2 00:00:21 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 18:00:21 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Your message of "Wed, 01 Dec 1999 16:26:37 EST."
             <3845928D.C0462322@interet.com> 
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org> <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com>  
            <3845928D.C0462322@interet.com> 
Message-ID: <199912012300.SAA10861@eric.cnri.reston.va.us>

> Fredrik Lundh wrote:
> 
> > > you tried "import sys; print sys.path" on Windows?  It is junk.
> > 
> > not on my machine.
> 
> On my Windows machine I get:
> 
> ['', '.', 'N:/prd/winlease/vest', '.\\DLLs', '.\\lib',
>   '.\\lib\\plat-win', '.\\lib\\lib-tk', 'f:\\bin']
> 
> PYTHONPATH is N:/prd/winlease/vest.
> os.path.dirname(sys.executable) is F:/bin.
> The others are junk.  What do you get?  Did
> you change sys.path from the default?

You must not have used the standard Python installer; if you had used
it you wouldn't have had this problem (and perhaps we wouldn't have
had this discussion).

The problem is that you apparently have installed python.exe in
f:\bin.  "Modern" Python versions execute some code at startup that
comes up with a suitable value for sys.path; the Windows version of
this code is in PC/getpathp.c -- I recommend that you study it.  This
code tries to find the Python install directory by looking for a
"landmark" file relative to the executable path, and then adds a bunch
of directory entries to the path relative to the install directory.
If it fails, it defaults to "." for the install directory.  The
entries '.\\DLLs', '.\\lib', '.\\lib\\plat-win', '.\\lib\\lib-tk' are
all a result of this failing.
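
To make the failure mode concrete, here is a rough sketch of a
landmark search in present-day Python.  It is not a transcription of
PC/getpathp.c: the landmark file name, the upward walk through parent
directories, and the function names are assumptions made for
illustration.  Only the "." fallback and the four relative entries
come from the description above.

```python
import ntpath  # Windows path rules, so the sketch runs on any platform

# Hypothetical landmark: a file expected under the install directory.
LANDMARK = ntpath.join("lib", "string.py")
SUBDIRS = ("DLLs", "lib", ntpath.join("lib", "plat-win"),
           ntpath.join("lib", "lib-tk"))


def find_install_dir(exe_path, exists):
    """Walk up from the executable's directory looking for the landmark.

    `exists` is a callable reporting whether a path exists; passing it
    in keeps the sketch testable.  Falls back to "." when not found.
    """
    directory = ntpath.dirname(exe_path)
    while True:
        if exists(ntpath.join(directory, LANDMARK)):
            return directory
        parent = ntpath.dirname(directory)
        if parent == directory:  # reached the drive root (or "")
            return "."
        directory = parent


def default_path(install_dir):
    """Build the path entries relative to the install directory."""
    return [ntpath.join(install_dir, sub) for sub in SUBDIRS]
```

With python.exe in f:\bin and no landmark anywhere above it, the
search falls back to ".", and default_path(".") yields the four
relative entries quoted above.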

As long as this works, there is no need for the user (or anyone) to
ever set the PYTHONPATH variable -- that variable is only needed to
add directories in front of sys.path for stuff that getpathp.c doesn't
know about (e.g. PIL, Numeric, etc.).  With packagized versions of
those modules, even that won't be necessary, because the packages will
be dropped in the Python install directory (typically C:\Program
Files\Python).

I believe that most of your desire to get rid of PYTHONPATH comes from
your insistence on bypassing the default installer.  There's probably a
way to install your app in such a way that the getpathp.c algorithm
actually succeeds?  There's also a separate env variable, PYTHONHOME,
which overrides the Python install directory; if getpathp.c sees that
it is set, it will bypass the search relative to the executable's
path.

I take blame for not documenting all this well enough.  However I wish
you stopped criticizing the design -- I think the design is quite
solid.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido at CNRI.Reston.VA.US  Thu Dec  2 00:09:43 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 18:09:43 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Your message of "Wed, 01 Dec 1999 14:47:14 EST."
             <38457B42.85552AC@interet.com> 
References: <Pine.LNX.4.10.9911231549120.10639-100000@nebula.lyra.org> <383BF9AD.E183FB98@interet.com> <199912011855.NAA10662@eric.cnri.reston.va.us>  
            <38457B42.85552AC@interet.com> 
Message-ID: <199912012309.SAA10873@eric.cnri.reston.va.us>

> > This seems logical -- Python extensions must live in directories that
> > Python searches (Python must do its own search because the search
> > order is significant).
> 
> The PYTHONPATH search path is what I am trying to get away
> from.  If I eliminate PYTHONPATH I still can not use the
> Windows DLL search path (which is superior) because DLLs
> are searched on PYTHONPATH too; thus my post.  I don't believe
> it is important for Python module.dll to be located on PYTHONPATH.

But I do.

First of all, I'm not sure whether you're talking here about sys.path
or PYTHONPATH.  As I explained in a previous post, you should normally
not have to set PYTHONPATH at all.  Let's assume you really meant
sys.path.

Let's assume sys.path is [A, B].  Let's assume there's a foo.py and a
foo.dll.  If foo.py lives in A and foo.dll lives in B, then import foo
should load foo.py.  If it's the other way around, it should load
foo.dll.  If we were to use the default DLL search path, there's no
way that we can get this behavior: either you have to look for a DLL
first, which means there's no way for foo.py to override foo.dll, or
you have to look for a DLL last, and then there's no way for a foo.dll
to override foo.py.  It is desirable that both overrides are possible:
we want to be able to have foo.dll override foo.py, because perhaps
foo.py should only be used when for some reason foo.dll can't be
loaded (say foo.py does the same thing only slower); but we also want
to be able to have foo.py override foo.dll (by simply placing it in a
directory that's earlier on the path) e.g. in a situation where the
dll version does something undesirable and we want to create a safe
substitute.  (Deleting files is not always an option.)
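
A small sketch of this argument, in present-day Python: searching a
single ordered list of directories lets either file win, depending
only on which directory comes first.  The function names and the
order of suffixes tried within one directory are assumptions of the
sketch, not the interpreter's actual import code.

```python
import os
import tempfile

# Suffixes tried inside each directory.  The order within a directory
# is an assumption of this sketch.
SUFFIXES = (".pyd", ".dll", ".py")


def find_module_file(name, search_path, suffixes=SUFFIXES):
    """Return the first file for `name` found while walking `search_path`."""
    for directory in search_path:
        for suffix in suffixes:
            candidate = os.path.join(directory, name + suffix)
            if os.path.isfile(candidate):
                return candidate
    return None


def demo():
    """foo.py lives in A, foo.dll in B; try both path orders."""
    with tempfile.TemporaryDirectory() as root:
        a = os.path.join(root, "A")
        b = os.path.join(root, "B")
        os.mkdir(a)
        os.mkdir(b)
        open(os.path.join(a, "foo.py"), "w").close()
        open(os.path.join(b, "foo.dll"), "w").close()
        first = os.path.basename(find_module_file("foo", [a, b]))
        second = os.path.basename(find_module_file("foo", [b, a]))
        return first, second
```

Here demo() returns ("foo.py", "foo.dll"): with the path [A, B] the
.py file wins, with [B, A] the DLL wins.  A fixed "DLLs first" or
"DLLs last" rule could produce only one of those two outcomes.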

> The problem is maintaining PYTHONPATH plus having DLL's on a
> non-standard search path.

I've commented already that PYTHONPATH maintenance is probably a red
herring due to your non-standard install.  I'm not sure what the
problem is with having a DLL on a non-std path?

> Yes, PythonDev[:] and professional
> SysAdmins can do it.  But it is not as simple as it could be.
> Someone has to write the install scripts.

The distutil-sig (a.k.a. Greg Ward :-) is taking care of this as we
speak.

> And what if something
> doesn't work?  Think of Python being used as a teaching language
> for the 8th grade.  Think of the 8th grade teacher trying to get
> all this right.  The only thing that works is simplicity.

We will provide an installer that Just Works [tm].

> > But at what point should this LoadLibrary() call be called?  The
> > import statement contains no clue that a DLL is requested -- the
> > sys.path search reveals that.
> 
> Just after built-in and frozen modules.

See my long comment above.

> > I claim that there is nothing wrong with the current strategy.
> 
> Thank you for thoughtfully considering and commenting at length
> on this issue.  Let's ignore it for the moment.  The other
> problems with PYTHONPATH are more pressing.  But if those
> issues are solved, this one will stick out.

And those other issues should be resolved in a different way than what
you have been proposing.  See other post.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido at CNRI.Reston.VA.US  Thu Dec  2 00:11:28 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 18:11:28 -0500
Subject: [Python-Dev] Re: [C++-SIG] Python calling C++ issues
In-Reply-To: Your message of "Wed, 01 Dec 1999 11:34:56 PST."
             <Pine.WNT.4.04.9912011132130.222-100000@rigoletto.ski.org> 
References: <Pine.WNT.4.04.9912011132130.222-100000@rigoletto.ski.org> 
Message-ID: <199912012311.SAA10888@eric.cnri.reston.va.us>

> > The first and most basic issue, is compiling Python so it initializes
> > C++ global objects correctly.  There is a patch on the sig's www site
> > to help with that.
> 
> Any opinions from this esteemed body re: integrating said patch in the
> main tree?

I presume you meant me :-)

I'll give it a try tonight.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jeremy at cnri.reston.va.us  Thu Dec  2 00:24:06 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Wed, 1 Dec 1999 18:24:06 -0500 (EST)
Subject: [Python-Dev] copies of python-dev messages between 11/24 and 12/01
Message-ID: <14405.44566.832799.96438@goon.cnri.reston.va.us>

It looks like there has been some mail glitch that resulted in no
digests being sent between 11/26 and 12/01 and no messages being
archived between 11/24 and 12/01.  Does anyone keep a personal archive
that has those messages?  I'd like to read them.

Jeremy


From guido at CNRI.Reston.VA.US  Thu Dec  2 00:28:14 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 18:28:14 -0500
Subject: [Python-Dev] copies of python-dev messages between 11/24 and 12/01
In-Reply-To: Your message of "Wed, 01 Dec 1999 18:24:06 EST."
             <14405.44566.832799.96438@goon.cnri.reston.va.us> 
References: <14405.44566.832799.96438@goon.cnri.reston.va.us> 
Message-ID: <199912012328.SAA12879@eric.cnri.reston.va.us>

> It looks like there has been some mail glitch that resulted in no
> digests being sent between 11/26 and 12/01 and no messages being
> archived between 11/24 and 12/01.  Does anyone keep a personal archive
> that has those messages?  I'd like to read them.

I do :-)

I'll provide Jeremy with an archive.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bwarsaw at cnri.reston.va.us  Thu Dec  2 05:24:03 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Wed, 1 Dec 1999 23:24:03 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS log messages with diffs
References: <199911161700.MAA02716@eric.cnri.reston.va.us>
	<14389.31511.706588.20840@anthem.cnri.reston.va.us>
Message-ID: <14405.62563.345566.500106@anthem.cnri.reston.va.us>

Okay folks, I think I've got the diff thing working now.  The trick
(for you CVS heads) was that you can't do a `cvs diff' while you're
executing a loginfo script.  Lock contention (repeat after me: "I Love
CVS!").  Anyway, let's see how you all like it.

Note that based on a suggestion by Greg Stein, seconded by GvR, I do
not send out the entire diff of every file (which could potentially be
huge).  I send out 20 lines from the head of the diff and 20 lines
from the tail, and suppress everything in between.  Those numbers can
be easily tweaked, and I'm not sure what the ideal is.  Let's see what
the emails look like when stuff starts getting checked in.

Enjoy,
-Barry


From jack at oratrix.nl  Thu Dec  2 12:00:45 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Thu, 02 Dec 1999 12:00:45 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order 
In-Reply-To: Message by Guido van Rossum <guido@CNRI.Reston.VA.US> ,
	     Wed, 01 Dec 1999 18:09:43 -0500 , <199912012309.SAA10873@eric.cnri.reston.va.us> 
Message-ID: <19991202110045.96F33370CF2@snelboot.oratrix.nl>

On the Mac I've introduced "magic cookies" into sys.path, which allow you to 
do interesting searches (like searching for a DLL or PYC-resource in the 
application itself) at known places in the import process.

There isn't a cookie for "search along the standard MacOS dll search path" 
(which is somewhat similar to the Windows dll search path) because I haven't 
seen a reason for it, but there's nothing to stop it. And if you'd insert that 
cookie it would be perfectly clear (at least, it should be) that only dll 
modules will be found in that step, not .py modules.

Actually I'm so happy with the magic cookie scheme that I've advocated at 
various times in the past that something similar also be used for determining 
where builtin modules and frozen modules appear in sys.path...
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From guido at CNRI.Reston.VA.US  Thu Dec  2 12:59:34 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 02 Dec 1999 06:59:34 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Your message of "Thu, 02 Dec 1999 12:00:45 +0100."
             <19991202110045.96F33370CF2@snelboot.oratrix.nl> 
References: <19991202110045.96F33370CF2@snelboot.oratrix.nl> 
Message-ID: <199912021159.GAA13732@eric.cnri.reston.va.us>

> On the Mac I've introduced "magic cookies" into sys.path, which
> allow you to do interesting searches (like searching for a DLL or
> PYC-resource in the application itself) at known places in the
> import process.

> There isn't a cookie for "search along the standard MacOS dll search
> path" (which is somewhat similar to the Windows dll search path)
> because I haven't seen a reason for it, but there's nothing to stop
> it. And if you'd insert that cookie it would be perfectly clear (at
> least, it should be) that only dll modules will be found in that
> step, not .py modules.

> Actually I'm so happy with the magic cookie scheme that I've
> advocated at various times in the past that something similar also
> be used for determining where builtin modules and frozen modules
> appear in sys.path...

I see the magic cookies as a poor man's (but more compatible!) version
of a chain of importers as advocated by Greg Stein and other imputil
fans.  I like the idea, except that I think that the chain should be
manipulatable more easily than the current imputil implementation.
(I'll have more comments on Greg's comments later, when I've actually
read them through.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gstein at lyra.org  Thu Dec  2 13:09:40 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 2 Dec 1999 04:09:40 -0800 (PST)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <199912021159.GAA13732@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912020404500.18236-100000@nebula.lyra.org>

On Thu, 2 Dec 1999, Guido van Rossum wrote:
>...
> I see the magic cookies as a poor man's (but more compatible!) version
> of a chain of importers as advocated by Greg Stein and other imputil
> fans.  I like the idea, except that I think that the chain should be
> manipulatable more easily than the current imputil implementation.
> (I'll have more comments on Greg's comments later, when I've actually
> read them through.)

Anything in sys.path that is not a string pointing to a directory is not
very compatible. My current proposal keeps the existing semantics for
sys.path (the proposal adds functionality thru other mechanisms, rather
than changing/interfering with existing ones).

I look forward to your comments! I'll definitely provide new solutions
where you find problems :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From fredrik at pythonware.com  Thu Dec  2 13:53:03 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 2 Dec 1999 13:53:03 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <19991202110045.96F33370CF2@snelboot.oratrix.nl>  <199912021159.GAA13732@eric.cnri.reston.va.us>
Message-ID: <005e01bf3cc4$2e338c50$f29b12c2@secret.pythonware.com>

Guido van Rossum <guido at CNRI.Reston.VA.US> wrote:
> > Actually I'm so happy with the magic cookie scheme that I've
> > advocated at various times in the past that something similar also
> > be used for determining where builtin modules and frozen modules
> > appear in sys.path...
> 
> I see the magic cookies as a poor man's (but more compatible!) version
> of a chain of importers as advocated by Greg Stein and other imputil
> fans.  I like the idea, except that I think that the chain should be
> manipulatable more easily than the current imputil implementation.

I know this has been asked before, but cannot recall
any of the arguments against it: how about replacing
Jack's magic cookies with importer objects?

(in other words, if a path item is a string, import as
usual.  otherwise, ask the importer for a code object
or maybe better, a module object).
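
A rough sketch of that idea, in modern Python for illustration only.
DictImporter, get_module and import_from_path are hypothetical names
invented here; nothing like them exists in the current import code.

```python
import os
import types

class DictImporter:
    """Hypothetical importer object holding module source in a dict."""

    def __init__(self, sources):
        self.sources = sources

    def get_module(self, name):
        source = self.sources.get(name)
        if source is None:
            return None
        module = types.ModuleType(name)
        exec(compile(source, "<%s>" % name, "exec"), module.__dict__)
        return module

def import_from_path(name, path):
    for item in path:
        if isinstance(item, str):
            # String entry: import as usual (only .py files in this sketch).
            filename = os.path.join(item, name + ".py")
            if os.path.isfile(filename):
                module = types.ModuleType(name)
                with open(filename) as f:
                    exec(compile(f.read(), filename, "exec"), module.__dict__)
                return module
        else:
            # Anything else is asked directly for a module object.
            module = item.get_module(name)
            if module is not None:
                return module
    raise ImportError(name)

path = ["/no/such/directory", DictImporter({"spam": "answer = 42"})]
spam = import_from_path("spam", path)
```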

</F>


From jack at oratrix.nl  Thu Dec  2 14:23:31 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Thu, 02 Dec 1999 14:23:31 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order 
In-Reply-To: Message by "Fredrik Lundh" <fredrik@pythonware.com> ,
	     Thu, 2 Dec 1999 13:53:03 +0100 , <005e01bf3cc4$2e338c50$f29b12c2@secret.pythonware.com> 
Message-ID: <19991202132331.E3F8D370CF2@snelboot.oratrix.nl>

> > I see the magic cookies as a poor man's (but more compatible!) version
> > of a chain of importers as advocated by Greg Stein and other imputil
> > fans. [...]
> 
> I know this has been asked before, but cannot recall
> any of the arguments against it: how about replacing
> Jack's magic cookies with importer objects?

For the record: I definitely agree with both comments here. The only thing 
that would need solving (but maybe it already is? Greg?) is the external 
representation of an importer, as I'd definitely want to be able to name them 
in PYTHONPATH (or the mac equivalent).
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From jim at interet.com  Thu Dec  2 15:19:31 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 02 Dec 1999 09:19:31 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <009c01bf3c4b$8f119090$0501a8c0@bobcat>
Message-ID: <38467FF3.D938EE4@interet.com>

Mark Hammond wrote:

> Sure - the standard Python installation puts a "default" PYTHONPATH
> suitable for general purpose development - but any distributed
> application _can_ define their own PYTHONPATH that is independent of
> any other Python systems or applications.  People have been doing this
> for years, including MS :-)

How is this done?
 
> Sorry Jim, but count this as another vote against it - which isn't to
> argue that the current system is perfect, simply (IMO) better than the
> Windows path and DLL search order.

Sigh.....

JimA


From jim at interet.com  Thu Dec  2 16:49:10 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 02 Dec 1999 10:49:10 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org> <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com>  
	            <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us>
Message-ID: <384694F6.E5D74221@interet.com>

Guido van Rossum wrote:

> You must not have used the standard Python installer; if you had used
> it you wouldn't have had this problem (and perhaps we wouldn't have
> had this discussion).

Correct, I did not use the standard Python installer.  I compiled
Python from the source distribution.  There are good reasons for this
in my case.

First, my real issue is how to DISTRIBUTE Python programs, not to get
Python working on my own machine.  We have 12 machines on a network.
It is not acceptable to run a Python installation script on every one
of them just to run a simple Python program.  OK, I guess I could do 12,
but what about a larger company?  And we ship to hundreds of customers.
I can distribute simple C or C++ programs without a hassle, why not
Python?  It is not acceptable to ask our customers to run a separate
Python installer.  We have our own Wise installer to install our
software.  Every commercial vendor has Wise, Install Shield or other
installer in place.  No commercial vendor is going to abandon Wise et
al. and move to The Official Python Installer because it will not have
the features of Wise (such as binary patches across the network), and
because what it does won't be documented, and because it is Just
Different.

Second, I can not run ANY installer on my development machine, Python or
otherwise.  This is a general Windows problem not specific to Python.
Right now our help system is broken on every office machine except the
one where the help system installer was run (where we develop help).
If I run a Python installer, it may Just Work here.  So testing is
fine, but when I distribute the program to customers where the install
program has not been run it fails.  The installer made registry entries,
installed files, etc.  And what did it do??  No one knows.  And how do I
install at a customer site if I don't have documentation on what the
Help installer or Python installer did??  No one knows.  Who fixes it
if something goes wrong??  Hours on the phone to Help System customer
support.  Does it work on Windows 2000??  No one knows.

> f:\bin.  "Modern" Python versions execute some code at startup that
> comes up with a suitable value for sys.path; the Windows version of
> this code is in PC/getpathp.c -- I recommend that you study it.  This

> [ Highly useful discussion of startup...]

Thank you, I will study this.

> know about (e.g. PIL, Numeric, etc.).  With packagized versions of
> those modules, even that won't be necessary, because the packages will
> be dropped in the Python install directory (typically C:\Program
> Files\Python).

Yes, this is essential.  Packages must be easily installed.  I was
hoping for single file package archive files.

> I believe that most of your desire to get rid of PYTHONPATH comes from
> your insistence to bypass the default installer.

Correct, I refuse to execute the default installer.  And I am
a patient person who loves Python, so I will read getpathp.c
to see what is happening.  But other commercial developers, students,
teachers, SysAdmins etc. are not so patient.  In the interest of
promoting Python, there should be documentation on the official
way to easily install Python programs.

> There's probably a
> way to install your app in such a way that the getpathp.c algorithm
> actually succeeds?  There's also a separate env variable, PYTHONHOME,

Perhaps, and if there is it should be prominently documented in the
How to Distribute Your App section of the manual.  I
am worried about supporting versioning, but I will think about it.

> I take blame for not documenting all this well enough.  However I wish
> you stopped criticizing the design -- I think the design is quite
> solid.

Thank you for the explanation.  I will study the design again.  I
always wondered what PYTHONHOME did.

JimA


From guido at CNRI.Reston.VA.US  Thu Dec  2 17:03:09 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 02 Dec 1999 11:03:09 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Your message of "Thu, 02 Dec 1999 10:49:10 EST."
             <384694F6.E5D74221@interet.com> 
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org> <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us>  
            <384694F6.E5D74221@interet.com> 
Message-ID: <199912021603.LAA14455@eric.cnri.reston.va.us>

> Perhaps, and if there is it should be prominently documented in the
> How to Distribute Your App section of the manual.  I
> am worried about supporting versioning, but I will think about it.

Join the distutil-SIG, they are discussing just this.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal at lemburg.com  Thu Dec  2 16:48:40 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 02 Dec 1999 16:48:40 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <19991202110045.96F33370CF2@snelboot.oratrix.nl>  <199912021159.GAA13732@eric.cnri.reston.va.us> <005e01bf3cc4$2e338c50$f29b12c2@secret.pythonware.com>
Message-ID: <384694D8.DCA3D75E@lemburg.com>

Fredrik Lundh wrote:
> 
> Guido van Rossum <guido at CNRI.Reston.VA.US> wrote:
> > > Actually I'm so happy with the magic cookie scheme that I've
> > > advocated at various times in the past that something similar also
> > > be used for determining where builtin modules and frozen modules
> > > appear in sys.path...
> >
> > I see the magic cookies as a poor man's (but more compatible!) version
> > of a chain of importers as advocated by Greg Stein and other imputil
> > fans.  I like the idea, except that I think that the chain should be
> > manipulatable more easily than the current imputil implementation.
> 
> I know this has been asked before, but cannot recall
> any of the arguments against it: how about replacing
> Jack's magic cookies with importer objects?
> 
> (in other words, if a path item is a string, import as
> usual.  otherwise, ask the importer for a code object
> or maybe better, a module object).

Plus, for backward compatibility, make sure that str(importerobj)
returns something which resembles a non-existing directory.

Note that the builtin importer skips non-string entries
in sys.path, so the above will only be needed for existing
import hooks.

Still, I would like to rephrase my 0.02EUR which I already
posted twice... why not start to think about what these
importers would do first ? If there are only a handful of
wishes we could just add them to the builtin machinery and
be done with it...

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    29 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido at CNRI.Reston.VA.US  Thu Dec  2 17:28:28 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 02 Dec 1999 11:28:28 -0500
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: Your message of "Fri, 19 Nov 1999 22:43:32 EST."
             <1269053086-27079185@hypernet.com> 
References: <1269053086-27079185@hypernet.com> 
Message-ID: <199912021628.LAA14506@eric.cnri.reston.va.us>

> No success whatsoever in either direction across Samba. In 
> fact the mtime of my Linux home directory as seen from NT is 
> Jan 1, 1980.

That's only the case for an NT mount point (something of the form
\\host\name; I notice that os.stat() only believes it exists if you
append a backslash: \\host\name\).  For interior directories, at least
with the Samba version that I'm using, os.stat() seems to give correct
results.

I think that this whole issue (that doing a stat on a directory to
find out whether files in it were modified doesn't give usable
results) is widely blown out of proportion.

The only useful bit of info is that mtimes may have an up to 2 second
granularity, and that anything as recent as 2 seconds should be
considered as newer than the cache even if the cache is also less than
2 seconds.
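
As a sketch of that rule (the function name and the 2-second constant
are assumptions made for the example, not existing code):

```python
import time

GRANULARITY = 2.0  # seconds; assumed worst-case mtime resolution

def cache_is_fresh(source_mtime, cache_mtime, now=None):
    """Return True only if the cache can be trusted over the source."""
    if now is None:
        now = time.time()
    # A source touched within the last GRANULARITY seconds might have
    # been modified again without its mtime changing, so treat it as
    # newer than the cache no matter how recent the cache is.
    if now - source_mtime < GRANULARITY:
        return False
    return cache_mtime >= source_mtime
```

With now=200.0, a source stamped 199.5 is rejected even when the cache
is stamped 199.9, while a source stamped 100.0 with a cache stamped
150.0 is accepted.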

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim at interet.com  Thu Dec  2 17:28:50 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 02 Dec 1999 11:28:50 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9911231549120.10639-100000@nebula.lyra.org> <383BF9AD.E183FB98@interet.com> <199912011855.NAA10662@eric.cnri.reston.va.us>  
	            <38457B42.85552AC@interet.com> <199912012309.SAA10873@eric.cnri.reston.va.us>
Message-ID: <38469E42.AF0A0D55@interet.com>

Guido van Rossum wrote:

> Let's assume sys.path is [A, B].  Let's assume there's a foo.py and a
> foo.dll.  If foo.py lives in A and foo.dll lives in B, then import foo
> ...

Thank you for the detailed discussion showing that sys.path is
needed so a choice can be made whether to load foo.dll or
foo.py.  As you correctly point out, a separate search path
defeats this behavior.

But I don't think the usefulness of the feature compensates for
its resultant complexity.  Specifically, it will be hard to
create this behavior in archive files.

As I envision archive files (which of course is subject to change)
they contain *.pyc files and not DLL's.  The DLL's must be in a
./DLL directory since the OS can not load them from strings.  So
if every *.pyc is in an archive file, your only choice is whether
to load all DLL's first or last.  That is, archive.pyl is either
before or after ./DLL.

If a package (probably with lots of subdirectories) author depends on
having a search path within a package which discriminates between
pyc and DLL files with equal names, then that search path plus the
existence of the DLL's must be recorded in the archive.

This is much more complicated than just an archive with all *.pyc
files entered in a dotted name space:
  foo
  foo.sub1
  foo.sub2
  foo.sub2.pkx

I would question whether equally named foo.dll and foo.py is worth it.
The alternative (which is IMHO more common) is to code the choice in
Python in the module that cares about it.
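
A minimal sketch of coding the choice in Python, written in modern
Python for illustration.  The module name "_foo_fast" is hypothetical
(a compiled version that may be absent), and "json" merely stands in
for the pure-Python fallback so that the example runs anywhere.

```python
import importlib

def load_first_available(names):
    """Import and return the first module in `names` that can be loaded."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("none of %r could be imported" % (names,))

# Prefer the (hypothetical) compiled module, fall back to the slow one.
impl = load_first_available(["_foo_fast", "json"])
```

The module that cares about the difference makes the decision itself,
so no search-path ordering between equally named files is needed.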

> > And what if something
> > doesn't work?  Think of Python being used as a teaching language
> > for the 8th grade.  Think of the 8th grade teacher trying to get
> > all this right.  The only thing that works is simplicity.
> 
> We will provide an installer that Just Works [tm].

OK for this case.  Not enough for Python program distribution. 

JimA


From jim at interet.com  Thu Dec  2 17:30:49 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 02 Dec 1999 11:30:49 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org> <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us>  
	            <384694F6.E5D74221@interet.com> <199912021603.LAA14455@eric.cnri.reston.va.us>
Message-ID: <38469EB9.5EDB9617@interet.com>

Guido van Rossum wrote:
> 
> > Perhaps, and if there is it should be prominently documented in the
> > How to Distribute Your App section of the manual.  I
> > am worried about supporting versioning, but I will think about it.
> 
> Join the distutil-SIG, they are discussing just this.

I already belong to the distutil-SIG and have seen no such
discussion.

Jim


From guido at CNRI.Reston.VA.US  Thu Dec  2 18:17:52 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 02 Dec 1999 12:17:52 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Your message of "Thu, 02 Dec 1999 11:30:49 EST."
             <38469EB9.5EDB9617@interet.com> 
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org> <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us> <384694F6.E5D74221@interet.com> <199912021603.LAA14455@eric.cnri.reston.va.us>  
            <38469EB9.5EDB9617@interet.com> 
Message-ID: <199912021717.MAA14682@eric.cnri.reston.va.us>

[Jim]
> > > Perhaps, and if there is it should be prominently documented in the
> > > How to Distribute Your App section of the manual.  I
> > > am worried about supporting versioning, but I will think about it.

[me]
> > Join the distutil-SIG, they are discussing just this.

[Jim again]
> I already belong to the distutil-SIG and have seen no such
> discussion.

Sorry, you're right (except for a brief exchange between you and Paul
Dubois :-).  But I think they should, it falls under their charter.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake at acm.org  Thu Dec  2 18:30:02 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 2 Dec 1999 12:30:02 -0500 (EST)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <199912021717.MAA14682@eric.cnri.reston.va.us>
References: <Pine.WNT.4.04.9912011152170.222-100000@rigoletto.ski.org>
	<384586B4.48905B32@interet.com>
	<011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com>
	<3845928D.C0462322@interet.com>
	<199912012300.SAA10861@eric.cnri.reston.va.us>
	<384694F6.E5D74221@interet.com>
	<199912021603.LAA14455@eric.cnri.reston.va.us>
	<38469EB9.5EDB9617@interet.com>
	<199912021717.MAA14682@eric.cnri.reston.va.us>
Message-ID: <14406.44186.574647.651111@weyr.cnri.reston.va.us>

Guido van Rossum writes:
 > Sorry, you're right (except for a brief exchange between you and Paul
 > Dubois :-).  But I think they should, it falls under their charter.

  This was deliberately postponed until after extension packages are
supported and in place.  I know Greg is interested in application
installation as well as package installation.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From gmcm at hypernet.com  Thu Dec  2 18:53:03 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Thu, 2 Dec 1999 12:53:03 -0500
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912021628.LAA14506@eric.cnri.reston.va.us>
References: Your message of "Fri, 19 Nov 1999 22:43:32 EST."             <1269053086-27079185@hypernet.com> 
Message-ID: <1267965342-1446902@hypernet.com>

[Gordon]
> > No success whatsoever in either direction across Samba. In fact
> > the mtime of my Linux home directory as seen from NT is Jan 1,
> > 1980.
[Guido]
> That's only the case for an NT mount point (something of the form
> \\host\name; I notice that os.stat() only believes it exists if
> you append a backslash: \\host\name\).  For interior directories,
> at least with the Samba version that I'm using, os.stat() seems
> to give correct results.

Correct (as I discovered not long after I posted). (I find that 
from NT I have to stat some file _in_ the directory to get an 
updated mtime from the stat _of_ the directory).
 
> I think that this whole issue (that doing a stat on a directory
> to find out whether files in it were modified doesn't give usable
> results) is widely blown out of proportion.

This has come up twice: re caching importers and dircache.py 
(used only by dircmp). We've arrived at the fact that it _can_ 
be made to work on Windows boxes. NFS? Andrew (anyone 
still use that)?

IOW, do we want to trust it? Do we want to document that it 
might not be trustworthy in some situations? Make it optional-
for-wizards? Kill it?
 
IOOW, what's the proper proportion ;-)?

> The only useful bit of info is that mtimes may have an up to 2
> second granularity, and that anything as recent as 2 seconds
> should be considered as newer than the cache even if the cache is
> also less than 2 seconds.


From guido at CNRI.Reston.VA.US  Thu Dec  2 21:43:46 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 02 Dec 1999 15:43:46 -0500
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: Your message of "Fri, 19 Nov 1999 05:29:50 PST."
             <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> 
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> 
Message-ID: <199912022043.PAA15108@eric.cnri.reston.va.us>

Here's the promised response to Greg's response to my wishlist.

> On Thu, 18 Nov 1999, Guido van Rossum wrote:
> > Gordon McMillan wrote:
> >...
> > > I think imputil's emulation of the builtin importer is more of a 
> > > demonstration than a serious implementation. As for speed, it 
> > > depends on the test. 
> > 
> > Agreed.  I like some of imputil's features, but I think the API
> > needs to be redesigned.
> 
> In what ways? It sounds like you've applied some thought. Do you have any
> concrete ideas yet, or "just a feeling" :-)  I'm working through some
> changes from JimA right now, and would welcome other suggestions. I think
> there may be some outstanding stuff from MAL, but I'm not sure (Marc?)

I actually think that the way the PVM (Python VM) calls the importer
ought to be changed.  Assigning to __builtin__.__import__ is a crock.
The API for __import__ is a crock.

> >...
> > So here's a challenge: redesign the import API from scratch.
> 
> I would suggest starting with imputil and altering as necessary. I'll use
> that viewpoint below.
> 
> > Let me start with some requirements.
> > 
> > Compatibility issues:
> > ---------------------
> > 
> > - the core API may be incompatible, as long as compatibility layers
> > can be provided in pure Python
> 
> Which APIs are you referring to? The "imp" module? The C functions? The
> __import__ and reload builtins?

> I'm guessing some of imp, the two builtins, and only one or two C
> functions.

All of those.

> > - support for rexec functionality
> 
> No problem. I can think of a number of ways to do this.

Agreed, I think that imputil can do this.

> > - support for freeze functionality
> 
> No problem. A function in "imp" must be exposed to Python to support this
> within the imputil framework.

Agreed.  It currently exports init_frozen() which is about the right
functionality.

> > - load .py/.pyc/.pyo files and shared libraries from files
> 
> No problem. Again, a function is needed for platform-specific loading of
> shared libraries.

Is it useful to expose the platform differences?  The current
imp.load_dynamic() should suffice.

> > - support for packages
> 
> No problem. Demo's in current imputil.
> 
> > - sys.path and sys.modules should still exist; sys.path might
> > have a slightly different meaning
> 
> I would suggest that both retain their *exact* meaning. We introduce
> sys.importers -- a list of importers to check, in sequence. The first
> importer on that list uses sys.path to look for and load modules. The
> second importer loads builtins and frozen code (i.e. modules not on
> sys.path).

This is looking like the redesign I was looking for.  (Note that
imputil's current chaining is not good since it's impossible to remove
or reorder importers, which I think is a required feature; an explicit
list would solve this.)

Actually, the order is the other way around, but by now you should
know that.  It makes sense to have separate ones for builtin and
frozen modules -- these have nothing in common.
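
To make the explicit-list idea concrete, here is a rough sketch.  It
is not imputil's API: the two classes and their find() method are
stand-ins invented for the example, and they return strings rather
than real modules.

```python
class BuiltinImporter:
    """Stand-in for an importer that knows the compiled-in modules."""

    def __init__(self, names):
        self.names = set(names)

    def find(self, name):
        return "builtin:" + name if name in self.names else None

class PathImporter:
    """Stand-in for the importer that walks sys.path."""

    def __init__(self, available):
        self.available = set(available)

    def find(self, name):
        return "path:" + name if name in self.available else None

def find_module(name, importers):
    # Ask each importer in list order; the first one that answers wins.
    for importer in importers:
        result = importer.find(name)
        if result is not None:
            return result
    raise ImportError(name)

# An ordinary list, so removing or reordering importers is trivial.
importers = [BuiltinImporter(["sys"]), PathImporter(["sys", "foo"])]
first = find_module("sys", importers)   # "builtin:sys"
importers.reverse()
second = find_module("sys", importers)  # "path:sys"
```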

There's another issue, which isn't directly addressed by imputil,
although with clever use of inheritance it might be doable.  I'd like
more support for this however.  Quite orthogonally to the issue of
having separate importers, I might want to recognize new extensions.
Take the example of the ILU folks.  They want to be able to drop a
file "foo.isl" in any directory on sys.path and have the ILU stubber
automatically run if you try to import foo (the client stubs) or
foo__skel (the server skeleton).

This doesn't fit in the sys.importers strategy, because they want to
be able to drop their .isl files in any directory along sys.path.
(Or, more likely, they want to have control over where in sys.path
the directory/directories with .isl files are placed.)  This requires
an ugly modification to the _fs_import() function.  (Which should have
been a method, by the way, to make overriding it in a subclass of
PathImporter easier!)

I've been thinking here along the lines of a strategy where the
standard importer (the one that walks sys.path) has a set of hooks
that define various things it could look for, e.g. .py files, .pyc
files, .so or .dll files.  This list of hooks could be changed to
support looking for .isl files.
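
The idea of per-extension hooks inside the path-walking importer can be sketched as follows. This is only an illustration under stated assumptions: the handlers return placeholder tuples instead of real modules, and find_on_path, load_py and load_isl are hypothetical names, not part of imputil.

```python
import os
import tempfile


def load_py(path):
    # Placeholder handler: a real one would compile and execute the source.
    return ("python source", path)


def load_isl(path):
    # Hypothetical handler: this is where the ILU stubber would be invoked.
    return ("ILU interface", path)


def find_on_path(name, search_path, hooks):
    """Walk the path; in each directory try every (suffix, handler) in order."""
    for directory in search_path:
        for suffix, handler in hooks:
            candidate = os.path.join(directory, name + suffix)
            if os.path.isfile(candidate):
                return handler(candidate)
    return None  # not found anywhere on the path


suffix_hooks = [(".py", load_py)]  # default list; order decides precedence

with tempfile.TemporaryDirectory() as directory:
    with open(os.path.join(directory, "foo.isl"), "w") as f:
        f.write("placeholder\n")
    before = find_on_path("foo", [directory], suffix_hooks)
    suffix_hooks.append((".isl", load_isl))  # teach the importer about .isl
    after = find_on_path("foo", [directory], suffix_hooks)
```

With the default list the .isl file is invisible; after one append it is found in whichever sys.path directory holds it, with no new importer needed.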

There's an old, subtle issue that could be solved through this as
well: whether or not a .pyc file without a .py file should be accepted
or not.  Long ago (in Python 0.9.8) a .pyc file alone would never be
loaded.  This was changed at the request of a small but vocal minority
of Python developers who wanted to distribute .pyc files without .py
files.  It has occasionally caused frustration because sometimes
developers move .py files around but forget to remove the .pyc files,
and then the .pyc file is silently picked up if it occurs on sys.path
earlier than where the .py was moved to.

Having a set of hooks for various extensions would make it possible to
have a default where lone .pyc files are ignored, but where one can
insert a .pyc importer in the list of hooks that does the right thing
here.  (Of course, it may be possible that this whole feature of lone
.pyc files should be replaced since the same need is easily taken care
of by zip importers.)

I also want to support (Jim A notwithstanding :-) a feature whereby
different things besides directories can live on sys.path, as long as
they are strings -- these could be added from the PYTHONPATH env
variable.  Every piece of code that I've ever seen that uses sys.path
doesn't care if a directory named in sys.path doesn't exist -- it may
try to stat various files in it, which also don't exist, and as far as
it is concerned that is just an indication that the requested module
doesn't live there.

Again, we would have to dissect imputil to support various hooks that
deal with different kinds of entities in sys.path.  The default hook
list would consist of a single item that interprets the name as a
directory name; other hooks could support zip files or URLs.  Jack's
"magic cookies" could also be supported nicely through such a
mechanism.
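
A rough sketch of hooks that interpret the strings on sys.path, assuming each hook either claims an entry or returns None. The names directory_hook, zip_hook and resolve_entry are hypothetical, and the returned tuples stand in for whatever object would actually do the loading.

```python
import os


def directory_hook(entry):
    """Default hook: claim entries that name an existing directory."""
    if os.path.isdir(entry):
        return ("directory", entry)
    return None


def zip_hook(entry):
    """Hypothetical extra hook: claim entries that look like zip/jar files."""
    if entry.lower().endswith((".zip", ".jar")):
        return ("archive", entry)
    return None


def resolve_entry(entry, hooks):
    """Return the first hook's claim on a sys.path entry, or None."""
    for hook in hooks:
        claimed = hook(entry)
        if claimed is not None:
            return claimed
    return None  # unclaimed: treated as "the module doesn't live there"


default_hooks = [directory_hook]
extended_hooks = [directory_hook, zip_hook]

here = resolve_entry(os.getcwd(), default_hooks)
ignored = resolve_entry("/no/such/place/lib.zip", default_hooks)
archive = resolve_entry("/no/such/place/lib.zip", extended_hooks)
```

An entry that no hook claims is skipped silently, which matches the observation that existing code does not care whether a sys.path entry exists.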

> Users can insert/append new importers or alter sys.path as before.
> 
> sys.modules continues to record name:module mappings.

Yes.

Note that the interpretation of __file__ could be problematic.  To
what value do you set __file__ for a module loaded from a zip archive?

> > - $PYTHONPATH and $PYTHONHOME should still be supported
> 
> No problem.
> 
> > (I wouldn't mind a splitting up of importdl.c into several
> > platform-specific files, one of which is chosen by the configure
> > script; but that's a bit of a separate issue.)
> 
> Easy enough. The standard importer can select the appropriate
> platform-specific module/function to perform the load. i.e. these can move
> to Modules/ and be split into a module-per-platform.

Again: what's the advantage of exposing the platform specificity?

> > New features:
> > -------------
> > 
> > - Integrated support for Greg Ward's distribution utilities (i.e. a
> >   module prepared by the distutil tools should install painlessly)
> 
> I don't know the specific requirements/functionality that would be
> required here (does Greg? :-), but I can't imagine any problem with this.

Probably more support is required from the other end: once it's common
for modules to be imported from zip files, the distutil code needs to
support the creation and installation of such zip files.  Also, there
is a need for the install phase of distutil to communicate the
location of the zip file to the Python installation.

> > - Good support for prospective authors of "all-in-one" packaging tools
> >   like Gordon McMillan's win32 installer or /F's squish.  (But
> >   I *don't* require backwards compatibility for existing tools.)
> 
> Um. *No* problem. :-)

:-)

> > - Standard import from zip or jar files, in two ways:
> > 
> >   (1) an entry on sys.path can be a zip/jar file instead of a directory;
> >       its contents will be searched for modules or packages

Note that this is what I mention above for distutil support.

> While this could easily be done, I might argue against it. Old
> apps/modules that process sys.path might get confused.

Above I argued that this shouldn't be a problem.

> If compatibility is not an issue, then "No problem."
> 
> An alternative would be an Importer instance added to sys.importers that
> is configured for a specific archive (in other words, don't add the zip
> file to sys.path, add ZipImporter(file) to sys.importers).

This would be harder for distutil: where does Python get the initial
list of importers?

> Another alternative is an Importer that looks at a "sys.py_archives" list.
> Or an Importer that has a py_archives instance attribute.

OK, but again distutil needs to be able to add to this list when it
installs a package.  (Note that package deinstallation should also be
supported!)

(Of course I don't require this to affect Python processes that are
already running; but it should be possible to easily change the
default search path for all newly started instances of a given Python
installation.)

> >   (2) a file in a directory that's on sys.path can be a zip/jar file;
> >       its contents will be considered as a package (note that this is
> >       different from (1)!)
> 
> No problem. This will slow things down, as a stat() for *.zip and/or *.jar
> must be done, in addition to *.py, *.pyc, and *.pyo.

Fine, this is where the caching comes in handy.

> >   I don't particularly care about supporting all zip compression
> >   schemes; if Java gets away with only supporting gzip compression
> >   in jar files, so can we.
> 
> I presume we would support whatever zlib gives us, and no more.

That's it. :-)

> > - Easy ways to subclass or augment the import mechanism along
> >   different dimensions.  For example, while none of the following
> >   features should be part of the core implementation, it should be
> >   easy to add any or all:
> > 
> >   - support for a new compression scheme to the zip importer
> 
> Presuming ZipImporter is a class (derived from Importer), then this
> ability is wholly dependent upon the author of ZipImporter providing the
> hook.

Agreed.  But since we're likely going to provide this as a standard
feature, we must ensure that it provides this hook.

> The Importer class is already designed for subclassing (and its interface 
> is very narrow, which means delegation is also *very* easy; see
> imputil.FuncImporter).

But maybe it's *too* narrow; some of the hooks I suggest above seem to
require extra interfaces -- at least in some of the subclasses of the
Importer base class.

Note: I looked at the doc string for get_code() and I don't understand
what the difference is between the modname and fqname arguments.  If I
write "import foo.bar", what are modname and fqname?  Why are both
present?  Also, while you claim that the API is narrow, the multiple
return values (also the different types for the second item) make it
complicated.

> >   - support for a new archive format, e.g. tar
> 
> A cakewalk. Gordon, JimA, and myself each have archive formats. :-)
> 
> >   - a hook to import from URLs or other data sources (e.g. a
> >     "module server" imported in CORBA) (this needn't be supported
> >     through $PYTHONPATH though)
> 
> No problem at all.
> 
> >   - a hook that imports from compressed .py or .pyc/.pyo files
> 
> No problem at all.
> 
> >   - a hook to auto-generate .py files from other filename
> >     extensions (as currently implemented by ILU)
> 
> No problem at all.

See above -- I think this should be more integrated with sys.path than
you are thinking of.  The more I think about it, the more I see that
the problem is that for you, the importer that uses sys.path is a
final subclass of Importer (i.e. it is itself not further subclassed).
Several of the hooks I want seem to require additional hooks in the
PathImporter rather than new importers.

> >   - a cache for file locations in directories/archives, to improve
> >     startup time
> 
> No problem at all.
> 
> >   - a completely different source of imported modules, e.g. for an
> >     embedded system or PalmOS (which has no traditional filesystem)
> 
> No problem at all.
> 
> In each of the above cases, the Importer.get_code() method just needs to
> grab the byte codes from the XYZ data source. That data source can be
> compressed, across a network, on-the-fly generated, or whatever. Each
> importer can certainly create a cache based on its concept of "location".
> In some cases, that would be a mapping from module name to filesystem
> path, or to a URL, or to a compiled-in, frozen module.

See above for sys.path integration remark.

> > - Note that different kinds of hooks should (ideally, and within
> >   reason) properly combine, as follows: if I write a hook to recognize
> >   .spam files and automatically translate them into .py files, and you
> >   write a hook to support a new archive format, then if both hooks are
> >   installed together, it should be possible to find a .spam file in an
> >   archive and do the right thing, without any extra action.  Right?
> 
> Ack. Very, very difficult.

Actually, I take most of this back.  Importers that deal with new
extension types often have to go through a file system to transform
their data to .py files, and this is just too complicated.  However it
would be still nice if there was code sharing between the code that
looks for .py and .pyc files in a zip archive and the code that does
the same in a filesystem.  Hm, maybe even that shouldn't be necessary,
the zip file probably should contain only .pyc files...

(Unrelated remark: I should really try to release the set of modules
we've written here at CNRI to deal with zip files.  Unfortunately zip
files are hairy and so is our code.)

> The imputil scheme combines the concept of locating/loading into one step.
> There is only one "hook" in the imputil system. Its semantic is "map this
> name to a code/module object and return it; if you don't have it, then
> return None."

That's fine.  I actually don't recall where the find-then-load API
came from, I think it may be an artefact of the original
implementation strategy.  It is currently used as follows: we try to
see if there's a .pyc and then we try to see if there's a .py; if both
exist we compare the timestamps etc. to choose which one.  But that's
still a red herring.

> Your compositing example is based on the capabilities of the
> find-then-load paradigm of the existing "ihooks.py". One module finds
> something (foo.spam) and the other module loads it (by generating a .py).

I still don't understand why ihooks.py had to be so complicated.  I
guess I just had much less of an understanding of the issues.  (It was
also partly a compromise with an alternative design by Ken Manheimer,
who basically forced me to support packages, originally through ni.py.)

> All is not lost, however. I can easily envision the get_code() hook as
> allowing any kind of return type. If it isn't a code or module object,
> then another hook is called to transform it.
> [ actually, I'd design it similarly: a *series* of hooks would be called
>   until somebody transforms the foo.spam into a code/module object. ]

OK.  This could be a feature of a subclass of Importer.
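
The "series of hooks" idea quoted above might look roughly like this. It is a sketch under assumptions: get_code() is imagined to return an arbitrary intermediate value, and spam_to_source, source_to_code and to_code are hypothetical names.

```python
import types


def spam_to_source(result):
    """Hypothetical transformer: turn ('spam', text) into Python source."""
    if isinstance(result, tuple) and result[0] == "spam":
        return "answer = %r" % result[1].strip()
    return None


def source_to_code(result):
    """Transformer: compile Python source text into a code object."""
    if isinstance(result, str):
        return compile(result, "<generated>", "exec")
    return None


def to_code(result, transformers):
    """Apply transformers repeatedly until a code object emerges."""
    while not isinstance(result, types.CodeType):
        for transform in transformers:
            transformed = transform(result)
            if transformed is not None:
                result = transformed
                break
        else:
            # No transformer recognised the intermediate value.
            raise ImportError("no hook could transform %r" % (result,))
    return result


code = to_code(("spam", " 42 \n"), [spam_to_source, source_to_code])
namespace = {}
exec(code, namespace)
```

Each transformer only needs to know one step, so a .spam hook and a compiler hook combine without knowing about each other.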

> The compositing would be limited only by the (Python-based) Importer
> classes. For example, my ZipImporter might expect to zip up .pyc files
> *only*. Obviously, you would want to alter this to support zipping any
> file, then use the suffix to determine what to do at unzip time.
> 
> > - It should be possible to write hooks in C/C++ as well as Python
> 
> Use FuncImporter to delegate to an extension module.

Maybe not so great, since it sounds like the C code can't benefit from
any of the infrastructure that imputil offers.  I'm not sure about
this one though.

> This is one of the benefits of imputil's single/narrow interface.

Plus its vague specs? :-)

> > - Applications embedding Python may supply their own implementations,
> >   default search path, etc., but don't have to if they want to piggyback
> >   on an existing Python installation (even though the latter is
> >   fraught with risk, it's cheaper and easier to understand).
> 
> An application would have full control over the contents of sys.importers.
> 
> For a restricted execution app, it might install an Importer that loads
> files from *one* directory only which is configured from a specific
> Win32 Registry entry. That importer could also refuse to load shared
> modules. The BuiltinImporter would still be present (although the app
> would certainly omit all but the necessary builtins from the build).
> Frozen modules could be excluded.

Actually there's little reason to exclude frozen modules or any
.py/.pyc modules -- by definition, bytecode can't be dangerous.  It's
the builtins and extensions that need to be censored.

We currently do this by subclassing ihooks, where we mask the test for
builtins with a comparison to a predefined list of names.
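
Masking the builtin test with a predefined list of names can be sketched as below. This is not the ihooks subclass referred to above, only an illustration of the comparison; make_builtin_test and the allowed names are assumptions.

```python
import sys


def make_builtin_test(allowed, known=sys.builtin_module_names):
    """Return a predicate that accepts a builtin only if policy allows it."""
    allowed = frozenset(allowed)

    def is_builtin(name):
        # A module counts as builtin only when the interpreter has it
        # compiled in AND the restricted-execution policy lists it.
        return name in known and name in allowed

    return is_builtin


is_builtin = make_builtin_test(["sys", "marshal"])
sys_ok = is_builtin("sys")            # compiled in and allowed
unlisted = is_builtin("not_listed")   # not on the policy list
```

The importer for restricted code would call is_builtin() where the unrestricted one asks the interpreter directly.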

> > Implementation:
> > ---------------
> > 
> > - There must clearly be some code in C that can import certain
> >   essential modules (to solve the chicken-or-egg problem), but I don't
> >   mind if the majority of the implementation is written in Python.
> >   Using Python makes it easy to subclass.
> 
> I posited once before that the cost of import is mostly I/O rather than
> CPU, so using Python should not be an issue. MAL demonstrated that a good
> design for the Importer classes is also required. Based on this, I'm a
> *strong* advocate of moving as much as possible into Python (to get
> Python's ease-of-coding with little relative cost).

Agreed.  However, how do you explain the slowdown (from 9 to 13
seconds I recall) though?  Are you a lousy coder? :-)

> The (core) C code should be able to search a path for a module and import
> it. It does not require dynamic loading or packages. This will be used to
> import exceptions.py, then imputil.py, then site.py.

It does, however, need to import builtin modules.  imputil currently
imports imp, sys, strop and __builtin__, struct and marshal; note that
struct can easily be a dynamic loadable module, and so could strop in
theory.  (Note that strop will be unnecessary in 1.6 if you use string
methods.)

I don't think that this chicken-or-egg problem is particularly
problematic though.

> The platform-specific module that performs dynamic-loading must be a
> statically linked module (in Modules/ ... it doesn't have to be in the
> Python/ directory).

See earlier comments.

> site.py can complete the bootstrap by setting up sys.importers with the
> appropriate Importer instances (this is where an application can define
> its own policy). sys.path was initially set by the import.c bootstrap code
> (from the compiled-in path and environment variables).

I think that algorithm (currently in getpath.c / getpathp.c) might
also be moved to Python code -- imported frozen.  Sadly, rebuilding
with a new version of a frozen module might be more complicated than
rebuilding with a new version of a C module, but writing and
maintaining this code in Python would be *sooooooo* much easier that I
think it's worth it.

> Note that imputil.py would not install any hooks when it is loaded. That
> is up to site.py. This implies the core C code will import a total of
> three modules using its builtin system. After that, the imputil mechanism
> would be importing everything (site.py would .install() an Importer which
> then takes over the __import__ hook).

(Three not counting the builtin modules.)

> Further note that the "import" Python statement could be simplified to use
> only the hook. However, this would require the core importer to inject
> some module names into the imputil module's namespace (since it couldn't
> use an import statement until a hook was installed). While this
> simplification is "neat", it complicates the run-time system (the import
> statement is broken until a hook is installed).

Same chicken-or-egg.  We can be pragmatic.

For a developer, I'd like a bit of robustness (all this makes it
rather hard to debug a broken imputil, and that's a fair amount of
code!).

> Therefore, the core C code must also support importing builtins. "sys" and
> "imp" are needed by imputil to bootstrap.
> 
> The core importer should not need to deal with dynamic-load modules.

Same question.  Since that all has to be coded in C anyway, why not?

> To support frozen apps, the core importer would need to support loading
> the three modules as frozen modules.

I'd like to see a description of how someone like Jim A would build a
single-file application using the new mechanism.  This could
completely replace freeze.  (Freeze currently requires a C compiler;
that's bad.)

> The builtin/frozen importing would be exposed thru "imp" for use by
> imputil for future imports. imputil would load and use the (builtin)
> platform-specific module to do dynamic-load imports.

Sure.

> > - In order to support importing from zip/jar files using compression,
> >   we'd at least need the zlib extension module and hence libz itself,
> >   which may not be available everywhere.
> 
> Yes. I don't see this as a requirement, though. We wouldn't start to use
> these by default, would we? Or insist on zlib being present? I see this as
> more along the lines of "we have provided a standardized Importer to do
> this, *provided* you have zlib support."

Agreed.  Zlib support is easy to get, but there are probably platforms
where it's not.  (E.g. maybe the Mac?  I suppose that on the Mac,
there would be some importer classes to import from a resource fork.)

> > - I suppose that the bootstrap is solved using a mechanism very
> >   similar to what freeze currently uses (other solutions seem to be
> >   platform dependent).
> 
> The bootstrap that I outlined above could be done in C code. The import
> code would be stripped down dramatically because you'll drop package
> support and dynamic loading.

Not the dynamic loading.  But yes the package support.

> Alternatively, you could probably do the path-scanning in Python and
> freeze that into the interpreter. Personally, I don't like this idea as it
> would not buy you much at all (it would still need to return to C for
> accessing a number of scanning functions and module importing funcs).
> 
> > - I also want to still support importing *everything* from the
> >   filesystem, if only for development.  (It's hard enough to deal with
> >   the fact that exceptions.py is needed during Py_Initialize();
> >   I want to be able to hack on the import code written in Python
> >   without having to rebuild the executable all the time.)
> 
> My outline above does not freeze anything. Everything resides in the
> filesystem. The C code merely needs a path-scanning loop and functions to
> import .py*, builtin, and frozen types of modules.

Good.  Though I think there's also a need for freezing everything.
And when we go the route of the zip archive, the zip archive handling
code needs to be somewhere -- frozen seems to be a reasonable choice.

> If somebody nukes their imputil.py or site.py, then they return to Python
> 1.4 behavior where the core interpreter uses a path for importing (i.e. no
> packages). They lose dynamically-loaded module support.

But if the path guessing is also done by site.py (as I propose) the
path will probably be wrong.  A warning should be printed.

> > Let's first complete the requirements gathering.  Are these
> > requirements reasonable?  Will they make an implementation too
> > complex?  Am I missing anything?
> 
> I'm not a fan of the compositing due to it requiring a change to semantics
> that I believe are very useful and very clean. However, I outlined a
> possible, clean solution to do that (a secondary set of hooks for
> transforming get_code() return values).

As you may see from my responses, I'm a big fan of having several
different sets of hooks.  I do withdraw the composition requirement
though.

> The requirements are otherwise reasonable to me, as I see that they can
> all be readily solved (i.e. they aren't burdensome).
> 
> While this email may be long, I do not believe the resulting system would
> be complex. From the user-visible side of things, nothing would be
> changed. sys.path is still present and operates as before. They *do* have
> new functionality they can grow into, though (sys.importers). The
> underlying C code is simplified, and the platform-specific dynamic-load
> stuff can be distributed to distinct modules, as needed
> (e.g. BeOS/dynloadmodule.c and PC/dynloadmodule.c).
> 
> > Finally, to what extent does this impact the desire for dealing
> > differently with the Python bytecode compiler (e.g. supporting
> > optimizers written in Python)?  And does it affect the desire to
> > implement the read-eval-print loop (the >>> prompt) in Python?
> 
> If the three startup files require byte-compilation, then you could have
> some issues (i.e. the byte-compiler must be present).

Another chicken-or-egg.  No biggie.

> Once you hit site.py, you have a "full" environment and can easily detect
> and import a read-eval-print loop module (i.e. why return to Python? just 
> start things up right there).

You mean "why return to C?"  I agree.  It would be cool if somehow
IDLE and Pythonwin would also be bootstrapped using the same
mechanisms.  (This would also solve the question "which interactive
environment am I using?" that some modules and apps want to see
answered because they need to do things differently when run under
IDLE, for example.)

> site.py can also install new optimizers as desired, a new Python-based
> parser or compiler, or whatever...  If Python is built without a parser or
> compiler (I hope that's an option!), then the three startup modules would
> simply be frozen into the executable.

More power to hooks!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake at acm.org  Thu Dec  2 22:22:33 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 2 Dec 1999 16:22:33 -0500 (EST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912022043.PAA15108@eric.cnri.reston.va.us>
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org>
	<199912022043.PAA15108@eric.cnri.reston.va.us>
Message-ID: <14406.58137.359127.921135@weyr.cnri.reston.va.us>

Guido van Rossum writes:
 > variable.  Every piece of code that I've ever seen that uses sys.path
 > doesn't care if a directory named in sys.path doesn't exist -- it may
 > try to stat various files in it, which also don't exist, and as far as

  Not the case -- I know you've looked at some of my code in the KOE
that ensures only real directories are on the path, and each is only
there once (pathhack.py).  Given that sys.path is often too long and
includes duplicate entries in a large system (often one entry with and
one without a trailing / for a given directory), it is useful to be able
to distinguish between things that should be interpretable as paths
and things that aren't.  It should not be hard to declare that
"cookies" or whatever have some special form, like "<cookie>".

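The cleanup described above (only real directories, each listed once, with specially marked non-path entries kept) can be sketched like this. It is not the KOE's pathhack.py; the "<...>" cookie convention and the names is_cookie and clean_path are assumptions made for illustration.

```python
import os
import tempfile


def is_cookie(entry):
    """Assumed convention: non-path entries are written like '<cookie>'."""
    return entry.startswith("<") and entry.endswith(">")


def clean_path(entries):
    """Keep cookies and existing directories, dropping duplicates."""
    seen = set()
    cleaned = []
    for entry in entries:
        if is_cookie(entry):
            key = entry
        elif os.path.isdir(entry):
            # normpath removes a trailing separator, so 'dir' and 'dir/'
            # compare equal.
            key = os.path.normcase(os.path.normpath(entry))
        else:
            continue  # neither a cookie nor a real directory
        if key not in seen:
            seen.add(key)
            cleaned.append(entry)
    return cleaned


with tempfile.TemporaryDirectory() as directory:
    cleaned = clean_path([
        directory,
        directory + os.sep,              # duplicate with trailing separator
        "<archive-cookie>",
        os.path.join(directory, "missing"),
        "<archive-cookie>",              # duplicate cookie
    ])
    expected = [directory, "<archive-cookie>"]
```

A recognizable form for cookies lets such code tell "not meant to be a directory" apart from "directory that does not exist".
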
 > (Unrelated remark: I should really try to release the set of modules
 > we've written here at CNRI to deal with zip files.  Unfortunately zip
 > files are hairy and so is our code.)

  It doesn't help that that code just plain stinks.  I maintain that
no one here understands the whole of it.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From jcw at equi4.com  Thu Dec  2 22:41:46 1999
From: jcw at equi4.com (Jean-Claude Wippler)
Date: Thu, 02 Dec 1999 22:41:46 +0100
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> <199912022043.PAA15108@eric.cnri.reston.va.us>
Message-ID: <3846E79A.446EAFD5@equi4.com>

Guido van Rossum wrote:

[...]
> Note that the interpretation of __file__ could be problematic.  To
> what value do you set __file__ for a module loaded from a zip archive?

Makefiles use "archive(entry)" (this also supports nesting if needed).
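
A small sketch of the "archive(entry)" notation for __file__, assuming that neither archive names nor entry names contain parentheses of their own; format_location and split_location are hypothetical helpers.

```python
def format_location(archive, entry):
    """Build an 'archive(entry)' string for a module found in an archive."""
    return "%s(%s)" % (archive, entry)


def split_location(location):
    """Split 'archive(entry)' into (archive, entry).

    A plain filesystem path comes back as (None, path).  The entry may
    itself be of the form 'archive(entry)', which gives nesting.
    """
    if not location.endswith(")") or "(" not in location:
        return (None, location)
    archive, entry = location[:-1].split("(", 1)
    return (archive, entry)


simple = format_location("lib.zip", "foo.pyc")
nested = format_location("outer.zip", format_location("inner.zip", "foo.pyc"))
outer, rest = split_location(nested)
```

Applying split_location() repeatedly to the entry part peels off one archive level at a time.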

[...] 
> I'd like to see a description of how someone like Jim A would build a
> single-file application using the new mechanism.  This could
> completely replace freeze.  (Freeze currently requires a C compiler;
> that's bad.)
[...]

This may be off-topic, but has anyone considered what it would take to
load shared libs out of an archive?  One way is to extract on-the-fly to
a temporary area.  A refinement is to leave extracted files there as
cache, and perhaps even to extract to a file with a name derived from
its MD5 digest (this way multiple users and even Python installations
can share the cache).  Would it be useful to define a "standard" area?
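
The digest-named cache suggested above might be sketched as follows, assuming the archive member is already available as bytes. extract_to_cache is a hypothetical helper; a temporary directory stands in for the "standard" area.

```python
import hashlib
import os
import tempfile


def extract_to_cache(data, suffix, cache_dir):
    """Write `data` to a file named after its MD5 digest; reuse it if present."""
    digest = hashlib.md5(data).hexdigest()
    target = os.path.join(cache_dir, digest + suffix)
    if not os.path.exists(target):
        # Write to a temporary name first, then rename, so other processes
        # sharing the cache never see a partially written file.
        fd, temporary = tempfile.mkstemp(dir=cache_dir)
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(temporary, target)
    return target


with tempfile.TemporaryDirectory() as cache_dir:
    first = extract_to_cache(b"fake shared library", ".so", cache_dir)
    second = extract_to_cache(b"fake shared library", ".so", cache_dir)
    other = extract_to_cache(b"different contents", ".so", cache_dir)
    with open(first, "rb") as f:
        cached_bytes = f.read()
```

Because the file name depends only on the contents, identical libraries extracted by different users or installations land on the same cached file.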

-- Jean-Claude


From gmcm at hypernet.com  Fri Dec  3 00:15:50 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Thu, 2 Dec 1999 18:15:50 -0500
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912022043.PAA15108@eric.cnri.reston.va.us>
References: Your message of "Fri, 19 Nov 1999 05:29:50 PST."             <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> 
Message-ID: <1267945992-2611810@hypernet.com>

[Guido]
 big snip
> Note that the interpretation of __file__ could be problematic. 
> To what value do you set __file__ for a module loaded from a zip
> archive?

I just left it alone (ie, as it was when I picked up the .pyc). 
Turns out OK, because then when the end user files a bug 
report, the developer can track it down.

> Note: I looked at the doc string for get_code() and I don't
> understand what the difference is between the modname and fqname
> arguments.  If I write "import foo.bar", what are modname and
> fqname?  

As I recall:
 import foo.bar
 -> get_code(None, 'foo', 'foo') # returns foo
 -> get_code(<self>, 'bar', 'foo.bar')

> Why are both present?  

I think so the importer can choose between being tree 
structured or flat.
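
The call sequence recalled above can be spelled out with a short sketch. It only reproduces the (parent, modname, fqname) arguments from that recollection; the parent is represented here by its dotted name rather than by a module object, and get_code_calls is a hypothetical helper.

```python
def get_code_calls(dotted_name):
    """List the (parent, modname, fqname) triples for a dotted import."""
    calls = []
    parent = None
    prefix = []
    for part in dotted_name.split("."):
        prefix.append(part)
        fqname = ".".join(prefix)
        calls.append((parent, part, fqname))
        parent = fqname  # stands in for the module object just imported
    return calls


calls = get_code_calls("foo.bar")
```

A tree-structured importer can use parent and modname to look inside the package it just loaded, while a flat one can ignore them and key everything on fqname.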

> I'd like to see a description of how someone like Jim A would
> build a single-file application using the new mechanism.  This
> could completely replace freeze.  (Freeze currently requires a C
> compiler; that's bad.)

I have something working for Linux now. I froze exceptions.py. 
I hacked getpath.c so prefix = exec_prefix = executable's 
directory and the starting path is [prefix]. Although I did it 
differently, you could regard imputil.py and archive.py as 
frozen, too. (On Windows it's somewhat different, because the 
result uses the stock python15.dll.) This somewhat 
oversimplifies; and I haven't really thought out all the ways 
people might try to use sym links. I'm inclined to think the 
starting path should contain both the executable's real 
directory and the sym link's directory.

> ....  I do withdraw the composition
> requirement though.

Hooray!


- Gordon


From gstein at lyra.org  Fri Dec  3 01:19:14 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 2 Dec 1999 16:19:14 -0800 (PST)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <384694D8.DCA3D75E@lemburg.com>
Message-ID: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>

On Thu, 2 Dec 1999, M.-A. Lemburg wrote:
>...
> Still, I would like to rephrase my 0.02EUR which I already
> posted twice... why not start to think about what these
> importers would do first ? If there are only a handful of
> wishes we could just add them to the builtin machinery and
> be done with it...

I'd rather see the builtin machinery move to Python, regardless of what
system is used and/or what features are added.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein at lyra.org  Fri Dec  3 04:19:40 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 2 Dec 1999 19:19:40 -0800 (PST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912022043.PAA15108@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912021654100.18529-100000@nebula.lyra.org>

On Thu, 2 Dec 1999, Guido van Rossum wrote:
>...
> Sometime, Greg Stein wrote:
>...
> > On Thu, 18 Nov 1999, Guido van Rossum wrote:
>...
> > > Agreed.  I like some of imputil's features, but I think the API
> > > needs to be redesigned.
> > 
> > In what ways? It sounds like you've applied some thought. Do you have any
> > concrete ideas yet, or "just a feeling" :-)  I'm working through some
> > changes from JimA right now, and would welcome other suggestions. I think
> > there may be some outstanding stuff from MAL, but I'm not sure (Marc?)
> 
> I actually think that the way the PVM (Python VM) calls the importer
> ought to be changed.  Assigning to __builtin__.__import__ is a crock.
> The API for __import__ is a crock.

Something like sys.set_import_hook() ?

The other alternative that I see would be to have the C code scan
sys.importers, assuming each are callable objects, and call them with the
appropriate params (e.g. module name). Of course, to move this scanning
into Python would require something like sys.set_import_hook() unless
Python looks for a hard-coded module and entrypoint.

>...
> > Which APIs are you referring to? The "imp" module? The C functions? The
> > __import__ and reload builtins?
> 
> > I'm guessing some of imp, the two builtins, and only one or two C
> > functions.
> 
> All of those.

We can provide Python code to provide compatibility for "imp" and the two
hooks. Nothing we can do to the C code, though. I'm not sure what the
import API looks like from C, and whether they could all stay. A brief
glance looks like most could stay.
[ removing any would change Python's API version, which might be "okay" ]

>...
> > > - load .py/.pyc/.pyo files and shared libraries from files
> > 
> > No problem. Again, a function is needed for platform-specific loading of
> > shared libraries.
> 
> Is it useful to expose the platform differences?  The current
> imp.load_dynamic() should suffice.

This comes up several times throughout this message, and in some off-list
mail Guido and I have exchanged. Namely, "should dynamic loading be part
of the core, or performed via a module?"

I would rather see it become a module, rather than inside the core
(despite the fact that the module would have to be compiled into the
interpreter). I believe this provides more flexibility for people looking
to replace/augment/update/fix dynamic loading on various architectures.
Rather than changing the core, a person can just drop in another module.
The isolation between the core and modules is nicer, aesthetically, to me.

The modules would also be exposing Just Another Importer Function, rather
than a specialized API in the builtin imp module. Also note that it is
easier to keep a module *out* of a Python-based application, than it is to
yank functions out of the core of Python. Frozen apps, embedded apps, etc
could easily leave out dynamic loading.

Are there strict advantages? Not any that I can think of right now (beyond
a bit of ease-of-use mentioned above). It just feels better to me.

>...
> > > - sys.path and sys.modules should still exist; sys.path might
> > > have a slightly different meaning
> > 
> > I would suggest that both retain their *exact* meaning. We introduce
> > sys.importers -- a list of importers to check, in sequence. The first
> > importer on that list uses sys.path to look for and load modules. The
> > second importer loads builtins and frozen code (i.e. modules not on
> > sys.path).
> 
> This is looking like the redesign I was looking for.  (Note that
> imputil's current chaining is not good since it's impossible to remove
> or reorder importers, which I think is a required feature; an explicit
> list would solve this.)

The chaining is an aspect of the current, singular import hook that Python
uses. In the past, I've suggested the installation of a "manager" that
maintains a list. sys.importers is similar in practice.

Note that this Manager would be present with the sys.set_import_hook()
scheme, while the Manager is implied if the core scans sys.importers.

> Actually, the order is the other way around, but by now you should
> know that.  It makes sense to have separate ones for builtin and
> frozen modules -- these have nothing in common.

Yes, JimA pointed this out. The latest imputil has corrected this.

I combined the builtin and frozen Importers because they were just so
similar. I didn't want to iterate over two Importers when a single one
sufficed quite well.

*shrug* Could go either way, really.

> There's another issue, which isn't directly addressed by imputil,
> although with clever use of inheritance it might be doable.  I'd like
> more support for this however.  Quite orthogonally to the issue of
> having separate importers, I might want to recognize new extensions.

Correct: while imputil doesn't address this, the standard/default Importer
classes *definitely* can.

>...
> the directory/directories with .isl files are placed.)  This requires
> an ugly modification to the _fs_import() function.  (Which should have
> been a method, by the way, to make overriding it in a subclass of
> PathImporter easier!)

I yanked that code out of the DirectoryImporter so that the PathImporter
could use it. I could see a reorg that creates a FileSystemImporter that
defines the method, and the other two just subclass from that.

> I've been thinking here along the lines of a strategy where the
> standard importer (the one that walks sys.path) has a set of hooks
> that define various things it could look for, e.g. .py files, .pyc
> files, .so or .dll files.  This list of hooks could be changed to
> support looking for .isl files.

Agreed. It should be easy to have a mapping of extension to handler.

One issue: should there be an ordering to the extensions? Exercise for the
reader to alter the data structures...
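
One possible answer to the ordering question, sketched in present-day
Python: keep (suffix, handler) pairs in a list instead of a dict. The
names below (SUFFIX_HANDLERS, find_code, and the toy .spam handler) are
invented for illustration and are not part of imputil:

```python
import os

def load_source(path):
    """Handler for .py files: compile the source text."""
    with open(path, "r") as f:
        return compile(f.read(), path, "exec")

def load_spam(path):
    """Toy handler for a made-up .spam format: expose the raw text as TEXT."""
    with open(path, "r") as f:
        return compile("TEXT = %r" % f.read(), path, "exec")

# A list of pairs, not a dict, so the search order is explicit and
# can be changed by inserting, removing, or reordering entries.
SUFFIX_HANDLERS = [(".py", load_source), (".spam", load_spam)]

def find_code(directory, modname, handlers=SUFFIX_HANDLERS):
    """Return a code object from the first matching suffix, or None."""
    for suffix, handler in handlers:
        path = os.path.join(directory, modname + suffix)
        if os.path.isfile(path):
            return handler(path)
    return None
```

Passing a differently ordered list as the third argument changes which
file wins when a directory holds both m.py and m.spam.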

> There's an old, subtle issue that could be solved through this as
> well: whether or not a .pyc file without a .py file should be accepted
> or not.  Long ago (in Python 0.9.8) a .pyc file alone would never be
> loaded.  This was changed at the request of a small but vocal minority
> of Python developers who wanted to distribute .pyc files without .py
> files.  It has occasionally caused frustration because sometimes
> developers move .py files around but forget to remove the .pyc files,
> and then the .pyc file is silently picked up if it occurs on sys.path
> earlier than where the .py was moved to.

I think, "too bad for them."  :-)

Having just a .pyc is a very nice feature. But how can you tell whether it
was meant to be a plain .pyc or a mis-ordered one? To truly resolve that,
you would need to scan the whole path, looking for a .py. However, maybe
somebody put the .pyc there on purpose, to override the .py!
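
To make the "scan the whole path" point concrete, here is a small sketch
in present-day Python. find_lone_pyc is a hypothetical helper, not
anything in imputil or import.c; it only reports the situation and cannot
tell whether the shadowing was intentional:

```python
import os

def find_lone_pyc(modname, path):
    """Return (pyc_dir, shadowed_py_dir) if a lone .pyc wins the search.

    shadowed_py_dir is None when no .py exists later on the path.
    Returns None when a .py is found first or nothing is found at all.
    """
    for i, directory in enumerate(path):
        has_py = os.path.isfile(os.path.join(directory, modname + ".py"))
        has_pyc = os.path.isfile(os.path.join(directory, modname + ".pyc"))
        if has_py:
            return None  # source found first: nothing ambiguous
        if has_pyc:
            # A lone .pyc wins; the rest of the path must still be
            # scanned to learn whether it hides a .py further along.
            for later in path[i + 1:]:
                if os.path.isfile(os.path.join(later, modname + ".py")):
                    return (directory, later)
            return (directory, None)
    return None
```

Even with the full scan, a result like (dir_a, dir_b) is equally
consistent with a stale leftover and with a deliberate override.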

--- begin slightly-off-topic ---

Here is a neat little Bash script that allows you to use a .pyc as a CGI
(to avoid parse overhead). Normally, you can't just drop a .pyc into the
cgi-bin directory because the OS doesn't know how to execute it. Not a
problem, I say... just append your .pyc to the following Bash script and
execute! :-)

#!/bin/bash
exec - 3< $0 ; exec python -c 'import os,marshal ; f = os.fdopen(3, "rb") ; f.readline() ; f.readline() ; f.seek(8, 1) ; _c = marshal.load(f) ; del os, marshal, f ; exec _c' $@

(the script should be two lines; and no... you can't use readlines(2))

The above script will preserve stdin, stdout, and stderr. If the caller
also uses 3< ... well, that got overridden :-)

The script doesn't work on Windows for two reasons, though: 1) Bash, 2)
the "rb" mode followed by readline()

Detailed info at the bottom of http://www.lyra.org/greg/python/

--- end of off-topic ---

> Having a set of hooks for various extensions would make it possible to
> have a default where lone .pyc files are ignored, but where one can
> insert a .pyc importer in the list of hooks that does the right thing
> here.  (Of course, it may be possible that this whole feature of lone
> .pyc files should be replaced since the same need is easily taken care
> of by zip importers.)

Maybe. I'd still like to see plain .pyc files, but I know I can work
around any change you might make here :-)

(i.e. whatever you'd like to do... go for it)

> I also want to support (Jim A notwithstanding :-) a feature whereby
> different things besides directories can live on sys.path, as long as
> they are strings -- these could be added from the PYTHONPATH env
> variable.  Every piece of code that I've ever seen that uses sys.path
> doesn't care if a directory named in sys.path doesn't exist -- it may
> try to stat various files in it, which also don't exist, and as far as
> it is concerned that is just an indication that the requested module
> doesn't live there.

I'm not in favor of this, but it is more-than-doable. Again: your
discretion...

> Again, we would have to dissect imputil to support various hooks that
> deal with different kind of entities in sys.path.  The default hook
> list would consist of a single item that interprets the name as a
> directory name; other hooks could support zip files or URLs.  Jack's
> "magic cookies" could also be supported nicely through such a
> mechanism.

Specifically, the PathImporter would get "dissected" :-). No problem.
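
A sketch of what such a dissection might look like, in present-day
Python. Each hook inspects one sys.path string and either claims it or
declines. All names here (PATH_HOOKS, DirectoryEntry, ZipEntry,
find_source) are hypothetical, and the sketch reads .py source from the
archive purely to stay short:

```python
import os
import zipfile

class DirectoryEntry:
    """Handles a path entry that names a directory."""
    def __init__(self, directory):
        self.directory = directory

    def get_source(self, modname):
        path = os.path.join(self.directory, modname + ".py")
        if os.path.isfile(path):
            with open(path, "r") as f:
                return f.read()
        return None

class ZipEntry:
    """Handles a path entry that names a zip archive."""
    def __init__(self, archive):
        self.archive = archive

    def get_source(self, modname):
        with zipfile.ZipFile(self.archive) as zf:
            try:
                return zf.read(modname + ".py").decode("utf-8")
            except KeyError:  # no such member in the archive
                return None

def directory_hook(entry):
    return DirectoryEntry(entry) if os.path.isdir(entry) else None

def zip_hook(entry):
    return ZipEntry(entry) if zipfile.is_zipfile(entry) else None

PATH_HOOKS = [directory_hook, zip_hook]

def find_source(modname, path, hooks=PATH_HOOKS):
    """Return module source from the first path entry that has it."""
    for entry in path:
        for hook in hooks:
            handler = hook(entry)
            if handler is None:
                continue  # this hook does not understand the entry
            source = handler.get_source(modname)
            if source is not None:
                return source
            break  # entry was recognized but lacks the module
    return None
```

An entry that no hook recognizes (a missing directory, say) is silently
skipped, which matches the behavior Guido describes for existing code.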

> > Users can insert/append new importers or alter sys.path as before.
> > 
> > sys.modules continues to record name:module mappings.
> 
> Yes.
> 
> Note that the interpretation of __file__ could be problematic.  To
> what value do you set __file__ for a module loaded from a zip archive?

You don't (certainly not in a way that is nice/compatible for modules that
refer to it). This is why I don't like __file__ and __path__. They just
don't make sense in archives or frozen code. Python code that relies on
them will create problems when that code is placed into different
packaging mechanisms.

>...
> > > (I wouldn't mind a splitting up of importdl.c into several
> > > platform-specific files, one of which is chosen by the configure
> > > script; but that's a bit of a separate issue.)
> > 
> > Easy enough. The standard importer can select the appropriate
> > platform-specific module/function to perform the load. i.e. these can move
> > to Modules/ and be split into a module-per-platform.
> 
> Again: what's the advantage of exposing the platform specificity?

See above.

>...
> Probably more support is required from the other end: once it's common
> for modules to be imported from zip files, the distutil code needs to
> support the creation and installation of such zip files.  Also, there
> is a need for the install phase of distutil to communicate the
> location of the zip file to the Python installation.

I'm quite confident that something can be designed that would satisfy the
needs here. Something akin to .pth files that a zip importer could read.

>...
> > > - Standard import from zip or jar files, in two ways:
> > > 
> > >   (1) an entry on sys.path can be a zip/jar file instead of a directory;
> > >       its contents will be searched for modules or packages
> 
> Note that this is what I mention above for distutil support.
> 
> > While this could easily be done, I might argue against it. Old
> > apps/modules that process sys.path might get confused.
> 
> Above I argued that this shouldn't be a problem.

For most code, no, but as Fred mentioned (and I surmise), there are things
out there assuming that sys.path contains strings which specify
directories.

Sure, we can do this (your discretion), but my feeling is to avoid it.

> > If compatibility is not an issue, then "No problem."
> > 
> > An alternative would be an Importer instance added to sys.importers that
> > is configured for a specific archive (in other words, don't add the zip
> > file to sys.path, add ZipImporter(file) to sys.importers).
> 
> This would be harder for distutil: where does Python get the initial
> list of importers?

Default is just the two: BuiltinImporter and PathImporter. Adding
ZipImporters (or anything else) at startup is TBD, but shouldn't pose a
problem.

>...
> > >   (2) a file in a directory that's on sys.path can be a zip/jar file;
> > >       its contents will be considered as a package (note that this is
> > >       different from (1)!)
> > 
> > No problem. This will slow things down, as a stat() for *.zip and/or *.jar
> > must be done, in addition to *.py, *.pyc, and *.pyo.
> 
> Fine, this is where the caching comes in handy.

IFF caching is enabled for the particular platform and installation.

>...
> > The Importer class is already designed for subclassing (and its interface 
> > is very narrow, which means delegation is also *very* easy; see
> > imputil.FuncImporter).
> 
> But maybe it's *too* narrow; some of the hooks I suggest above seem to
> require extra interfaces -- at least in some of the subclasses of the
> Importer base class.

Correct -- the *subclasses*. I still maintain the imputil design of a
single hook (get_code) is Right.

I'll make a swipe at PathImporter in the next few weeks to add the
capability for new extensions.

> Note: I looked at the doc string for get_code() and I don't understand
> what the difference is between the modname and fqname arguments.  If I
> write "import foo.bar", what are modname and fqname?  Why are both
> present?  Also, while you claim that the API is narrow, the multiple
> return values (also the different types for the second item) make it
> complicated.

Gordon detailed this in another note...

Yes, the multiple return values make it a bit more complicated, but I
can't think of any reasonable alternatives.

A bit more doc should do the trick, I'd guess.

>...
> > >   - a hook to auto-generate .py files from other filename
> > >     extensions (as currently implemented by ILU)
> > 
> > No problem at all.
> 
> See above -- I think this should be more integrated with sys.path than
> you are thinking of.  The more I think about it, the more I see that
> the problem is that for you, the importer that uses sys.path is a
> final subclass of Importer (i.e. it is itself not further subclassed).
> Several of the hooks I want seem to require additional hooks in the
> PathImporter rather than new importers.

Correct -- I've currently designed/implemented PathImporter as "final".

I don't foresee a problem turning it into something that can be hooked at
run-time, or subclassed at code-time. A detailing of the features needed 
would be handy:

* allow alternative file suffixes, with functions or subclasses to map the
  file into a code/module object.

>...
> > > - Note that different kinds of hooks should (ideally, and within
> > >   reason) properly combine, as follows: if I write a hook to recognize
> > >   .spam files and automatically translate them into .py files, and you
> > >   write a hook to support a new archive format, then if both hooks are
> > >   installed together, it should be possible to find a .spam file in an
> > >   archive and do the right thing, without any extra action.  Right?
> > 
> > Ack. Very, very difficult.
> 
> Actually, I take most of this back.  Importers that deal with new
> extension types often have to go through a file system to transform
> their data to .py files, and this is just too complicated.  However it
> would be still nice if there was code sharing between the code that
> looks for .py and .pyc files in a zip archive and the code that does
> the same in a filesystem.  Hm, maybe even that shouldn't be necessary,
> the zip file probably should contain only .pyc files...

Gordon replies to this... All of the archives that Gordon, JimA, and I
have been using only store .pyc files. I don't see much code sharing
between the filesystem and archive import code.

>...
> > All is not lost, however. I can easily envision the get_code() hook as
> > allowing any kind of return type. If it isn't a code or module object,
> > then another hook is called to transform it.
> > [ actually, I'd design it similarly: a *series* of hooks would be called
> >   until somebody transforms the foo.spam into a code/module object. ]
> 
> OK.  This could be a feature of a subclass of Importer.

That would be my preference, rather than loading more into the Importer
base class itself.
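
The "series of hooks" idea could be sketched roughly as follows, in
present-day Python. The tagged tuples, the TRANSFORMS list, and to_code()
are all invented for illustration; imputil has no such API:

```python
import types

def spam_to_source(obj):
    """Toy hook: turn ("spam", text) into ("source", python_source)."""
    if isinstance(obj, tuple) and len(obj) == 2 and obj[0] == "spam":
        return ("source", "greeting = %r" % (obj[1],))
    return None

def source_to_code(obj):
    """Hook: turn ("source", python_source) into a code object."""
    if isinstance(obj, tuple) and len(obj) == 2 and obj[0] == "source":
        return compile(obj[1], "<transformed>", "exec")
    return None

TRANSFORMS = [spam_to_source, source_to_code]

def to_code(obj, transforms=TRANSFORMS, max_steps=10):
    """Apply hooks until obj is a code or module object."""
    steps = 0
    while not isinstance(obj, (types.CodeType, types.ModuleType)):
        if steps == max_steps:
            raise ImportError("too many transformation steps")
        steps += 1
        for transform in transforms:
            result = transform(obj)
            if result is not None:
                obj = result
                break
        else:
            raise ImportError("no hook could transform %r" % (obj,))
    return obj
```

A subclass's get_code() could return whatever it found and pass it
through to_code(), leaving the base Importer untouched.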

>...
> > > - It should be possible to write hooks in C/C++ as well as Python
> > 
> > Use FuncImporter to delegate to an extension module.
> 
> Maybe not so great, since it sounds like the C code can't benefit from
> any of the infrastructure that imputil offers.  I'm not sure about
> this one though.

There isn't any infrastructure that needs to be accessed. get_code() is
the call-point, and there is no mechanism provided to the callee to call
back into the imputil system.

> > This is one of the benefits of imputil's single/narrow interface.
> 
> Plus its vague specs? :-)

Ouch. I thought I was actually doing quite a bit better than normal with
that long doc-string on get_code :-(

>...
> > For a restricted execution app, it might install an Importer that loads
> > files from *one* directory only which is configured from a specific
> > Win32 Registry entry. That importer could also refuse to load shared
> > modules. The BuiltinImporter would still be present (although the app
> > would certainly omit all but the necessary builtins from the build).
> > Frozen modules could be excluded.
> 
> Actually there's little reason to exclude frozen modules or any
> .py/.pyc modules -- by definition, bytecode can't be dangerous.  It's
> the builtins and extensions that need to be censored.
> 
> We currently do this by subclassing ihooks, where we mask the test for
> builtins with a comparison to a predefined list of names.

True. My concern is an invader misusing one "type" of module for another.
For example, let's say you've provided a selection of modules each
exporting function FOO, and the user can configure which module to use.
Can they do damage if some unrelated, frozen module also exports FOO?

Minor issue, anyhow. All the functionality is there.
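
For reference, the masking Guido describes amounts to something like the
following sketch (present-day Python; ALLOWED_BUILTINS and its contents
are a hypothetical policy, and this is not the ihooks subclass itself):

```python
import sys

# Hypothetical policy: the only builtin modules restricted code may see.
ALLOWED_BUILTINS = frozenset(["math", "time"])

def is_builtin_allowed(name, allowed=ALLOWED_BUILTINS):
    """True only if name is a real builtin module AND on the allowed list."""
    return name in allowed and name in sys.builtin_module_names
```

The importer's "is this a builtin?" test is replaced by
is_builtin_allowed(), so anything off the list simply looks absent.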

>...
> > I posited once before that the cost of import is mostly I/O rather than
> > CPU, so using Python should not be an issue. MAL demonstrated that a good
> > design for the Importer classes is also required. Based on this, I'm a
> > *strong* advocate of moving as much as possible into Python (to get
> > Python's ease-of-coding with little relative cost).
> 
> Agreed.  However, how do you explain the slowdown (from 9 to 13
> seconds I recall) though?  Are you a lousy coder? :-)

Heh :-)

I have not spent *any* time working on optimization. Currently, each
Importer in the chain redoes some work of the prior Importer. A bit of
restructuring would split the common work out to a Manager, which then
calls a method in the Importer (and passes all the computed work). Of
course, a bit of profiling wouldn't hurt either. Some of the "imp"
interfaces could possibly be refined to better support the BuiltinImporter
or the dynamic load features.

The question is still valid, though -- at the moment, I can't explain it
because I haven't looked into it.

> > The (core) C code should be able to search a path for a module and import
> > it. It does not require dynamic loading or packages. This will be used to
> > import exceptions.py, then imputil.py, then site.py.

Note: after writing this, I realized there is really no need for the core
to do the imputil import. site.py can easily do that.

> It does, however, need to import builtin modules.  imputil currently

Correct.

> imports imp, sys, strop and __builtin__, struct and marshal; note that
> struct can easily be a dynamic loadable module, and so could strop in
> theory.  (Note that strop will be unnecessary in 1.6 if you use string
> methods.)

I knew about strop, but imputil would be harder to use today if it relied
on the string methods. So... I've delayed that change.

The struct module is used in a couple teeny cases, dealing with
constructing a network-order, 4-byte, binary integer value. It would be
easy enough to just do that with a bit of Python code instead.
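
For illustration, the struct-free version is only a few shifts and masks.
This sketch uses present-day Python (bytes objects) and assumes the value
is an unsigned 32-bit integer; the function names are made up:

```python
def pack_u32_be(value):
    """Encode an unsigned 32-bit integer as 4 bytes, network (big-endian) order."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value out of range for 32 bits: %r" % (value,))
    return bytes([(value >> 24) & 0xFF,
                  (value >> 16) & 0xFF,
                  (value >> 8) & 0xFF,
                  value & 0xFF])

def unpack_u32_be(data):
    """Decode 4 bytes in network order back into an integer."""
    if len(data) != 4:
        raise ValueError("expected exactly 4 bytes")
    b0, b1, b2, b3 = data  # iterating bytes yields ints
    return (b0 << 24) | (b1 << 16) | (b2 << 8) | b3
```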

> I don't think that this chicken-or-egg problem is particularly
> problematic though.

Right.

In my ideal world, the core couldn't do a dynamic load, so that would need
to be considered within the bootstrap process.

>...
> > site.py can complete the bootstrap by setting up sys.importers with the
> > appropriate Importer instances (this is where an application can define
> > its own policy). sys.path was initially set by the import.c bootstrap code
> > (from the compiled-in path and environment variables).
> 
> I think that algorithm (currently in getpath.c / getpathp.c) might
> also be moved to Python code -- imported frozen.  Sadly, rebuilding
> with a new version of a frozen module might be more complicated than
> rebuilding with a new version of a C module, but writing and
> maintaining this code in Python would be *sooooooo* much easier that I
> think it's worth it.

I think we can find a better way to freeze modules and to use them.
Especially for the cases where we have specific "core" functions
implemented in Python. (e.g. freezing parsers, compilers, and/or the
read-eval loop)

I don't foresee an issue that the build process becomes more complicated.
If we nuke "makesetup" in favor of a Python script, then we could create a
stub Python executable which runs the build script which writes the Setup
file and the getpath*.c file(s).

> > Note that imputil.py would not install any hooks when it is loaded. That
> > is up to site.py. This implies the core C code will import a total of
> > three modules using its builtin system. After that, the imputil mechanism
> > would be importing everything (site.py would .install() an Importer which
> > then takes over the __import__ hook).
> 
> (Three not counting the builtin modules.)

Correct, although I'll modify my statement to "two plus the builtins".

> > Further note that the "import" Python statement could be simplified to use
> > only the hook. However, this would require the core importer to inject
> > some module names into the imputil module's namespace (since it couldn't
> > use an import statement until a hook was installed). While this
> > simplification is "neat", it complicates the run-time system (the import
> > statement is broken until a hook is installed).
> 
> Same chicken-or-egg.  We can be pragmatic.
> 
> For a developer, I'd like a bit of robustness (all this makes it
> rather hard to debug a broken imputil, and that's a fair amount of
> code!).

True. I threw that out as an alternative, and then presented the counter
argument :-)

>...
> > Therefore, the core C code must also support importing builtins. "sys" and
> > "imp" are needed by imputil to bootstrap.
> > 
> > The core importer should not need to deal with dynamic-load modules.
> 
> Same question.  Since that all has to be coded in C anyway, why not?

It simplifies the core's import code to not deal with that stuff at all.

> > To support frozen apps, the core importer would need to support loading
> > the three modules as frozen modules.
> 
> I'd like to see a description of how someone like Jim A would build a
> single-file application using the new mechanism.  This could
> completely replace freeze.  (Freeze currently requires a C compiler;
> that's bad.)

The portable mechanism for freezing will always need a compiler. Platform
specific mechanisms (e.g. append to the .EXE, or use the linker to create
a new ELF section) can optimize the freeze process in different ways.

I don't have a design in my head for the freeze issues -- I've been
considering that the mechanism would remain about the same. However, I can
easily see that different platforms may want to use different freeze
processes... hmm...

>...
> > Yes. I don't see this as a requirement, though. We wouldn't start to use
> > these by default, would we? Or insist on zlib being present? I see this as
> > more along the lines of "we have provided a standardized Importer to do
> > this, *provided* you have zlib support."
> 
> Agreed.  Zlib support is easy to get, but there are probably platforms
> where it's not.  (E.g. maybe the Mac?  I suppose that on the Mac,
> there would be some importer classes to import from a resource fork.)

Exactly. And importer classes to load from Win32 resources (modifying a
.EXE's resources post-link is cleaner than the append solution).

>...
> > My outline above does not freeze anything. Everything resides in the
> > filesystem. The C code merely needs a path-scanning loop and functions to
> > import .py*, builtin, and frozen types of modules.
> 
> Good.  Though I think there's also a need for freezing everything.
> And when we go the route of the zip archive, the zip archive handling
> code needs to be somewhere -- frozen seems to be a reasonable choice.

Sure.

> > If somebody nukes their imputil.py or site.py, then they return to Python
> > 1.4 behavior where the core interpreter uses a path for importing (i.e. no
> > packages). They lose dynamically-loaded module support.
> 
> But if the path guessing is also done by site.py (as I propose) the
> path will probably be wrong.  A warning should be printed.

All right. Doesn't Python already print a warning if it can't find
site.py?

> > > Let's first complete the requirements gathering.  Are these
> > > requirements reasonable?  Will they make an implementation too
> > > complex?  Am I missing anything?
> > 
> > I'm not a fan of the compositing due to it requiring a change to semantics
> > that I believe are very useful and very clean. However, I outlined a
> > possible, clean solution to do that (a secondary set of hooks for
> > transforming get_code() return values).
> 
> As you may see from my responses, I'm a big fan of having several
> different sets of hooks.

Yes. However, I've only recognized one so far. Propose more... I'm
confident we can update the PathImporter design to accommodate (and retain
the underlying imputil paradigm).

> I do withdraw the composition requirement
> though.

:-)

>...
> > Once you hit site.py, you have a "full" environment and can easily detect
> > and import a read-eval-print loop module (i.e. why return to Python? just 
> > start things up right there).
> 
> You mean "why return to C?"  I agree.  It would be cool if somehow

Heh. Yah, that's what I meant :-)

> IDLE and Pythonwin would also be bootstrapped using the same
> mechanisms.  (This would also solve the question "which interactive
> environment am I using?" that some modules and apps want to see
> answered because they need to do things differently when run under
> IDLE, for example.)

Haven't thought on this. Should be doable, I'd think.

> > site.py can also install new optimizers as desired, a new Python-based
> > parser or compiler, or whatever...  If Python is built without a parser or
> > compiler (I hope that's an option!), then the three startup modules would
> > simply be frozen into the executable.
> 
> More power to hooks!

:-) You betcha!

I believe my next order of business:

* update PathImporter with the file-extension hook
* dynload C code reorg, per the other email
* create new-model site.py and trash import.c
* review freeze mechanisms and process
* design mechanism for frozen core functionality (e.g. getpath*.c)
  (coding and building design)
* shift core functions to Python, using above design

I'll just plow ahead, but also recognize that any/all may change. i.e. I'll
build examples/finals/prototypes and Guido can pick/choose/reimplement/etc
as needed. I'm out next week, but should start on the above items by the
end of the month (will probably do another mod_dav release in there
somewhere).

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From fredrik at pythonware.com  Fri Dec  3 11:10:10 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 3 Dec 1999 11:10:10 +0100
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com>
Message-ID: <023601bf3d78$0ec3dc30$f29b12c2@secret.pythonware.com>

Jean-Claude Wippler <jcw at equi4.com> wrote:
> This may be off-topic, but has anyone considered what it would take to
> load shared libs out of an archive?

well, we do that in a number of applications.

(lazy installers are really cool... if you've installed works,
you've seen some weird stuff -- for example, when the
application starts the first time, it's loading everything
from inside the installer.  the rest of the installation is
done from within the application itself, using archives
in the installation executable)

I think things like this are better left for the application
designers, though...

</F>


From mal at lemburg.com  Fri Dec  3 11:03:31 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 03 Dec 1999 11:03:31 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>
Message-ID: <38479573.B2CFDD2B@lemburg.com>

Greg Stein wrote:
> 
> On Thu, 2 Dec 1999, M.-A. Lemburg wrote:
> >...
> > Still, I would like to rephrase my 0.02EUR which I already
> > posted twice... why not start to think about what these
> > importers would do first ? If there are only a handful of
> > wishes we could just add them to the builtin machinery and
> > be done with it...
> 
> I'd rather see the builtin machinery move to Python, regardless of what
> system is used and/or what features are added.

In the long run that's probably the right direction, but right now
we are only talking a very small set of additional features,
which can easily be added to the existing code without too much
fuss.

Plus it won't slow things down, which is important since
Python startup time is already an issue all by itself. The
imputil.py approach of doing (a whole bunch of) recursive Python
function calls to all kinds of importers will not speed this up,
I'm afraid. An on-disk lookup table would speed this up, but
it would also break the current logic in imputil.py, which
puts importer independence above all.

--

IMHO, we should retreat to a more centralized interface,
one which more resembles a manager rather than the agent
interface implemented in imputil.py. Add-ons can then
register themselves to say "hey, I can handle pyz-archives"
or "I know how to import .so modules" or "I provide a
search function which you can call to have me scan
my module container (directory, web-site, archive)".

The manager would take care of what to call and in which
order, plus delegate requests to add-ons which implement
the needed logic, e.g. add-ons for signature checking, unzipping
archives, file system lookup tables, etc.

It could also trace its actions and then keep an on-disk
knowledge base for what it did in the past to find certain
modules under certain conditions.

Anyway, all this is extra magic for some future version of
Python.
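
A very rough sketch of such a manager, in present-day Python. The class
and method names are invented, and the "knowledge base" is kept in memory
here as a simplifying assumption instead of on disk:

```python
class ImportManager:
    """Central registry: add-ons register search functions by label."""
    def __init__(self):
        self.searchers = []  # (label, search_function) in call order
        self.found_by = {}   # module name -> label that found it last time

    def register(self, label, search):
        """search(name) returns a location, or None if it has no such module."""
        self.searchers.append((label, search))

    def locate(self, name):
        """Return (label, location) for name, or None if nobody has it."""
        remembered = self.found_by.get(name)
        # Stable sort: the add-on that succeeded last time is tried
        # first; everything else keeps its registration order.
        ordered = sorted(self.searchers,
                         key=lambda item: item[0] != remembered)
        for label, search in ordered:
            location = search(name)
            if location is not None:
                self.found_by[name] = label
                return (label, location)
        return None
```

The manager decides what to call and in which order; the add-ons only
answer "do you have this module?".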

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    28 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido at CNRI.Reston.VA.US  Fri Dec  3 14:45:07 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 03 Dec 1999 08:45:07 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Your message of "Fri, 03 Dec 1999 11:03:31 +0100."
             <38479573.B2CFDD2B@lemburg.com> 
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>  
            <38479573.B2CFDD2B@lemburg.com> 
Message-ID: <199912031345.IAA16376@eric.cnri.reston.va.us>

[Greg]
> > I'd rather see the builtin machinery move to Python, regardless of what
> > system is used and/or what features are added.

[Marc]
> In the long run that's probably the right direction, but right now
> we are only talking a very small set of additional features,
> which can easily be added to the existing code without too much
> fuss.

I disagree.  We should do the redesign right rather than tweaking the
existing code.

> Plus it won't slow things down, which is important since
> Python startup time is already an issue all by itself. The
> imputil.py approach of doing (a whole bunch of) recursive Python
> function calls to all kinds of importers will not speed this up,
> I'm afraid. An on-disk lookup table would speed this up, but
> it would also break the current logic in imputil.py, which
> puts importer independence above all.

I don't care about the current logic in imputil.  It's only a prototype!

> IMHO, we should retreat to a more centralized interface,
> one which more resembles a manager rather than the agent
> interface implemented in imputil.py. Add-ons can then
> register themselves to say "hey, I can handle pyz-archives"
> or "I know how to import .so modules" or "I provide a
> search function which you can call to have me scan
> my module container (directory, web-site, archive)".

This makes sense.

> The manager would take care of what to call and in which
> order, plus delegate requests to add-ons which implement
> the needed logic, e.g. add-ons for signature checking, unzipping
> archives, file system lookup tables, etc.
> 
> It could also trace its actions and then keep an on-disk
> knowledge base for what it did in the past to find certain
> modules under certain conditions.
> 
> Anyway, all this is extra magic for some future version of
> Python.

I would say the manager API design and a basic set of specific
handlers should go into 1.6.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik at pythonware.com  Fri Dec  3 15:14:00 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 3 Dec 1999 15:14:00 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>             <38479573.B2CFDD2B@lemburg.com>  <199912031345.IAA16376@eric.cnri.reston.va.us>
Message-ID: <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com>

MAL wrote:
> > IMHO, we should retreat to a more centralized interface,
> > one which more resembles a manager rather than the agent
> > interface implemented in imputil.py. Add-ons can then
> > register themselves to say "hey, I can handle pyz-archives"
> > or "I know how to import .so modules" or "I provide a
> > search function which you can call to have me scan
> > my module container (directory, web-site, archive)".

but why?  in my small-minded view of how python
works, an importer carries out a very simple task:

    given a name, check if you have a
    module with that name, and install
    it.  if you cannot, fail (in which case
    python asks the next importer along
    the path).

why do you have to complicate things beyond that?
why not just let Python provide a few base classes
and mixins for people who want to create custom
importers, and be done with it?

rationale, please.

</F>


From jim at interet.com  Fri Dec  3 15:34:40 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 03 Dec 1999 09:34:40 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org> <38479573.B2CFDD2B@lemburg.com>
Message-ID: <3847D500.53833D06@interet.com>

"M.-A. Lemburg" wrote:
> 
> Greg Stein wrote:

> > I'd rather see the builtin machinery move to Python, regardless of what
> > system is used and/or what features are added.
> 
> In the long run that's probably the right direction, but right now
> we are only talking a very small set of additional features,
> which can easily be added to the existing code without too much
> fuss.

I volunteer to write a Python archive in either Python or C.  In
fact I currently have prototypes for both.  But I have to agree
with Greg here.  I think a Python importer is the way to go.  The
C code is 300 lines mostly in import.c and parallel to existing code.
The Python archive is about 100 lines and is prettier, easy to read,
alter and re-use (obviously).

> Plus it won't slow things down, which is important since
> Python startup time is already an issue all by itself. The

I think archive files should be able to be fast, and should
help, not hurt, startup time.  Provided that the use of sys.path
is curtailed, os.readdir() is not needed, and the
specifications are not complicated.

Although archive files are my special concern, I realize that
imputil is not just about archives.

JimA


From guido at CNRI.Reston.VA.US  Fri Dec  3 15:39:25 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 03 Dec 1999 09:39:25 -0500
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: Your message of "Thu, 02 Dec 1999 19:19:40 PST."
             <Pine.LNX.4.10.9912021654100.18529-100000@nebula.lyra.org> 
References: <Pine.LNX.4.10.9912021654100.18529-100000@nebula.lyra.org> 
Message-ID: <199912031439.JAA16524@eric.cnri.reston.va.us>

Greg,

Great response.  I think we know where we each stand.  Please go ahead
with a new design.  (That's trust, not carte blanche.)

Just one thought: the more I think about it, the less I like
sys.importers: functionality which is implemented through
sys.importers must necessarily be placed either in front of all of
sys.path or after it.  While this is helpful for "canned" apps that
want *everything* to be imported from a fixed archive, I think that
for regular Python installations sys.path should remain the point of
attack.  In particular, installing a new package (e.g. PIL) should
affect sys.path, regardless of the way of delivery of the modules
(shared libs, .py files, .pyc files, or a zip archive).
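
A minimal sketch of what "a zip archive on sys.path" looks like in
practice.  It assumes a later Python that ships the standard zipimport
machinery, which did not exist when this thread was written; the
archive and module names are made up for illustration.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a throwaway archive holding one module (names are hypothetical).
archive = os.path.join(tempfile.mkdtemp(), "bundle.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("zipdemo_mod.py", "VALUE = 42\n")

# Delivery only touches sys.path, exactly as for a directory.
sys.path.insert(0, archive)
importlib.invalidate_caches()

zipdemo_mod = importlib.import_module("zipdemo_mod")
print(zipdemo_mod.VALUE)
print(zipdemo_mod.__file__)   # a path that starts with the archive name
```

The point of the sketch is only that the importing code never needs to
know whether the path entry is a directory or an archive.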

I'm not too worried about code that inspects sys.path and expects
certain invariants; that code is most likely interfering with the
import mechanism so should be revisited anyway.

On the lone .pyc issue: I'd like to see this disappear when using the
filesystem, I see no use for it there if we support .pyc files in zip
archives.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim at interet.com  Fri Dec  3 15:44:54 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 03 Dec 1999 09:44:54 -0500
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com>
Message-ID: <3847D766.1E5FFAF3@interet.com>

Jean-Claude Wippler wrote:
> 
> Guido van Rossum wrote:
> 
> [...]
> > Note that the interpretation of __file__ could be problematic.  To
> > what value do you set __file__ for a module loaded from a zip archive?
> 
> Makefiles use "archive(entry)" (this also supports nesting if needed).

I discovered the hard way this entry is not optional.  I just
used the archive file name for __file__.

> This may be off-topic, but has anyone considered what it would take to
> load shared libs out of an archive?  One way is to extract on-the-fly to
> a temporary area.  A refinement is to leave extracted files there as
> cache, and perhaps even to extract to a file with a name derived from
> its MD5 digest (this way multiple users and even Python installations
> can share the cache).  Would it be useful to define a "standard" area?

IMHO putting shared libs in an archive is a bad idea because the OS
can not use them there.  They must be extracted as you say.  But then
storage is wasted by using space in the archive and the external file.
Deleting them after use wastes time.  Better to leave them out of the
archive and provide for them in the installer.  IMHO the
archive is a basic simple feature, and people make installers on top
of that.  Archives shouldn't try to do it all.
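
For reference, the extract-to-cache scheme Jean-Claude describes could
be sketched roughly as below.  This is an illustration under stated
assumptions, not a proposal: extract_to_cache() and the cache directory
are hypothetical names, and the bytes stand in for a real shared
library pulled out of an archive.

```python
import hashlib
import os
import tempfile

def extract_to_cache(data, suffix, cache_dir):
    """Write data to cache_dir under a name derived from its MD5 digest.

    Identical content always maps to the same file name, so a second
    request (or a second user sharing the directory) reuses the file.
    """
    digest = hashlib.md5(data).hexdigest()
    target = os.path.join(cache_dir, digest + suffix)
    if not os.path.exists(target):
        with open(target, "wb") as f:
            f.write(data)
    return target

cache_dir = tempfile.mkdtemp()          # stand-in for a "standard" area
payload = b"fake shared library bytes"  # stand-in for an archive entry
first = extract_to_cache(payload, ".so", cache_dir)
second = extract_to_cache(payload, ".so", cache_dir)
print(first == second, len(os.listdir(cache_dir)))
```

The storage cost raised above is visible here: the bytes live both in
the archive and in the cache file.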

JimA


From mal at lemburg.com  Fri Dec  3 15:14:09 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 03 Dec 1999 15:14:09 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>  
	            <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us>
Message-ID: <3847D030.2C936E24@lemburg.com>

Guido van Rossum wrote:
> 
> [Greg]
> > > I'd rather see the builtin machinery move to Python, regardless of what
> > > system is used and/or what features are added.
> 
> [Marc]
> > In the long run that's probably the right direction, but right now
> > we are only talking a very small set of additional features,
> > which can easily be added to the existing code without too much
> > fuss.
> 
> I disagree.  We should do the redesign right rather than tweaking the
> existing code.

Ok, then...
 
> > IMHO, we should retreat to a more centralized interface,
> > one which more resembles a manager rather than the agent
> > interface implemented in imputil.py. Add-ons can then
> > register themselves to say "hey, I can handle pyz-archives"
> > or "I know how to import .so modules" or "I provide a
> > search function which you can call to have me scan
> > my module container (directory, web-site, archive)".
> 
> This makes sense.
> 
> > The manager would take care of what to call and in which
> > order, plus delegate requests to add-ons which implement
> > the needed logic, e.g. add-ons for signature checking, unzipping
> > archives, file system lookup tables, etc.
> >
> > It could also trace its actions and then keep an on-disk
> > knowledge base for what it did in the past to find certain
> > modules under certain conditions.
> >
> > Anyway, all this is extra magic for some future version of
> > Python.
> 
> I would say the manager API design and a basic set of specific
> handlers should go into 1.6.

BTW, is there a timeline for the 1.6 release ? I mean which
things will have to be in 1.6 ?

Some recent topics as hints:

1. Unicode
2. Import Manager API + default handlers
3. Python style coercion at C type level
4. Rich comparisons
5. __doc__ string extraction tool

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    28 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal at lemburg.com  Fri Dec  3 15:24:04 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 03 Dec 1999 15:24:04 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>             <38479573.B2CFDD2B@lemburg.com>  <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com>
Message-ID: <3847D284.8CBF2A9C@lemburg.com>

Fredrik Lundh wrote:
> 
> MAL wrote:
> > > IMHO, we should retreat to a more centralized interface,
> > > one which more resembles a manager rather than the agent
> > > interface implemented in imputil.py. Add-ons can then
> > > register themselves to say "hey, I can handle pyz-archives"
> > > or "I know how to import .so modules" or "I provide a
> > > search function which you can call to have me scan
> > > my module container (directory, web-site, archive)".
> 
> but why?  in my small-minded view of how python
> works, an importer carries out a very simple task:
> 
>     given a name, check if you have a
>     module with that name, and install
>     it.  if you cannot, fail (in which case
>     python asks the next importer along
>     the path).
> 
> why do you have to complicate things beyond that?
> why not just let Python provide a few base classes
> and mixins for people who want to create custom
> importers, and be done with it?

Because importing in Python has become *much* more
complicated over time. There are requests for new
features which touch subjects such as storage mechanisms,
lookups, signatures (for trusted code), lazy imports, etc.

A chain of simple minded importers won't work together
too well, duplicate work and downgrade performance
considerably due to the many recursive function calls.
Also, centralized caching strategies are hard to implement
across import handlers.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    28 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From jeremy at cnri.reston.va.us  Fri Dec  3 17:47:54 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Fri, 3 Dec 1999 11:47:54 -0500 (EST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <14406.58137.359127.921135@weyr.cnri.reston.va.us>
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org>
	<199912022043.PAA15108@eric.cnri.reston.va.us>
	<14406.58137.359127.921135@weyr.cnri.reston.va.us>
Message-ID: <14407.62522.360386.757519@goon.cnri.reston.va.us>

>>>>> "FLD" == Fred L Drake, <fdrake at acm.org> writes:

  >> (Unrelated remark: I should really try to release the set of
  >> modules we've written here at CNRI to deal with zip files.
  >> Unfortunately zip files are hairy and so is our code.)

  FLD>   It doesn't help that that code just plain stinks.  I maintain
  FLD> that no one here understands the whole of it.

I'm all for improving the code and getting it out.  The real problem
is that interfaces have been glommed on for every new use of a Zip
file.  (You want to read one off a socket and extract files before
you've got the whole thing?  No problem! Add a new class.)  We need to
figure out the common patterns for using the archives and write a new
set of interfaces to support that.

Jeremy


From guido at CNRI.Reston.VA.US  Fri Dec  3 18:12:07 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 03 Dec 1999 12:12:07 -0500
Subject: [Python-Dev] What to do with our Zip code?
In-Reply-To: Your message of "Fri, 03 Dec 1999 11:47:54 EST."
             <14407.62522.360386.757519@goon.cnri.reston.va.us> 
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> <199912022043.PAA15108@eric.cnri.reston.va.us> <14406.58137.359127.921135@weyr.cnri.reston.va.us>  
            <14407.62522.360386.757519@goon.cnri.reston.va.us> 
Message-ID: <199912031712.MAA17061@eric.cnri.reston.va.us>

[Jeremy, on our Zip code]
> I'm all for improving the code and getting it out.  The real problem
> is that interfaces have been glommed on for every new use of a Zip
> file.  (You want to read one off a socket and extract files before
> you've got the whole thing?  No problem! Add a new class.)  We need to
> figure out the common patterns for using the archives and write a new
> set of interfaces to support that.

If we gave you the code we currently have, would someone else in this
forum be willing to redesign it?  Eventually it would become part of
the Python distribution.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik at pythonware.com  Sat Dec  4 10:54:30 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat, 4 Dec 1999 10:54:30 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>             <38479573.B2CFDD2B@lemburg.com>  <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com>
Message-ID: <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com>

M.-A. Lemburg <mal at lemburg.com> wrote:
> >     given a name, check if you have a
> >     module with that name, and install
> >     it.  if you cannot, fail (in which case
> >     python asks the next importer along
> >     the path).
> > 
> > why do you have to complicate things beyond that?
> > why not just let Python provide a few base classes
> > and mixins for people who want to create custom
> > importers, and be done with it?
> 
> Because importing in Python has become *much* more
> complicated over time. There are requests for new
> features which touch subjects such as storage mechanisms,
> lookups, signatures (for trusted code), lazy imports, etc.

sorry, I still don't understand it.  our applications already
use different storage mechanisms, databases, signatures,
lazy importing, version handling, etc, etc.  now, if *we*
have managed to build all that on top of an old version
of imputil.py, how come it's not sufficient for the rest
of you?

> A chain of simple minded importers won't work together
> too well

why?  it sure works for us...

> duplicate work

avoiding duplicate work is what object oriented design
is all about.  and last time I checked, Python had excellent
support for that.

> and downgrade performance considerably due to the
> many recursive function calls

now that's what I call premature optimization.  and this
scares the hell out of me: if the rest of the python-dev
crowd don't seriously believe that Python is (or can be
made) fast enough to implement things like this, why
the heck are you using Python at all?  am I the only
one here who doesn't believe in ousterhout's talk about
"the great system vs. scripting language divide"?

</F>


From fredrik at pythonware.com  Sat Dec  4 10:54:42 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat, 4 Dec 1999 10:54:42 +0100
Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT]
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com> <3847D766.1E5FFAF3@interet.com>
Message-ID: <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com>

James C. Ahlstrom <jim at interet.com> wrote:
> IMHO putting shared libs in an archive is a bad idea because the OS
> can not use them there.  They must be extracted as you say.  But then
> storage is wasted by using space in the archive and the external file.
> Deleting them after use wastes time.  Better to leave them out of the
> archive and provide for them in the installer.  IMHO the
> archive is a basic simple feature, and people make installers on top
> of that.  Archives shouldn't try to do it all.

have you tried it?  if not, why do you think you should
be allowed to forbid others from doing it?

in "the inmates are running the asylum", alan cooper
points out that the *major* reason people all over the
world love web applications is that there are no
bloody installers.  and here you are advocating that
we all should be forced to use installers, when python
makes it trivial to write self-installing apps. double-argh!

(on the other hand, why do I complain? all pythonworks
customers are going to be able to do all this anyway...).

<rant size="major">

frankly, this "design by committee" (or is it "design by
people who've never even been close to implementing
something because they thought it was too hard, and
thus think they're qualified to argue against those of
us who didn't even realize that it was a hard problem"?)
trend I've been seeing in all kinds of python forums
makes me sooooo sad.  the more of this I see (dist-
utils-sig, doc-sig, here, c.l.python), the sadder I get,
and the more I sympathise with John Skaller who's
defining his own python-like universe...

if someone needs me, I'll be down in the pub having
a beer with the mad scientist, the shiny eff-bot, and
mr. nitpicker.  if we're not there, you'll find us in the
lab, working on new string matching facilities for 1.6,
SOAP [1], tkinter replacements for the masses, and
whatever else we can come up with...  see you!

</rant>

1) http://www.newsalert.com/bin/story?StoryId=Coenz0bWbu0znmdKXqq


From gstein at lyra.org  Sat Dec  4 11:42:27 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 02:42:27 -0800 (PST)
Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order)
In-Reply-To: <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com>
Message-ID: <Pine.LNX.4.10.9912040232240.18529-100000@nebula.lyra.org>

On Sat, 4 Dec 1999, Fredrik Lundh wrote:
> M.-A. Lemburg <mal at lemburg.com> wrote:
>...
> > Because importing in Python has become *much* more
> > complicated over time. There are requests for new
> > features which touch subjects such as storage mechanisms,
> > lookups, signatures (for trusted code), lazy imports, etc.
> 
> sorry, I still don't understand it.  our applications already
> use different storage mechanisms, databases, signatures,
> lazy importing, version handling, etc, etc.  now, if *we*
> have managed to build all that on top of an old version
> of imputil.py, how come it's not sufficient for the rest
> of you?

I agree. The imputil mechanism has been proven in combat to work for many
scenarios. I have not (yet) heard of a case where the model has proven
insufficient.

> > A chain of simple minded importers won't work together
> > too well
> 
> why?  it sure works for us...

Exactly. "Why?" Please provide an example.

>...
> > and downgrade performance considerably due to the
> > many recursive function calls
> 
> now that's what I call premature optimization.  and this
> scares the hell out of me: if the rest of the python-dev
> crowd don't seriously believe that Python is (or can be
> made) fast enough to implement things like this, why
> the heck are you using Python at all?  am I the only
> one here who doesn't believe in ousterhout's talk about
> "the great system vs. scripting language divide"?

Don't worry Fredrik... I'm with you on this one. I do not believe there is
a problem with the speed. Nobody has yet profiled imputil to find out
where/how the time is being spent. Nobody has tried to speed it up.
Therefore, any claims about its performance are simply FUD.

I claim that its interface is correct, and you (Fredrik) stated it well:
"given a name, please give me a module if you can (otherwise None)."

Underneath that semantic, there are a lot of things that can be done to
alter the performance and organization. Claims about speed are entirely
premature.
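
A minimal sketch of that semantic, to make the discussion concrete.
It is not the imputil code: DictImporter, get_module() and
import_from_chain() are hypothetical names, and the "storage" is just
an in-memory dict of source strings.

```python
import types

class DictImporter:
    """Hypothetical importer serving modules from a dict of source text."""

    def __init__(self, sources):
        self.sources = sources

    def get_module(self, name):
        source = self.sources.get(name)
        if source is None:
            return None              # "not mine": let the next importer try
        module = types.ModuleType(name)
        exec(source, module.__dict__)
        return module

def import_from_chain(importers, name):
    """Ask each importer in order; the first non-None answer wins."""
    for importer in importers:
        module = importer.get_module(name)
        if module is not None:
            return module
    raise ImportError("no importer could supply %r" % name)

chain = [DictImporter({"spam": "x = 1"}), DictImporter({"eggs": "y = 2"})]
spam = import_from_chain(chain, "spam")
eggs = import_from_chain(chain, "eggs")
print(spam.x, eggs.y)
```

Everything about performance and organization lives behind
get_module(); the caller only sees a module or None.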

Yes, I'm biased. But, in truth, I haven't seen a better mechanism yet.
I've tossed out a few ideas on how imputil could be improved (which are
solely based on guess, rather than empirical evidence of profiling
output). When those changes are completed and there is still an issue,
then I'll admit defeat and wait for somebody else to provide a new design.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From marangoz at python.inrialpes.fr  Sat Dec  4 12:15:53 1999
From: marangoz at python.inrialpes.fr (Vladimir Marangozov)
Date: Sat, 4 Dec 1999 12:15:53 +0100 (CET)
Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT]
In-Reply-To: <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com> from "Fredrik Lundh" at Dec 04, 1999 10:54:42 AM
Message-ID: <199912041115.MAA00539@python.inrialpes.fr>

Fredrik Lundh wrote:
> 
[snip]
> 
> <rant size="major">
> 
> frankly, this "design by committee"...
[snip]
> ...  see you!
> 
> </rant>
> 

C'mon /F, it's a battle of ideas and that's the way it works before
filtering the good ones from the bad ones, then focusing on the
appropriate implementation.

I'm in sync with the discussion, although I haven't posted my partial
notes on it due to lack of time. But let me say that overall, this
discussion is a good thing and the more opinions we get, the better.

BTW, you just _can't_ leave like this and start playing solitaire at
the bar, first, because we need beer too and it's unlikely that you'll
find a bar we don't know already, and second, because it was you who
revived this discussion with 1 word, repeated 3 times:

> Subject: Re: [Python-Dev] Python 1.6 status
> Date: Wed, 17 Nov 1999 12:46:01 +0100
> 
> Guido van Rossum <guido at CNRI.Reston.VA.US> wrote:
> > - suggestions for new issues that maybe ought to be settled in 1.6
> 
> three things: imputil, imputil, imputil
> 
> </F>

Thus, with no visible argumentation (so don't shoot at others when they
argue instead of you), and with this one word, you pushed Guido to the
extreme of suggesting a complete redesign of the import machinery from
scratch, based on a "Grand Architecture" :-). Right? -- Right!

This is a fact, and a fair amount of the credit goes entirely to you!

Since then, however, I haven't really seen your arguments, and I believe
that nobody here got exactly your point. I, for one, may well argue
against imputil as being just another brick on top of the grand mess.
But because I haven't made the time to write up my notes properly, I don't
dare to express a partial opinion, nor blame those who argue well or
badly in the meantime, while I'm silent.

So, why are you showing us your back when you have clearly something
to say, but like me, you haven't made the time to say it?  Please don't
waste my time with emotional rants ;-). Everybody here tries to contribute
according to their knowledge, experience and availability.

Later,
-- 
       Vladimir MARANGOZOV          | Vladimir.Marangozov at inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252


From mal at lemburg.com  Sat Dec  4 11:45:52 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 04 Dec 1999 11:45:52 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>             <38479573.B2CFDD2B@lemburg.com>  <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com>
Message-ID: <3848F0E0.B8132AD2@lemburg.com>

Fredrik Lundh wrote:
> 
> M.-A. Lemburg <mal at lemburg.com> wrote:
> > >     given a name, check if you have a
> > >     module with that name, and install
> > >     it.  if you cannot, fail (in which case
> > >     python asks the next importer along
> > >     the path).
> > >
> > > why do you have to complicate things beyond that?
> > > why not just let Python provide a few base classes
> > > and mixins for people who want to create custom
> > > importers, and be done with it?
> >
> > Because importing in Python has become *much* more
> > complicated over time. There are requests for new
> > features which touch subjects such as storage mechanisms,
> > lookups, signatures (for trusted code), lazy imports, etc.
> 
> sorry, I still don't understand it.  our applications already
> use different storage mechanisms, databases, signatures,
> lazy importing, version handling, etc, etc.  now, if *we*
> have managed to build all that on top of an old version
> of imputil.py, how come it's not sufficient for the rest
> of you?

I've tried to get (an older) imputil.py version up and running
too. It did work, but only after some considerable tweaking
and even with integrated cache mechanisms did not reach
the performance of the builtin importer (which doesn't
use the kinds of caching strategies I had built into
imputil.py). Getting the whole setup to work wasn't easy
at all, because of the way imputil importers delegate work
and things get even more confusing when it starts to "take
over" certain parts of packages by installing themselves
as importers for a particular package.
 
> > A chain of simple minded importers won't work together
> > too well
> 
> why?  it sure works for us...

An example: 

A path importer knows how to scan directories and how to use
a path to tell the correct order. It can maybe also import
.py/.pyc/.pyo files. Now what happens if it finds a shared
lib as module... the usual imputil way would be to delegate
the request to some other importer which can handle shared
libs... but wait: how does the shared lib importer know
where to look ? It will have to rescan the directories,
etc...
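
A small sketch of the rescanning I mean.  It is deliberately
simplified and is not the imputil code: SuffixImporter and scan() are
hypothetical names, and each importer only knows one file suffix.

```python
import os
import tempfile

scan_log = []   # one entry per directory listing performed

def scan(directory):
    """List a directory and record that a scan happened."""
    scan_log.append(directory)
    return os.listdir(directory)

class SuffixImporter:
    """Hypothetical agent-style importer that handles one file suffix."""

    def __init__(self, path, suffix):
        self.path = path
        self.suffix = suffix

    def find(self, name):
        for directory in self.path:
            if name + self.suffix in scan(directory):
                return os.path.join(directory, name + self.suffix)
        return None

directory = tempfile.mkdtemp()
open(os.path.join(directory, "fast.so"), "wb").close()

# Two independent importers over the same one-directory path.
chain = [SuffixImporter([directory], ".py"), SuffixImporter([directory], ".so")]
found = None
for importer in chain:
    found = importer.find("fast")
    if found is not None:
        break
print(found, len(scan_log))
```

With a single directory on the path, finding one shared lib already
costs two listings, one per importer in the chain.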
 
> > duplicate work
> 
> avoiding duplicate work is what object oriented design
> is all about.  and last time I checked, Python had excellent
> support for that.

See my example above.

The agent approach used by imputil does not support
OO design too well: even though you can avoid duplicate
programming work on the importers by using a few
base classes which implement dir scans, shared lib
imports, etc. the imputil design does not provide
means to avoid duplicate actions taken by the importers.

> > and downgrade performance considerably due to the
> > many recursive function calls
> 
> now that's what I call premature optimization.  and this
> scares the hell out of me: if the rest of the python-dev
> crowd don't seriously believe that Python is (or can be
> made) fast enough to implement things like this, why
> the heck are you using Python at all?  am I the only
> one here who doesn't believe in ousterhout's talk about
> "the great system vs. scripting language divide"?

Looks like you are in ranting mode here ;-) Seriously,
I've checked my imputil.py version (with caches enabled)
against the builtin importer and noticed a performance
downgrade by factor >2. This was enough to convince me
of looking for other techniques to handle the problems
I had at the time... you know, relative imports and things.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    27 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal at lemburg.com  Sat Dec  4 12:04:15 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 04 Dec 1999 12:04:15 +0100
Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT]
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com> <3847D766.1E5FFAF3@interet.com> <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com>
Message-ID: <3848F52F.5F5B748F@lemburg.com>

Fredrik Lundh wrote:
> 
> <rant size="major">
> 
> frankly, this "design by committee" (or is it "design by
> people who've never even been close to implementing
> something because they thought it was too hard, and
> thus think they're qualified to argue against those of
> us who didn't even realize that it was a hard problem"?)

Huh ? Two points:

1. How can you be sure that people haven't tried
   implementing their ideas and for various reasons
   have come to some conclusion about those ideas ?

2. Would you seriously disqualify people from joining a
   discussion by the simple argument that they
   have not implemented anything yet ?

Just take the Unicode discussion as example: it was
very lively and resulted in a decent proposal which
is now subject to further investigation by the
implementors ;-) Many people have joined in even though
they did not and/or will not implement anything. Still,
their arguments were very useful to show up weaknesses
in the proposal.

Now, let's rather have a beer in the pub around the corner
than go on ranting about :-).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    27 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal at lemburg.com  Sat Dec  4 12:53:33 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 04 Dec 1999 12:53:33 +0100
Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order)
References: <Pine.LNX.4.10.9912040232240.18529-100000@nebula.lyra.org>
Message-ID: <384900BD.D16E72BC@lemburg.com>

Greg Stein wrote:
> > > [me:]
> > > A chain of simple minded importers won't work together
> > > too well
> >
> > why?  it sure works for us...
> 
> Exactly. "Why?" Please provide an example.

See my reply to Fredrik.
 
> >...
> > > and downgrade performance considerably due to the
> > > many recursive function calls
> >
> > now that's what I call premature optimization.  and this
> > scares the hell out of me: if the rest of the python-dev
> > crowd don't seriously believe that Python is (or can be
> > made) fast enough to implement things like this, why
> > the heck are you using Python at all?  am I the only
> > one here who doesn't believe in ousterhout's talk about
> > "the great system vs. scripting language divide"?
> 
> Don't worry Fredrik... I'm with you on this one. I do not believe there is
> a problem with the speed. Nobody has yet profiled imputil to find out
> where/how the time is being spent. Nobody has tried to speed it up.

Sorry, Greg, but that is simply not true. I've spent a few
days on trying to get more performance out of it and have
succeeded, but in the end it wasn't enough to convince me
of the approach.

> Therefore, any claims about its performance are simply FUD.

BTW, did anybody mention that an import manager wouldn't
be able to provide an API which is useable for imputil
style importers ? I'm not arguing against the possibility
to use imputil style importers, just against making it the
sole method of adding wisdom to Python imports.

The imputil importers could well benefit from a manager
providing logic to do basic things like importing
shared libs, checking signatures, downloading modules
from the web, etc.
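
A rough sketch of the kind of manager I have in mind, assuming the
simplest possible design: the manager owns the directory scan and
dispatches to registered handlers by file suffix.  ImportManager,
register() and find() are hypothetical names, and the handlers here
only report what they would load.

```python
import os
import tempfile

class ImportManager:
    """Hypothetical manager: scans each directory once, then delegates."""

    def __init__(self, path):
        self.path = path
        self.handlers = {}      # suffix -> callable taking a file name
        self.scans = 0

    def register(self, suffix, handler):
        self.handlers[suffix] = handler

    def find(self, name):
        for directory in self.path:
            self.scans += 1
            entries = set(os.listdir(directory))
            # Handlers are tried in registration order on the same listing.
            for suffix, handler in self.handlers.items():
                if name + suffix in entries:
                    return handler(os.path.join(directory, name + suffix))
        return None

directory = tempfile.mkdtemp()
open(os.path.join(directory, "fast.so"), "wb").close()

manager = ImportManager([directory])
manager.register(".py", lambda filename: ("source", filename))
manager.register(".so", lambda filename: ("shared lib", filename))
result = manager.find("fast")
print(result, manager.scans)
```

The add-ons never touch the file system lookup themselves, so the
directory is listed once no matter how many handlers are registered.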

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    27 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From gstein at lyra.org  Sat Dec  4 13:15:13 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 04:15:13 -0800 (PST)
Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order)
In-Reply-To: <384900BD.D16E72BC@lemburg.com>
Message-ID: <Pine.LNX.4.10.9912040402120.18529-100000@nebula.lyra.org>

On Sat, 4 Dec 1999, M.-A. Lemburg wrote:
>...
> > Don't worry Fredrik... I'm with you on this one. I do not believe there is
> > a problem with the speed. Nobody has yet profiled imputil to find out
> > where/how the time is being spent. Nobody has tried to speed it up.
> 
> Sorry, Greg, but that is simply not true. I've spent a few
> days on trying to get more performance out of it and have
> succeeded, but in the end it wasn't enough to convince me
> of the approach.

You sent me your changes... I don't believe that you were aggressive
enough. As I've mentioned before, I think it is quite possible to retain
the general Importer style and get_code() interface, but to shift some
functionality out (to be computed once) to a higher-level mechanism. The
patches that you sent me did not do this, so I'm not surprised that you
hit a wall.

Ack. See? Now I'm getting into discussions about performance and
implementation without truly knowing where the timing is spent. Eyeballing
it, I have an idea, but it would be best to see a profile output. My
mantra is always "90% of the time you're wrong about where 90% of the time
is being spent."
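
A minimal sketch of how such a profile could be collected.  It assumes
a later Python with the standard cProfile, pstats and importlib
modules, none of which were available at the time of this thread;
profile_import() is a hypothetical helper, and colorsys is just a small
stdlib module picked as the target.

```python
import cProfile
import importlib
import io
import pstats

def profile_import(module_name):
    """Import module_name under the profiler; return (module, report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    module = importlib.import_module(module_name)
    profiler.disable()

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(10)   # ten costliest entries
    return module, out.getvalue()

module, report = profile_import("colorsys")
print(report)
```

If the module is already in sys.modules the report mostly shows the
cache lookup, so a fresh interpreter gives the more useful numbers.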

I am unconcerned about performance, but will work on it so that I don't
need to continue this conversation. That burden is on me.

> > Therefore, any claims about its performance are simply FUD.
> 
> BTW, did anybody mention that an import manager wouldn't
> be able to provide an API which is useable for imputil
> style importers ? I'm not arguing against the possibility
> to use imputil style importers, just against making it the
> sole method of adding wisdom to Python imports.

Since the core will delegate out to Python (note: current working theory),
then it certainly is not the "sole method" (since you can just replace the
Python code). But there must be a default mechanism.

The ihooks stuff was too complicated. imputil seems to be much easier. I'd
love to see a third mechanism.... so I can steal ideas :-)

> The imputil importers could well benefit from a manager
> providing logic to do basic things like importing
> shared libs, checking signatures, downloading modules
> from the web, etc.

For shared libs, yes. For the others: geez... I don't want to see that in
the core infrastructure. Shift that out to specialized Importers. The
infrastructure ought to be teeny and agnostic about how to map a module
name to a module.


Side note to python-dev people: I apologize... I realize that I'm
beginning to get a bit defensive here. I'm going to be at XML '99 until
Friday, so that should give me a breather. When I get back, I'll skip the
talk and do some code.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein at lyra.org  Sat Dec  4 13:32:04 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 04:32:04 -0800 (PST)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <3848F0E0.B8132AD2@lemburg.com>
Message-ID: <Pine.LNX.4.10.9912040416220.18529-100000@nebula.lyra.org>

On Sat, 4 Dec 1999, M.-A. Lemburg wrote:
> Fredrik Lundh wrote:
>...
> > sorry, I still don't understand it.  our applications already
> > use different storage mechanisms, databases, signatures,
> > lazy importing, version handling, etc, etc.  now, if *we*
> > have managed to build all that on top of an old version
> > of imputil.py, how come it's not sufficient for the rest
> > of you?
> 
> I've tried to get (an older) imputil.py version up and running
> too. It did work, but only after some considerable tweaking
> and even with integrated cache mechanisms did not reach
> the performance of the builtin importer (which doesn't
> use the kinds of caching strategies I had built into
> imputil.py).

1) yes, it was an older version and did not have the PathImporter class.
   As a by-product, the DirectoryImporters that it *did* have were much
   slower. It still did not support builtins, frozen modules, or dynamic
   loads. All of that is present now, so it works "out of the box" much
   better.

2) Performance: as I wrote in the other email, I don't believe that is an
   argument against the design. The imputil approach *will* be slower than
   the current Python mechanism, but there is some more coding to do to
   truly see how much. The side benefits (e.g. ZipImporter and caching)
   may outweigh the result. Time will tell.

> Getting the whole setup to work wasn't easy
> at all, because of the way imputil importers delegate work
> and things get even more confusing when it starts to "take
> over" certain parts of packages by installing themselves
> as importers for a particular package.

I don't understand this. If it is relevant, then please expand. Thx.

> > > A chain of simple minded importers won't work together
> > > too well
> > 
> > why?  it sure works for us...
> 
> An example: 
> 
> A path importer knows how to scan directories and how to use
> a path to tell the correct order. It can maybe also import
> .py/.pyc/.pyo files. Now what happens if it finds a shared
> lib as module... the usual imputil way would be to delegate
> the request to some other importer which can handle shared
> libs... but wait: how does the shared lib importer know
> where to look ? It will have to rescan the directories,
> etc...

No, the "usual imputil way" is that the PathImporter understands searching
a path and loading stuff from that path. An Importer is a combination of
locating and loading (since they are, typically, tightly bound). The next
rev will allow user-plugging of support for new file types.
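
A rough sketch of that idea, assuming a single object that both walks
its directories and loads whatever it finds, with file types plugged
in as (suffix, loader) pairs. The class name PathImporterSketch, the
add_suffix method and the one-argument get_code are hypothetical
simplifications, not imputil's actual interface:

```python
import os
import tempfile


class PathImporterSketch:
    """Hypothetical importer that locates and loads in one place."""

    def __init__(self, directories):
        self.directories = list(directories)
        self.handlers = []  # ordered list of (suffix, loader) pairs

    def add_suffix(self, suffix, loader):
        """Register a loader for a new file type."""
        self.handlers.append((suffix, loader))

    def get_code(self, name):
        """Return a code object for name, or None if nothing matches."""
        for directory in self.directories:
            for suffix, loader in self.handlers:
                filename = os.path.join(directory, name + suffix)
                if os.path.isfile(filename):
                    # The search already knows the full filename, so the
                    # loader never has to rescan any directory.
                    return loader(filename)
        return None


def load_source(filename):
    """Loader for plain source files."""
    with open(filename, encoding="utf-8") as f:
        return compile(f.read(), filename, "exec")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        with open(os.path.join(directory, "foo.py"), "w") as f:
            f.write("x = 41 + 1\n")
        importer = PathImporterSketch([directory])
        importer.add_suffix(".py", load_source)
        namespace = {}
        exec(importer.get_code("foo"), namespace)
        print(namespace["x"])
```

Because the handler receives the filename that the search produced,
a shared-lib handler registered the same way would not need to look
at the directories a second time.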

> > > duplicate work
> > 
> > avoiding duplicate work is what object oriented design
> > is all about.  and last time I checked, Python had excellent
> > support for that.
> 
> See my example above.
> 
> The agent approach used by imputil does not support
> OO design too well: even though you can avoid duplicate
> programming work on the importers by using a few
> base classes which implement dir scans, shared lib
> imports, etc. the imputil design does not provide
> means to avoid duplicate actions taken by the importers.

There is always a balance to be struck between independence and coupling.
I chose to reduce coupling and increase independence. If you shift a bunch
of stuff out of the Importers, then you will increase the coupling between
the imputil framework and the Importers. That coupling will then close off
future possibilities.

Within the framework itself (e.g. between _import_hook and get_code),
there is a lot of opportunity for change. Since that is behind the covers,
it is no big deal to shift functionality around. I plan to do so.

>...
> Looks like you are in ranting mode here ;-) Seriously,
> I've checked my imputil.py version (with caches enabled)
> against the builtin importer and noticed a performance
> downgrade by factor >2. This was enough to convince me
> of looking for other techniques to handle the problems
> I had at the time... you know, relative imports and things.

I have run a long series of tests. Without doing any performance work on
imputil, the ratio is 9 to 13. The 13 may have bumped up to about 15 or 16
when I added some dynamic loading code (I forget). Regardless, it is
definitely less than a 2X increase. And that is with zero optimization.

*shrug*

I'm done. I'll do some code in a couple weeks.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein at lyra.org  Sat Dec  4 14:12:32 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 05:12:32 -0800 (PST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912031439.JAA16524@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912040456180.18529-100000@nebula.lyra.org>

On Fri, 3 Dec 1999, Guido van Rossum wrote:
>...
> Great response.  I think we know where we each stand.  Please go ahead
> with a new design.  (That's trust, not carte blanche.)

Accepted gratefully. Thx.

> Just one thought: the more I think about it, the less I like
> sys.importers: functionality which is implemented through
> sys.importers must necessarily be placed either in front of all of
> sys.path or after it.  While this is helpful for "canned" apps that
> want *everything* to be imported from a fixed archive, I think that
> for regular Python installations sys.path should remain the point of
> attack.  In particular, installing a new package (e.g. PIL) should
> affect sys.path, regardless of the way of delivery of the modules
> (shared libs, .py files, .pyc files, or a zip archive).

Okay. I'll design with respect to this model.

To be explicit/clear and to be sure I'm hearing you right: sys.path may
contain Importer instances. Given the name FOO, the system will step
through sys.path looking for the first occurrence of FOO (looking in a
directory or delegating). FOO may be found with any number of
(configurable) file extensions, which are ordered (e.g. ".so" before
".py" before ".isl").

> I'm not too worried about code that inspects sys.path and expects
> certain invariants; that code is most likely interfering with the
> import mechanism so should be revisited anyway.

The Benevolent Dictator has spoken. So be it.

:-)

> On the lone .pyc issue: I'd like to see this disappear when using the
> filesystem, I see no use for it there if we support .pyc files in zip
> archives.

No problem. This actually creates a simplification in the system, as I'm
seeing it now. I'm also seeing opportunities for a code reorg which may
work towards MAL's issues with performance.

I hope to have something in two or three weeks. I also hope people can be
patient :-), but I certainly wouldn't mind seeing some alternative code!

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gmcm at hypernet.com  Sat Dec  4 15:59:44 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Sat, 4 Dec 1999 09:59:44 -0500
Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order)
In-Reply-To: <384900BD.D16E72BC@lemburg.com>
Message-ID: <1267803104-11215142@hypernet.com>

M.-A. Lemburg wrote:
> Greg Stein wrote:

> > Don't worry Fredrik... I'm with you on this one. I do not
> > believe there is a problem with the speed. Nobody has yet
> > profiled imputil to find out where/how the time is being spent.
> > Nobody has tried to speed it up.
> 
> Sorry, Greg, but that is simply not true. I've spent a few
> days on trying to get more performance out of it and have
> succeeded, but in the end it wasn't enough to convince me
> of the approach.
 
Remember those comparisons of Perl and Python, to which 
you added cgipython? I've added to the list a version that uses 
an old version of imputil (probably the one you optimized) and 
a compressed std lib. Note that my Linux python (1.5.2) is 
built in the RedHat style - even struct and strop are .so's; so 
that accounts for the majority of the open calls. This is a full 
Python (runs code.py if you don't pass it a script name). For 
lack of a better name, I've called it "pykit".

 First, the size of log files (in lines), i.e. number of system 
calls:
 
                Solaris     Linux    IRIX[1]
   Perl              88        85      70
   Python           425       316     257
   cgipython                  182 
   pykit                      136

 Next, the number of "open" calls:

                Solaris     Linux    IRIX
   Perl             16         10       9
   Python          107         71      48
   cgipython                   33 
   pykit                        9

 And the number of unsuccessful "open" calls:
 
                Solaris     Linux    IRIX
   Perl              6          1       3
   Python           77         49      32
   cgipython                   28
   pykit                        2
 
 Number of "mmap" calls:
 
                Solaris     Linux    IRIX
   Perl              25        25       1
   Python            36        24       1
   cgipython                   13
   pykit                       21

This test would show off more if it went beyond startup. An 
import of a standard lib module in my stock Python involves 2 
failed stats and 6 failed opens, then 2 successful opens and 2 
fstats before the module is loaded. None of these occur in 
pykit.

The downside (asking my Importer for a .so or a module not in 
the importer) takes no system calls, and involves a dozen or 
so lines of Python and a check of a dictionary.
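
A minimal sketch of why a miss can be that cheap, assuming the archive
is a zip file whose table of contents is read once into a dictionary.
This uses the standard zipfile module and an in-memory archive purely
for illustration; it is not the archive format pykit uses, and the
names ArchiveImporterSketch and build_archive are hypothetical:

```python
import io
import zipfile


class ArchiveImporterSketch:
    """Hypothetical importer backed by one archive and one dictionary."""

    def __init__(self, archive_bytes):
        self._zip = zipfile.ZipFile(io.BytesIO(archive_bytes))
        # Table of contents, built once: module name -> member name.
        self._toc = {}
        for member in self._zip.namelist():
            if member.endswith(".py"):
                self._toc[member[:-3]] = member

    def get_code(self, name):
        member = self._toc.get(name)
        if member is None:
            # A miss costs one dictionary lookup and nothing else.
            return None
        source = self._zip.read(member).decode("utf-8")
        return compile(source, member, "exec")


def build_archive(sources):
    """Pack a {module name: source text} mapping into zip bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, source in sources.items():
            archive.writestr(name + ".py", source)
    return buffer.getvalue()


if __name__ == "__main__":
    importer = ArchiveImporterSketch(build_archive({"foo": "value = 7\n"}))
    namespace = {}
    exec(importer.get_code("foo"), namespace)
    print(namespace["value"], importer.get_code("bar"))
```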


- Gordon


From tismer at appliedbiometrics.com  Sat Dec  4 16:29:03 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sat, 04 Dec 1999 16:29:03 +0100
Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order)
References: <Pine.LNX.4.10.9912040402120.18529-100000@nebula.lyra.org>
Message-ID: <3849333F.1DF2A201@appliedbiometrics.com>


Greg Stein wrote:
...

> My mantra is always "90% of the time you're wrong about where 90% 
> of the time is being spent."

What a great sentence! We all know it, but many of us
(especially me) forget about it during 90% of our coding time.
Much better to spend this on design (as you did).

thanks - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From jim at interet.com  Sat Dec  4 18:27:44 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Sat, 04 Dec 1999 12:27:44 -0500
Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT]
References: <Pine.LNX.4.10.9911190404580.10639-100000@nebula.lyra.org> <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com> <3847D766.1E5FFAF3@interet.com> <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com>
Message-ID: <38494F10.C644BA7@interet.com>

Fredrik Lundh wrote:
> 
> James C. Ahlstrom <jim at interet.com> wrote:
> > IMHO putting shared libs in an archive is a bad idea because the OS

Dear Fredrik,

I thought the point of Python-Dev was to propose designs and get
feedback, right?  Well, I got feedback :-).

OK, I agree to alter my archive format so it provides the
ability to store shared libs and not just *.pyd.  I will
add the string length and if needed a flag indicating the
name is a shared lib.

Now the details:

> have you tried it?  if not, why do you think you should
> be allowed to forbid others from doing it?

Yes I have tried it, and I am currently on my fourth version
of an archive format which is based on formats by Greg Stein
and Gordon McMillan.  I hope it meets with the favor of the
Grand Inquisition, and becomes the standard format.  But
maybe it won't.  Oh well.

> bloody installers.  and here you are advocating that
> we all should be forced to use installers, when python
> makes it trivial to write self-installing apps. double-argh!

I am not forcing anyone to do anything, only proposing that
shared libs are best handled directly by imputil and not
the class within imputil which handles archive files.  It
is just a geeky design issue, nothing more.

JimA


From jim at interet.com  Sat Dec  4 19:31:48 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Sat, 04 Dec 1999 13:31:48 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>             <38479573.B2CFDD2B@lemburg.com>  <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> <3848F0E0.B8132AD2@lemburg.com>
Message-ID: <38495E14.9C2FB107@interet.com>

"M.-A. Lemburg" wrote:

> An example:
> 
> A path importer knows how to scan directories and how to use
> a path to tell the correct order. It can maybe also import
> .py/.pyc/.pyo files. Now what happens if it finds a shared
> lib as module... the usual imputil way would be to delegate
> the request to some other importer which can handle shared
> libs... but wait: how does the shared lib importer know
> where to look ? It will have to rescan the directories,
> etc...

The above refers to an earlier but still very recent version
of imputil.  On that basis it is perfectly accurate.  Here is
another example from my own experience almost identical to
the above:

One possible archive file format holds its list of archived
*.pyc file names as keys in a dictionary.  This is simple and
efficient, but fails to correctly address the problem of shared
libs (aka DLL's in Windows) with names identical to names of
*.pyc files in the archive.  For example, suppose foo.pyc is in the
archive, and foo.dll is in a directory.  Suppose sys.path is to be
used to decide whether to load foo.pyc or foo.dll.  Then an
"archive importer" will fail to do this.  Specifically you can't
see if foo.pyc is in the archive and then check sys.path, nor can
you do the reverse.  You must call the "archive importer" repeatedly
for each element of sys.path and search the directory at the same time.

JimA


From jim at interet.com  Sat Dec  4 20:51:47 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Sat, 04 Dec 1999 14:51:47 -0500
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9912040456180.18529-100000@nebula.lyra.org>
Message-ID: <384970D3.26A9ECDB@interet.com>

Greg Stein wrote:
> 
> On Fri, 3 Dec 1999, Guido van Rossum wrote:

> > attack.  In particular, installing a new package (e.g. PIL) should
> > affect sys.path, regardless of the way of delivery of the modules
> > (shared libs, .py files, .pyc files, or a zip archive).

> To be explicit/clear and to be sure I'm hearing you right: sys.path may
> contain Importer instances. Given the name FOO, the system will step
> through sys.path looking for the first occurrence of FOO (looking in a
> directory or delegating). FOO may be found with any number of
> (configurable) file extensions, which are ordered (e.g. ".so" before
> ".py" before ".isl").

This is basically a gripe about this design spec.  So if the answer
turns out to be "we need this functionality so shut up" then just
say that and don't flame me.

This spec is painful.  Suppose sys.path has 10 elements, and there
are six file extensions.  Then the simple algorithm is slow:
  for path in sys.path:          # Yikes, may not be a string!
    for ext in file_extensions:
      name = "%s.%s" % (module_name, ext)
      full_path = os.path.join(path, name)
      if os.path.isfile(full_path):
        pass  # Process file here

And sys.path can contain class instances
which only makes things slower.  You could do a readdir() and cache
the results, but maybe that would be slower.  A better
algorithm might be faster, but a lot more complicated.
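
A sketch of the readdir()-and-cache variant, assuming one os.listdir()
call per directory whose result is kept in a dictionary. The class
name DirectoryCache is hypothetical, and whether this is actually
faster than repeated isfile() checks is exactly the open question:

```python
import os
import tempfile


class DirectoryCache:
    """Hypothetical cache of directory listings for module lookups."""

    def __init__(self):
        self._listings = {}  # directory -> frozenset of file names

    def listing(self, directory):
        if directory not in self._listings:
            try:
                names = frozenset(os.listdir(directory))
            except OSError:
                # Missing or unreadable directories simply match nothing.
                names = frozenset()
            self._listings[directory] = names
        return self._listings[directory]

    def find(self, name, directories, extensions):
        """Return the first matching path, honouring both orderings."""
        for directory in directories:
            names = self.listing(directory)
            for ext in extensions:
                if name + ext in names:
                    return os.path.join(directory, name + ext)
        return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        open(os.path.join(directory, "foo.py"), "w").close()
        cache = DirectoryCache()
        print(cache.find("foo", [directory], [".so", ".py"]))
```

The cost is staleness: a file created after a directory has been
listed is not seen until the cache is dropped. The membership test is
also case-sensitive regardless of the underlying filesystem.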

In the context of archive files, it is also painful.  It prevents
you from saving a single dictionary of module names.  Instead you
must have len(sys.path) dictionaries.  You could try to
save in the archive information about whether (say) a foo.dll was
present in the file system, but the list of extensions is extensible.

The above problem only exists to support equally-named modules; that
is, to support a run-time choice of whether to load foo.pyc, foo.dll,
foo.isl, etc.  I claim (without having written it) that the fastest
algorithm to solve the unique-name case is much faster than the fastest
algorithm to solve the choose-among-equal-names case.

Do we really need to support the equal-name case [Jim runs for cover...]?
If so, how about inventing a new way to support it.  Maybe if equal
names exist, these must be pre-loaded from a known location?

JimA


From gstein at lyra.org  Sat Dec  4 22:59:00 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 13:59:00 -0800 (PST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <384970D3.26A9ECDB@interet.com>
Message-ID: <Pine.LNX.4.10.9912041350200.18529-100000@nebula.lyra.org>

On Sat, 4 Dec 1999, James C. Ahlstrom wrote:
> Greg Stein wrote:
>...
> > To be explicit/clear and to be sure I'm hearing you right: sys.path may
> > contain Importer instances. Given the name FOO, the system will step
> > through sys.path looking for the first occurrence of FOO (looking in a
> > directory or delegating). FOO may be found with any number of
> > (configurable) file extensions, which are ordered (e.g. ".so" before
> > ".py" before ".isl").
> 
> This is basically a gripe about this design spec.  So if the answer
> turns out to be "we need this functionality so shut up" then just
> say that and don't flame me.
> 
> This spec is painful.  Suppose sys.path has 10 elements, and there
> are six file extensions.  Then the simple algorithm is slow:
>   for path in sys.path:          # Yikes, may not be a string!
>     for ext in file_extensions:
>       name = "%s.%s" % (module_name, ext)
>       full_path = os.path.join(path, name)
>       if os.path.isfile(full_path):
>         pass  # Process file here

This is the algorithm that Python uses today, and my standard Importers
follow.

> And sys.path can contain class instances
> which only makes things slower.

IMO, we don't know this, or whether it is significant.

> You could do a readdir() and cache
> the results, but maybe that would be slower.  A better
> algorithm might be faster, but a lot more complicated.

Who knows. BUT: the import process is now in Python -- it makes it *much*
easier to run these experiments. We could not really do this when the
import process is "hard-coded" in C code.

> In the context of archive files, it is also painful.  It prevents
> you from saving a single dictionary of module names.  Instead you
> must have len(sys.path) dictionaries.  You could try to
> save in the archive information about whether (say) a foo.dll was
> present in the file system, but the list of extensions is extensible.

I am not following this. What/where is the "single dictionary of module
names" ? Are you referring to a cache? Or is this about building an
archive?

An archive would look just like we have now: map a name to a module. It
would not need multiple dictionaries.

> The above problem only exists to support equally-named modules; that
> is, to support a run-time choice of whether to load foo.pyc, foo.dll,
> foo.isl, etc.  I claim (without having written it) that the fastest
> algorithm to solve the unique-name case is much faster than the fastest
> algorithm to solve the choose-among-equal-names case.
> 
> Do we really need to support the equal-name case [Jim runs for cover...]?
> If so, how about inventing a new way to support it.  Maybe if equal
> names exist, these must be pre-loaded from a known location?

I don't understand what the problem is. I don't see one. We are still
mapping a name to a module. sys.path defines a precedence.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein at lyra.org  Sun Dec  5 02:17:57 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 17:17:57 -0800 (PST)
Subject: [Python-Dev] pyc archives (was: .DLL vs .PYD search order)
In-Reply-To: <38495E14.9C2FB107@interet.com>
Message-ID: <Pine.LNX.4.10.9912041713580.18529-100000@nebula.lyra.org>

On Sat, 4 Dec 1999, James C. Ahlstrom wrote:
>...
> One possible archive file format holds its list of archived
> *.pyc file names as keys in a dictionary.  This is simple and
> efficient, but fails to correctly address the problem of shared
> libs (aka DLL's in Windows) with names identical to names of
> *.pyc files in the archive.  For example, suppose foo.pyc is in the
> archive, and foo.dll is in a directory.  Suppose sys.path is to be
> used to decide whether to load foo.pyc or foo.dll.  Then an
> "archive importer" will fail to do this.  Specifically you can't
> see if foo.pyc is in the archive and then check sys.path, nor can
> you do the reverse.  You must call the "archive importer" repeatedly
> for each element of sys.path and search the directory at the same time.

What? The archive is independent of each .pyc's original position in
sys.path. There is no reason/need to carry that information into an
archive.

If the archive contains "foo", then you're done. If it doesn't, then move
on to the next element of sys.path (directory or Importer instance) and
look there.

Basically: if you deploy an archive, then all of its files will take
precedence over any file found later on sys.path. This is exactly what
sys.path is about: establishing precedence.
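
A minimal sketch of that precedence rule, assuming the search path is
a list mixing directory strings with importer objects. The names
DictImporter, find_module and the find() method are hypothetical, and
the extension ordering only echoes the example given earlier in the
thread:

```python
import os
import tempfile

# Assumed ordering of file extensions, tried within each directory.
EXTENSIONS = [".so", ".py", ".isl"]


class DictImporter:
    """Hypothetical importer object that can sit on the search path."""

    def __init__(self, modules):
        self.modules = dict(modules)

    def find(self, name):
        return self.modules.get(name)


def find_module(name, search_path, extensions=EXTENSIONS):
    """Return a description of the first match for name, or None."""
    for entry in search_path:
        if isinstance(entry, str):
            # Directory entry: try each extension in order.
            for ext in extensions:
                candidate = os.path.join(entry, name + ext)
                if os.path.isfile(candidate):
                    return ("file", candidate)
        else:
            # Importer entry: delegate, then move on if it has nothing.
            found = entry.find(name)
            if found is not None:
                return ("importer", found)
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        open(os.path.join(directory, "foo.py"), "w").close()
        archive = DictImporter({"foo": "archived foo"})
        print(find_module("foo", [archive, directory]))
        print(find_module("foo", [directory, archive]))
```

Swapping the two entries swaps the winner, and no interleaving between
the archive and the directories is involved.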

If I understand you correctly, then you're trying to say there is some
sort of interleaving that must occur. If so, then I don't understand why.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From fredrik at pythonware.com  Mon Dec  6 13:20:34 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 6 Dec 1999 13:20:34 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>             <38479573.B2CFDD2B@lemburg.com>  <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> <3848F0E0.B8132AD2@lemburg.com> <02eb01bf3f4d$c200d790$f29b12c2@secret.pythonware.com> <384B7E32.F7B81D82@lemburg.com>
Message-ID: <004401bf3fe4$4cab6ea0$f29b12c2@secret.pythonware.com>

> > you obviously attempted to use imputil to implement
> > non-standard import behaviour on top of the standard
> > storage system -- while we've used it to implement
> > standard import behaviour on top of non-standard
> > storage systems.
> 
> No, I tried to make the imputil approach work as replacement
> for the standard builtin importer.

I'm confused.  earlier, you said (or rather, I think you
said) that you looked at imputil to see if it could "handle
the problems you had at the time"...  and now you say
that you tried to use it as a drop-in replacement for the
"standard path importer".  I must be missing something
here...

> After I got that to work, I added some caching
> to avoid duplicated stats. The resulting importer was
> around twice as slow as the builtin one for the following
> imports:
> 
> # the default one Python does at startup, plus:
> from mx import HTMLTools,DateTime,ODBC
> 
> This is a pretty common setup for my scripts, so its
> preformance is relevant to me.

did you try stuffing all your PYC's into an archive file,
and running them from there?

</F>


From fredrik at pythonware.com  Sun Dec  5 19:22:57 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sun, 5 Dec 1999 19:22:57 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <Pine.LNX.4.10.9912021617520.18529-100000@nebula.lyra.org>             <38479573.B2CFDD2B@lemburg.com>  <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> <3848F0E0.B8132AD2@lemburg.com>
Message-ID: <02eb01bf3f4d$c200d790$f29b12c2@secret.pythonware.com>

> I've checked my imputil.py version (with caches enabled)
> against the builtin importer and noticed a performance
> downgrade by factor >2. This was enough to convince me
> of looking for other techniques to handle the problems
> I had at the time... you know, relative imports and things.

hmm.  I think I see the problem here...

you obviously attempted to use imputil to implement
non-standard import behaviour on top of the standard
storage system -- while we've used it to implement
standard import behaviour on top of non-standard
storage systems.

I don't know if imputil is good enough for the former,
and I don't think I care...  I've spent too many nights
debugging code that relied on clever, non-standard
hacks.

</F>

PS. on the performance side of things, did you know
that 're' can be up to ten times slower than 'regex'?
but people don't complain -- probably because it
allows them to do things they couldn't do before...


From jim at interet.com  Mon Dec  6 20:40:01 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 06 Dec 1999 14:40:01 -0500
Subject: [Python-Dev] Re: pyc archives (was: .DLL vs .PYD search order)
References: <Pine.LNX.4.10.9912041713580.18529-100000@nebula.lyra.org>
Message-ID: <384C1111.92984B5A@interet.com>

Greg Stein wrote:
> 
> On Sat, 4 Dec 1999, James C. Ahlstrom wrote:
> >...
> > One possible archive file format holds its list of archived
> > *.pyc file names as keys in a dictionary.  This is simple and
> > efficient, but fails to correctly address the problem of shared

> What? The archive is independent of each .pyc's original position in
> sys.path. There is no reason/need to carry that information into an
> archive.
> 
> If the archive contains "foo", then you're done. If it doesn't, then move
> on to the next element of sys.path (directory or Importer instance) and
> look there.
> 
> Basically: if you deploy an archive, then all of its files will take
> precedence over any file found later on sys.path. This is exactly what
> sys.path is about: establishing precedence.

Sorry, I am a little slow today.  My daughter got me up at 6 am to
work on her computer video editor.  No disk space, fragmentation,
2 gig limit on AVI files, ........

Are you saying this?  If foo is imported, the archive importer is
consulted first to see if it can provide foo.  If not, sys.path is
searched  for foo.pyc, foo.pyl etc., and if foo.pyl is found, then
its contents are added to the single archive importer dictionary.
The order of addition to the archive dictionary is determined by
sys.path, and duplicate names are not entered because they lie later
on sys.path.  But once a file is recognized as in an archive, it
effectively precedes all of sys.path.

Or this?  If foo is imported, sys.path is searched for
foo.pyc, foo.pyl, etc., and also all archive files found
at each element of sys.path are searched for foo.  If "bar"
is imported, it may be found in foo.pyl.  That is,
there is an instance of an archive importer for each element
of sys.path.

What if the user names an archive file not on sys.path?  What
order does it have?

JimA


From jim at interet.com  Mon Dec  6 19:34:41 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 06 Dec 1999 13:34:41 -0500
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9912041350200.18529-100000@nebula.lyra.org>
Message-ID: <384C01C1.8D1AFFFF@interet.com>

Greg Stein wrote:
> 
> On Sat, 4 Dec 1999, James C. Ahlstrom wrote:
> >         # Process file here
> 
> This is the algorithm that Python uses today, and my standard Importers
> follow.

Agreed.
 
> > And sys.path can contain class instances
> > which only makes things slower.
> 
> IMO, we don't know this, or whether it is significant.

Agreed.
 
> > You could do a readdir() and cache
> > the results, but maybe that would be slower.  A better
> > algorithm might be faster, but a lot more complicated.
> 
> Who knows. BUT: the import process is now in Python -- it makes it *much*
> easier to run these experiments. We could not really do this when the
> import process is "hard-coded" in C code.

Agreed.
 
> > In the context of archive files, it is also painful.  It prevents
> > you from saving a single dictionary of module names.  Instead you
> > must have len(sys.path) dictionaries.  You could try to
> > save in the archive information about whether (say) a foo.dll was
> > present in the file system, but the list of extensions is extensible.
> 
> I am not following this. What/where is the "single dictionary of module
> names" ? Are you referring to a cache? Or is this about building an
> archive?
> 
> An archive would look just like we have now: map a name to a module. It
> would not need multiple dictionaries.

The "single dictionary of names" is in the single archive importer
instance and has nothing to do with creating the archive.  It
is currently programmed this way.

Suppose the user specifies by name 12 archive files to be searched.
That is, the user hacks site.py to add archive names to the importer.
The "single dictionary" means that the archive importer takes the 12
dictionaries in the 12 files and merges them together into one dictionary
in order to speed up the search for a name.  The good news is you can
always just call the archive importer to get a module.  The bad news is
you can't do that for each entry on sys.path because there is no
necessary identity between archive files and sys.path.  The user
specified the archive files by name, and they may or may not be on
sys.path, and the user may or may not have specified them in the
same order as sys.path even if they are.

Suppose archive files must lie on sys.path and are processed in order.
Then to find them you must know their name.  But IMHO you want to
avoid doing a readdir() on each element of sys.path and looking for
files *.pyl.

Suppose archive file names in general are the known name "lib.pyl"
for the Python library, plus the names "package.pyl" where "package"
can be the name of a Python package as a single archive file.  Then
if the user tries to import foo, imputil will search along sys.path
looking for foo.pyc, foo.pyl, etc.  If it finds foo.pyl, the archive
importer will add it to its list of known archive files.  But it must
not add it to its single dictionary, because that would destroy the
information about its position along sys.path.  Instead, it must keep
a separate dictionary for each element of sys.path and search the
separate dictionaries under control of imputil.  That is, get_code()
needs a new argument for the element of sys.path being searched.
Alternatively, you could create a new importer instance for each
archive file found, but then you still have multiple dictionaries.
They are in the multiple instances.

All this is needed only to support import of identically named
modules.  If there are none, there is no problem because sys.path
is being used only to find modules, not to disambiguate them.

See also my separate reply to your other post which discusses
this same issue.

JimA


From gstein at lyra.org  Tue Dec  7 01:43:21 1999
From: gstein at lyra.org (Greg Stein)
Date: Mon, 6 Dec 1999 16:43:21 -0800 (PST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <384C01C1.8D1AFFFF@interet.com>
Message-ID: <Pine.LNX.4.10.9912061630260.18926-100000@nebula.lyra.org>

On Mon, 6 Dec 1999, James C. Ahlstrom wrote:
> Greg Stein wrote:
>...
> > I am not following this. What/where is the "single dictionary of module
> > names" ? Are you referring to a cache? Or is this about building an
> > archive?
> > 
> > An archive would look just like we have now: map a name to a module. It
> > would not need multiple dictionaries.
> 
> The "single dictionary of names" is in the single archive importer
> instance and has nothing to do with creating the archive.  It
> is currently programmed this way.

Ah. There is the problem. In Guido's suggestion for the "next path of
inquiry" :-), there is no "single dictionary of names". Instead, you have
Importer instances as items in sys.path. Each instance maintains its
dictionary, and they are not (necessarily) combined.

If we were to combine them, then we would need to maintain the ordering
requirements implied by sys.path. However, this would be problematic if
sys.path changed -- we would have to detect the situation and rebuild a
merged dict.

> Suppose the user specifies by name 12 archive files to be searched.
> That is, the user hacks site.py to add archive names to the importer.
> The "single dictionary" means that the archive importer takes the 12
> dictionaries in the 12 files and merges them together into one dictionary
> in order to speed up the search for a name.  The good news is you can
> always just call the archive importer to get a module.  The bad news is
> you can't do that for each entry on sys.path because there is no
> necessary identity between archive files and sys.path.  The user
> specified the archive files by name, and they may or may not be on
> sys.path, and the user may or may not have specified them in the
> same order as sys.path even if they are.

The importer must be inserted into sys.path to establish a precedence. If
the user wants to add 12 libraries... fine. But *all* of those modules
will fall under a precedence defined by the Importer's position on
sys.path.

> Suppose archive files must lie on sys.path and are processed in order.
> Then to find them you must know their name.  But IMHO you want to
> avoid doing a readdir() on each element of sys.path and looking for
> files *.pyl.

I do not believe that we will arbitrarily locate and open library files.
They must be specified explicitly.

> Suppose archive file names in general are the known name "lib.pyl"
> for the Python library, plus the names "package.pyl" where "package"
> can be the name of a Python package as a single archive file.  Then
> if the user tries to import foo, imputil will search along sys.path
> looking for foo.pyc, foo.pyl, etc.  If it finds foo.pyl, the archive
> importer will add it to its list of known archive files.  But it must
> not add it to its single dictionary, because that would destroy the
> information about its position along sys.path.  Instead, it must keep
> a separate dictionary for each element of sys.path and search the
> separate dictionaries under control of imputil.  That is, get_code()
> needs a new argument for the element of sys.path being searched.
> Alternatively, you could create a new importer instance for each
> archive file found, but then you still have multiple dictionaries.
> They are in the multiple instances.

If the user installs ".pyl" as a recognized extension (i.e. installs into
the PathImporter), then the above scenario is possible. In my
in-head-design, I had not imagined any state being retained for
extension-recognizer hooks. Of course, state can be retained simply by
using a bound-method for the hook function.

get_code() would not need to change. The foo.pyl would be consulted at the
appropriate time based on where it is found in sys.path. Note that file-
extension hooks would definitely have a complete path to the target file.
Those are not Importers, however (although they will closely follow the
get_code() hook since the extension is called from get_code).
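
A minimal sketch of the bound-method idea, assuming a registry keyed by
file extension. The names PylRecognizer and extension_hooks are made up
for illustration and are not part of imputil:

```python
class PylRecognizer:
    """Hypothetical hook object for ".pyl" files; keeps state between calls."""

    def __init__(self):
        self.seen = []  # full paths of archives met so far, in discovery order

    def hook(self, fullpath):
        # Called with the complete path to the candidate file.
        if fullpath not in self.seen:
            self.seen.append(fullpath)
        return fullpath


# Hypothetical registry mapping a file extension to a hook callable.
extension_hooks = {}

recognizer = PylRecognizer()
# A bound method carries its instance along, so the hook has state.
extension_hooks[".pyl"] = recognizer.hook

extension_hooks[".pyl"]("/opt/app/foo.pyl")
extension_hooks[".pyl"]("/opt/app/bar.pyl")
extension_hooks[".pyl"]("/opt/app/foo.pyl")
```

The registry only ever sees a plain callable; whatever the recognizer
remembers lives on the instance behind the bound method.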


From tim_one at email.msn.com  Tue Dec  7 06:11:25 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 7 Dec 1999 00:11:25 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <02eb01bf3f4d$c200d790$f29b12c2@secret.pythonware.com>
Message-ID: <001601bf4071$8278cc20$88a0143f@tim>

[/F]
> PS. on the performance side of things, did you know
> that 're' can be up to ten times slower than 'regex'?
> but people don't complain -- probably because it
> allows them to do things they couldn't do before...

Bad example:  people do complain about this.  Those who care a lot continue
to use regex, temporarily pacified by the promise that re.py will get
recoded in C and thus regain a good chunk of regex's speed.  Those who care
a whale of a lot continue to use Perl <0.9 wink>.


From guido at CNRI.Reston.VA.US  Tue Dec  7 13:45:25 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 07 Dec 1999 07:45:25 -0500
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: Your message of "Mon, 06 Dec 1999 16:43:21 PST."
             <Pine.LNX.4.10.9912061630260.18926-100000@nebula.lyra.org> 
References: <Pine.LNX.4.10.9912061630260.18926-100000@nebula.lyra.org> 
Message-ID: <199912071245.HAA21596@eric.cnri.reston.va.us>

> If we were to combine them, then we would need to maintain the ordering
> requirements implied by sys.path. However, this would be problematic if
> sys.path changed -- we would have to detect the situation and rebuild a
> merged dict.

No need to worry about this: just don't merge the caches.  Compared to
the hundreds of failed open() calls that are done now, it's no big
deal to do 12 failed Python dictionary lookups instead of one.
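
A rough sketch of the unmerged approach, assuming each archive importer
keeps its own dictionary and sits directly on the path. ArchiveImporter
and find_module are illustrative names, not the imputil API, and the
module "code" is faked with strings:

```python
class ArchiveImporter:
    """Hypothetical stand-in for an archive importer; not the imputil API."""

    def __init__(self, name, modules):
        self.name = name
        self.modules = dict(modules)  # this archive's own table: name -> code

    def get_code(self, modname):
        # One dictionary lookup; None means "not in this archive".
        return self.modules.get(modname)


def find_module(modname, path):
    """Walk the path in order; the first importer that knows the name wins."""
    for entry in path:
        if isinstance(entry, ArchiveImporter):
            code = entry.get_code(modname)
            if code is not None:
                return entry.name, code
        # String entries (plain directories) are skipped in this sketch.
    return None


path = [
    ArchiveImporter("lib.pyl", {"os": "<os code>", "foo": "<foo from lib>"}),
    "/usr/lib/python",
    ArchiveImporter("package.pyl", {"foo": "<foo from package>"}),
]
```

Because nothing is merged, reordering the path changes which "foo" is
found without any cache having to be rebuilt.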

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik at pythonware.com  Tue Dec  7 14:25:54 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 7 Dec 1999 14:25:54 +0100
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9912061630260.18926-100000@nebula.lyra.org>
Message-ID: <020a01bf40b6$97dafd00$f29b12c2@secret.pythonware.com>

Greg Stein <gstein at lyra.org> wrote:
> > The "single dictionary of names" is in the single archive importer
> > instance and has nothing to do with creating the archive.  It
> > is currently programmed this way.
> 
> Ah. There is the problem. In Guido's suggestion for the "next path of
> inquiry" :-), there is no "single dictionary of names". Instead, you have
> Importer instances as items in sys.path. Each instance maintains its
> dictionary, and they are not (necessarily) combined.

so the "sys.path contains importers (or strings)" strategy
is now officially sanctioned?  cool!!!

(a quick look in our code base says that this will cause
some trouble, unless os.path.isdir() is modified to reject
non-strings...  after all, if it's not a string, it cannot be
a valid directory path, so this does make some sense ;-)

another aside: can we have a standard mechanism for
listing the contents of a given archive, please?  we have
a lot of "path scanning" stuff (PIL and PST, among others),
and it would be great if things didn't break down if you
stuff it all in an archive.

something like:

    import os
    import sys

    for path in sys.path:
        if os.path.isdir(path):
            files = os.listdir(path)
        else:
            try:
                files = path.listdir()
            except AttributeError:
                files = None
        if files is None:
            pass  # no idea what's in here
        else:
            pass  # path provides (at least) these modules

would be really useful.

and yes, it shouldn't have to be mentioned, since squeeze
have done it since early 1997, but archive importers should
provide a standard way to include non-module resources in
the archive, and a standard way to access such resources
as ordinary python streams.

e.g:

    file = path.open(name, "rb")

or something...
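
a minimal sketch of what such a path item could look like, assuming a
zip-backed archive.  ZipPathEntry is a made-up name, and the listdir()
and open() methods simply follow the two fragments in this message;
nothing here is an existing importer API:

```python
import io
import zipfile


class ZipPathEntry:
    """Hypothetical sys.path item backed by a zip file."""

    def __init__(self, fileobj):
        self._zip = zipfile.ZipFile(fileobj)

    def listdir(self):
        # Names of everything stored in the archive.
        return self._zip.namelist()

    def open(self, name, mode="rb"):
        # Only reading is supported in this sketch; return an ordinary stream.
        if mode != "rb":
            raise ValueError("read-only archive")
        return io.BytesIO(self._zip.read(name))


# Build a small archive in memory so the example needs no files on disk.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("spam.py", "x = 1\n")
    zf.writestr("logo.gif", b"GIF89a")
buf.seek(0)

entry = ZipPathEntry(buf)
```

a path scanner can then call entry.listdir() exactly as it would call
os.listdir() on a directory, and read non-module resources through
entry.open(name, "rb").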

</F>


From jim at interet.com  Tue Dec  7 16:20:15 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Tue, 07 Dec 1999 10:20:15 -0500
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9912061630260.18926-100000@nebula.lyra.org> <199912071245.HAA21596@eric.cnri.reston.va.us>
Message-ID: <384D25AF.4C4F5107@interet.com>

Guido van Rossum wrote:

> No need to worry about this: just don't merge the caches.  Compared to
> the hundreds of failed open() calls that are done now, it's no big
> deal to do 12 failed Python dictionary lookups instead of one.

Agreed.

JimA


From jim at interet.com  Tue Dec  7 16:31:30 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Tue, 07 Dec 1999 10:31:30 -0500
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9912061630260.18926-100000@nebula.lyra.org>
Message-ID: <384D2852.3C36C216@interet.com>

Greg Stein wrote:

> Ah. There is the problem. In Guido's suggestion for the "next path of
> inquiry" :-), there is no "single dictionary of names". Instead, you have
> Importer instances as items in sys.path. Each instance maintains its
> dictionary, and they are not (necessarily) combined.

> [A large number of other design issues]

OK, all design issues agreed.  I will make needed changes.

JimA


From jim at interet.com  Tue Dec  7 16:37:36 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Tue, 07 Dec 1999 10:37:36 -0500
Subject: [Python-Dev] Import redesign [LONG]
References: <Pine.LNX.4.10.9912061630260.18926-100000@nebula.lyra.org> <020a01bf40b6$97dafd00$f29b12c2@secret.pythonware.com>
Message-ID: <384D29C0.3D3A2194@interet.com>

Fredrik Lundh wrote:

> another aside: can we have a standard mechanism for
> listing the contents of a given archive, please?

I will add this.

> and yes, it shouldn't have to be mentioned, since squeeze
> have done it since early 1997, but archive importers should
> provide a standard way to include non-module resources in
> the archive, and a standard way to access such resources
> as ordinary python streams.

I will add this.

JimA


From gstein at lyra.org  Tue Dec  7 17:53:49 1999
From: gstein at lyra.org (Greg Stein)
Date: Tue, 7 Dec 1999 08:53:49 -0800 (PST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912071245.HAA21596@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912070853230.21367-100000@nebula.lyra.org>

On Tue, 7 Dec 1999, Guido van Rossum wrote:
> > If we were to combine them, then we would need to maintain the ordering
> > requirements implied by sys.path. However, this would be problematic if
> > sys.path changed -- we would have to detect the situation and rebuild a
> > merged dict.
> 
> No need to worry about this: just don't merge the caches.  Compared to
> the hundreds of failed open() calls that are done now, it's no big
> deal to do 12 failed Python dictionary lookups instead of one.

Have no fear... I wasn't planning on this... complicates too much stuff
for too little gain.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From guido at CNRI.Reston.VA.US  Wed Dec  8 13:07:31 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 08 Dec 1999 07:07:31 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Wed, 08 Dec 1999 02:46:02 EST."
             <000201bf4150$46749da0$5aa2143f@tim> 
References: <000201bf4150$46749da0$5aa2143f@tim> 
Message-ID: <199912081207.HAA00040@eric.cnri.reston.va.us>

[Great analysis, Tim!]

> 4) The audience is Python end-users "in general", and the product is pure
> Python.  I think this is the most important one for Distutils to address,
> and compilation isn't a part of it.  So far, though, what Gordon is doing
> seems more appropriate than what Distutils has been up to.  I hope his work
> gets folded into this.

I'm not sure what stuff by which Gordon you're referring to.  I am
only familiar with his installer, which I thought is win32 only (but
I may be mistaken) and is an installer for a whole application, not
just a bunch of modules.  Please correct me if I'm wrong.

But this reminds me of a different issue, which Jim Ahlstrom has been
hammering about before: there's a completely separate set of cases
where what you are distributing is a stand-alone application, and the
target consists of end users who are entirely uninterested in whether
it's written in Python, C or Elvish.  (And then there's still the
distinction between Win32, Unix or both.)  The current Distutils tools
don't deal with this at all.  I think it should though, and I think
its framework is powerful enough to be able to add this, e.g. as a new
"appdist" command.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm at hypernet.com  Wed Dec  8 15:16:07 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Wed, 8 Dec 1999 09:16:07 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <199912081207.HAA00040@eric.cnri.reston.va.us>
References: Your message of "Wed, 08 Dec 1999 02:46:02 EST."             <000201bf4150$46749da0$5aa2143f@tim> 
Message-ID: <1267460464-31845181@hypernet.com>

Guido wrote:

> [Great analysis, Tim!]
> 
> > 4) The audience is Python end-users "in general", and the
> > product is pure Python.  I think this is the most important one
> > for Distutils to address, and compilation isn't a part of it. 
> > So far, though, what Gordon is doing seems more appropriate
> > than what Distutils has been up to.  I hope his work gets
> > folded into this.
> 
> I'm not sure what stuff by which Gordon you're referring to.  I
> am only familiar with his installer, which I thought is win32
> only (but I may be mistaken) and is an installer for a whole
> application, not just a bunch of modules.  Please correct me if
> I'm wrong.

It needed a name. I hate the word "Installer", but it expresses 
in one word the most common use of my stuff.

I'll be releasing a beta for Linux real soon. Only some of the 
tricks are Windows only (such as self-extracting executables, 
which is only culturally appropriate on Windows, anyway).

But more importantly it's not just for installing. The Python I 
use (interactively) on my wife's machine is 1 directory with 
about 6 files in it. On my Linux box I've been using the std lib 
in a .pyz for about a month now. Someone distributing a pure 
Python package could instead ship 3 files (imputil.py, 
archive.py and <package>.pyz) with the "install" consisting of 
adding one line to site.py in the user's perfectly normal Python 
installation.

And yeah, I solved the "manifest" problem, too. Mine predates 
Distutils, so don't accuse me of duplicate effort, (I pointed 
them to it a couple times). It uses ConfigParser and a config 
file, so it allows finer control.

While .pyz's are completely cross-platform, I have yet to work 
out endianness issues in the other archive I use (which should 
probably be zip format - it can hold anything). And at the 
"Installer" end, I have yet to work out how things should work 
on non-ELF/COFF platforms (where I can't append the archive 
to the executable). But there aren't any technical issues 
involved; just lack of time.

So no, it's not just for Windows; and no, it's not just for 
creating standalones (though that's what almost everyone 
uses it for).

- Gordon


From guido at CNRI.Reston.VA.US  Wed Dec  8 15:56:42 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 08 Dec 1999 09:56:42 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Wed, 08 Dec 1999 09:16:07 EST."
             <1267460464-31845181@hypernet.com> 
References: Your message of "Wed, 08 Dec 1999 02:46:02 EST." <000201bf4150$46749da0$5aa2143f@tim>  
            <1267460464-31845181@hypernet.com> 
Message-ID: <199912081456.JAA00200@eric.cnri.reston.va.us>

> It needed a name. I hate the word "Installer", but it expresses 
> in one word the most common use of my stuff.
> 
> I'll be releasing a beta for Linux real soon. Only some of the 
> tricks are Windows only (such as self-extracting executables, 
> which is only culturally appropriate on Windows, anyway).
> 
> But more importantly it's not just for installing. The Python I 
> use (interactively) on my wife's machine is 1 directory with 
> about 6 files in it. On my Linux box I've been using the std lib 
> in a .pyz for about a month now. Someone distributing a pure 
> Python package could instead ship 3 files (imputil.py, 
> archive.py and <package>.pyz) with the "install" consisting of 
> adding one line to site.py in the user's perfectly normal Python 
> installation.
> 
> And yeah, I solved the "manifest" problem, too. Mine predates 
> Distutils, so don't accuse me of duplicate effort, (I pointed 
> them to it a couple times). It uses ConfigParser and a config 
> file, so it allows finer control.
> 
> While .pyz's are completely cross-platform, I have yet to work 
> out endianness issues in the other archive I use (which should 
> probably be zip format - it can hold anything). And at the 
> "Installer" end, I have yet to work out how things should work 
> on non-ELF/COFF platforms (where I can't append the archive 
> to the executable). But there aren't any technical issues 
> involved; just lack of time.
> 
> So no, it's not just for Windows; and no, it's not just for 
> creating standalones (though that's what almost everyone 
> uses it for).

Gordon, I'm sorry, but from this description I still have no idea what
your stuff is (and I forgot the URL so I can't look it up).  For
example, if it's not (just) for installing, what *is* it for?

What is the ``"manifest" problem'' and how did you solve it?

Also, note that editing site.py is a no-no!  You can create/edit
sitecustomize.py, but you should leave site.py alone!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm at hypernet.com  Wed Dec  8 17:17:03 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Wed, 8 Dec 1999 11:17:03 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <199912081456.JAA00200@eric.cnri.reston.va.us>
References: Your message of "Wed, 08 Dec 1999 09:16:07 EST."             <1267460464-31845181@hypernet.com> 
Message-ID: <1267453215-32281635@hypernet.com>

Guido,
 
> Gordon, I'm sorry, but from this description I still have no idea
> what your stuff is (and I forgot the URL so I can't look it up). 

http://starship.python.org/crew/gmcm/installer.html

The Linux stuff has a couple alpha testers and will probably 
get announced in a week or two.

> For example, if it's not (just) for installing, what *is* it for?
 
At the bottom level, it's a bunch of tools using freeze's 
modulefinder, imputil.py and 2 kinds of archives. There's at 
least 2 layers above that, with "Installer" being the top.  
There's a clean separation between the layers, so you can 
break in wherever you like.

> What is the ``"manifest" problem'' and how did you solve it?

The problem is specifying a set of resources, hopefully without 
having to list them explicitly. I solve this with a config file that 
lets you specify packages, directories, directory trees.. with 
filters that can work from paths, names, extensions, regular 
expressions...
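
A minimal sketch of the idea, assuming a config file with include and
exclude glob patterns. The section name, the option names and the
select() helper are invented for illustration; this is not the actual
Installer config format:

```python
import configparser
import fnmatch

# Hypothetical manifest: what to include, and what to filter back out.
CONFIG = """
[mypackage]
include = *.py, *.gif
exclude = test_*
"""


def select(names, section):
    """Return the names matching an include pattern and no exclude pattern."""
    def patterns(key):
        raw = section.get(key, fallback="")
        return [p.strip() for p in raw.split(",") if p.strip()]

    include, exclude = patterns("include"), patterns("exclude")
    return [
        n for n in names
        if any(fnmatch.fnmatchcase(n, p) for p in include)
        and not any(fnmatch.fnmatchcase(n, p) for p in exclude)
    ]


parser = configparser.ConfigParser()
parser.read_string(CONFIG)

files = ["core.py", "test_core.py", "logo.gif", "notes.txt"]
chosen = select(files, parser["mypackage"])
```

The point is only that the resources are described by rules rather than
listed one by one; regular expressions or directory-tree walks would
slot into the same shape.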
 
> Also, note that editing site.py is a no-no!  You can create/edit
> sitecustomize.py, but you should leave site.py alone!

That would work fine. One of the standalone configurations will 
write a site.py, but that's for a completely self-contained 
installation (ie, one which will have no conflicts with another 
Python installation). 

I'd also note that, for Windows at least, the path-expanding 
mechanism created by site.py has not caught on. I've got lots 
installed, and no site-python, site-packages or sitecustomize.


- Gordon


From guido at CNRI.Reston.VA.US  Wed Dec  8 17:23:34 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 08 Dec 1999 11:23:34 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Wed, 08 Dec 1999 11:17:03 EST."
             <1267453215-32281635@hypernet.com> 
References: Your message of "Wed, 08 Dec 1999 09:16:07 EST." <1267460464-31845181@hypernet.com>  
            <1267453215-32281635@hypernet.com> 
Message-ID: <199912081623.LAA04119@eric.cnri.reston.va.us>

[me]
> > Also, note that editing site.py is a no-no!  You can create/edit
> > sitecustomize.py, but you should leave site.py alone!

[Gordon]
> That would work fine. One of the standalone configurations will 
> write a site.py, but that's for a completely self-contained 
> installation (ie, one which will have no conflicts with another 
> Python installation). 
> 
> I'd also note that, for Windows at least, the path-expanding 
> mechanism created by site.py has not caught on. I've got lots 
> installed, and no site-python, site-packages or sitecustomize.

You shouldn't see site-python or site-packages, they only exist on
Unix.  On Windows, everything is installed in the top Python
directory.  However you should see .pth files there, which is what
site.py looks for.  I believe NumPy and PIL use those.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm at hypernet.com  Wed Dec  8 17:55:51 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Wed, 8 Dec 1999 11:55:51 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <199912081623.LAA04119@eric.cnri.reston.va.us>
References: Your message of "Wed, 08 Dec 1999 11:17:03 EST."             <1267453215-32281635@hypernet.com> 
Message-ID: <1267450887-32421651@hypernet.com>

> [Gordon]
> > That would work fine. One of the standalone configurations will
> > write a site.py, but that's for a completely self-contained
> > installation (ie, one which will have no conflicts with another
> > Python installation). 
> > 
> > I'd also note that, for Windows at least, the path-expanding
> > mechanism created by site.py has not caught on. I've got lots
> > installed, and no site-python, site-packages or sitecustomize.
[Guido] 
> You shouldn't see site-python or site-packages, they only exist
> on Unix.  

You mean "they only exist _for_ Unix", (site.py looks for them 
on Windows). I don't like that. For one thing, modulo a few 
platform differences, the same mechanism should work for 
multi-user Unix and Windows LAN installations. And single-
user Windows (I know, redundant, even on NT) should be a 
degenerate case of the above.

> On Windows, everything is installed in the top Python
> directory.  However you should see .pth files there, which is
> what site.py looks for.  I believe NumPy and PIL use those.

No NumPy, no PIL, no .pth files. 99% of everything out there 
just says "unzip this somewhere on your Python path".

In this case, Jim Ahlstrom may be right - there are too many 
options, or at least an insufficiently emphasized "proper" 
method. Until I worked out my own way of installing stuff, I 
used to lose a large number of packages whenever I upgraded 
my Windows Python.

Much as I love Mark's stuff (and hesitate to criticize crazy 
Aussies), I wish there weren't so much special casing here for 
Windows.

And no, I don't have any solutions to this, I'm just griping...

- Gordon


From guido at CNRI.Reston.VA.US  Wed Dec  8 18:07:30 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 08 Dec 1999 12:07:30 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Wed, 08 Dec 1999 11:55:51 EST."
             <1267450887-32421651@hypernet.com> 
References: Your message of "Wed, 08 Dec 1999 11:17:03 EST." <1267453215-32281635@hypernet.com>  
            <1267450887-32421651@hypernet.com> 
Message-ID: <199912081707.MAA04242@eric.cnri.reston.va.us>

> [Guido] 
> > You shouldn't see site-python or site-packages, they only exist
> > on Unix.  

[Gordon]
> You mean "they only exist _for_ Unix", (site.py looks for them 
> on Windows).

No it doesn't.  The code in site.py only adds site-packages and
site-python when os.sep is '/'.  RTSL.
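
A sketch of the rule just described, not the actual site.py source. The
function name site_dirs is made up, and the exact directory layout is an
assumption for illustration:

```python
import os


def site_dirs(prefix, sep=os.sep, version="1.5"):
    """Directories to scan for a prefix, following the rule described above."""
    if sep == "/":
        # Unix-style layout: versioned site-packages plus shared site-python.
        return [
            prefix + "/lib/python" + version + "/site-packages",
            prefix + "/lib/site-python",
        ]
    # Elsewhere (e.g. Windows) only the top-level directory is used.
    return [prefix]
```

With sep set to a backslash the function returns just the prefix, which
is why no site-packages directory shows up on Windows.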

> I don't like that. For one thing, modulo a few 
> platform differences, the same mechanism should work for 
> multi-user Unix and Windows LAN installations. And single-
> user Windows (I know, redundant, even on NT) should be a 
> degenerate case of the above.

What do you mean by "the same mechanism should work"?  The same
mechanism for what?  Are you talking about sharing the installed
files somehow?

> > On Windows, everything is installed in the top Python
> > directory.  However you should see .pth files there, which is
> > what site.py looks for.  I believe NumPy and PIL use those.
> 
> No NumPy, no PIL, no .pth files. 99% of everything out there 
> just says "unzip this somewhere on your Python path".

Fair enough.  Of course I know about .pth files so I unzipped them
elsewhere and added a .pth file pointing there...

> In this case, Jim Ahlstrom may be right - there are too many 
> options, or at least an insufficiently emphasized "proper" 
> method. Until I worked out my own way of installing stuff, I 
> used to lose a large number of packages whenever I upgraded 
> my Windows Python.

The .pth files are designed for this.  Maybe they haven't been
explained as well as they should.
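
A small sketch of the mechanism, using temporary directories as
stand-ins for the Python directory and for a package unzipped elsewhere.
The file name extras.pth is arbitrary; site.addsitedir() is the standard
library call that processes .pth files in a directory:

```python
import os
import site
import sys
import tempfile

# Stand-ins for a Python directory and a separately unzipped package directory.
top = tempfile.mkdtemp()
extras = tempfile.mkdtemp()

# A .pth file lists one directory per line; existing ones are added to sys.path.
with open(os.path.join(top, "extras.pth"), "w") as f:
    f.write(extras + "\n")

site.addsitedir(top)  # adds `top` and processes the .pth files found in it
```

After this the unzipped directory is on sys.path, and it stays outside
the Python directory, so an upgrade that replaces that directory does not
take the package with it (only the one-line .pth file must be restored).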

> Much as I love Mark's stuff (and hesitate to criticize crazy 
> Aussies), I wish there weren't so much special casing here for 
> Windows.

It's not Mark's fault, it's Microsoft's fault.  If you don't do things
the way MS wants you to, experienced Windows users will gripe,
misunderstand what you do, etc.

> And no, I don't have any solutions to this, I'm just griping...

Ditto.  Understanding the problems is half of the solution though.
The problems seem pretty complex!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm at hypernet.com  Wed Dec  8 19:25:50 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Wed, 8 Dec 1999 13:25:50 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <199912081707.MAA04242@eric.cnri.reston.va.us>
References: Your message of "Wed, 08 Dec 1999 11:55:51 EST."             <1267450887-32421651@hypernet.com> 
Message-ID: <1267445488-32746429@hypernet.com>

[Guido] 
> No it doesn't.  The code in site.py only adds site-packages and
> site-python when os.sep is '/'.  RTSL.

Oops. Missed that.

> > I don't like that. For one thing, modulo a few 
> > platform differences, the same mechanism should work for 
> > multi-user Unix and Windows LAN installations. And single- user
> > Windows (I know, redundant, even on NT) should be a degenerate
> > case of the above.
> 
> What do you mean by "the same mechanism should work"?  The same
> mechanism for what?  Are you talking about sharing the installed
> files somehow?

In the above, "mechanism" basically meant that which creates 
sys.path. 

Basically, this came up for me because in standalone 
configurations (my Installer again), I have to take complete 
control of sys.path. After doing so differently on Windows and 
Linux, I finally realized that I can do it the same way on both.
 
Which makes me question why they are so different.

> The .pth files are designed for this.  Maybe they haven't been
> explained as well as they should.

I'd say "badgered" or "browbeaten" instead of "explained" ;-).
 
> > Much as I love Mark's stuff (and hesitate to criticize crazy
> > Aussies), I wish there weren't so much special casing here for
> > Windows.
> 
> It's not Mark's fault, it's Microsoft's fault.  If you don't do
> things the way MS wants you to, experienced Windows users will
> gripe, misunderstand what you do, etc.

Even MS doesn't do things the way MS says they want you to.

I find MS users equally divided between those who scream 
bloody murder if you touch the registry, and those who 
scream if you don't.

It's not like *nixen suffer from an excessive degree of 
conformity in preferred installation procedures, but somehow 
Python survives there...

> > And no, I don't have any solutions to this, I'm just griping...
> 
> Ditto.  Understanding the problems is half of the solution
> though. The problems seem pretty complex!

Grumpily agreed ;-).


- Gordon


From jim at interet.com  Wed Dec  8 19:33:51 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Wed, 08 Dec 1999 13:33:51 -0500
Subject: [Python-Dev] Linux Journal confirms evil rumor
References: Your message of "Wed, 08 Dec 1999 11:17:03 EST." <1267453215-32281635@hypernet.com>  
	            <1267450887-32421651@hypernet.com> <199912081707.MAA04242@eric.cnri.reston.va.us>
Message-ID: <384EA48F.F5190180@interet.com>

I finally got around to reading the current Linux
Journal (which just keeps getting better and better)
and lo! there was a picture of a familiar face I just
couldn't quite....

Oh no!  Could it be true?  I heard rumors but I refused to
believe them until now.  The glasses are gone!  Guido now
looks like an investment banker!  The sky is falling!

Next will probably be a Python 1.6 as a 27 Meg DLL, and
a Python IPO.  Well, maybe not.  Now that I look more
closely, he is wearing a black and white and mustard
(??MUSTARD) T-shirt which says "You Need Python".

At least we ought to make him wear a name tag at IPC8.

JimA


From fdrake at acm.org  Wed Dec  8 19:37:44 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 8 Dec 1999 13:37:44 -0500 (EST)
Subject: [Python-Dev] Linux Journal confirms evil rumor
In-Reply-To: <384EA48F.F5190180@interet.com>
References: <1267453215-32281635@hypernet.com>
	<1267450887-32421651@hypernet.com>
	<199912081707.MAA04242@eric.cnri.reston.va.us>
	<384EA48F.F5190180@interet.com>
Message-ID: <14414.42360.309237.967766@weyr.cnri.reston.va.us>

James C. Ahlstrom writes:
 > Oh no!  Could it be true?  I heard rumors but I refused to
 > believe them until now.  The glasses are gone!  Guido now
 > looks like an investment banker!  The sky is falling!

  I'm afraid this non-distinctive look was introduced at IPC7... it's
too bad we can't tell people Python was invented by the guy with the
glasses anymore.

 > Next will probably be a Python 1.6 as a 27 Meg DLL, and
 > a Python IPO.  Well, maybe not.  Now that I look more
 > closely, he is wearing a black and white and mustard
 > (??MUSTARD) T-shirt which says "You Need Python".

  It's really the blue & white & orange IPC7 shirt.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From bwarsaw at cnri.reston.va.us  Wed Dec  8 19:41:51 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Wed, 8 Dec 1999 13:41:51 -0500 (EST)
Subject: [Python-Dev] Linux Journal confirms evil rumor
References: <1267453215-32281635@hypernet.com>
	<1267450887-32421651@hypernet.com>
	<199912081707.MAA04242@eric.cnri.reston.va.us>
	<384EA48F.F5190180@interet.com>
Message-ID: <14414.42607.701538.783684@anthem.cnri.reston.va.us>

>>>>> "JCA" == James C Ahlstrom <jim at interet.com> writes:

    JCA> Oh no!  Could it be true?  I heard rumors but I refused to
    JCA> believe them until now.  The glasses are gone!  Guido now
    JCA> looks like an investment banker!  The sky is falling!

He's not the only one who's, like, "gone corporate", but I won't
mention any names, so as to protect the guilty.


From jim at digicool.com  Wed Dec  8 20:03:42 1999
From: jim at digicool.com (Jim Fulton)
Date: Wed, 08 Dec 1999 14:03:42 -0500
Subject: [Python-Dev] Linux Journal confirms evil rumor
References: <1267453215-32281635@hypernet.com>
		<1267450887-32421651@hypernet.com>
		<199912081707.MAA04242@eric.cnri.reston.va.us>
		<384EA48F.F5190180@interet.com> <14414.42607.701538.783684@anthem.cnri.reston.va.us>
Message-ID: <384EAB8E.EBA595B5@digicool.com>

"Barry A. Warsaw" wrote:
> 
> He's not the only one who's, like, "gone corporate", but I won't
> mention any names, so as to protect the guilty.

OK, Buzz.

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission.  Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.


From tim_one at email.msn.com  Thu Dec  9 06:31:52 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 9 Dec 1999 00:31:52 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <199912081207.HAA00040@eric.cnri.reston.va.us>
Message-ID: <000301bf4206$b39e5b80$36a2143f@tim>

[Guido]
> [Great analysis, Tim!]

I beg to differ:  it's internally inconsistent and should have identified at
least 3 axes and hence at least 8 cases.  Still, you got more than you paid
for <wink>.

>> 4) The audience is Python end-users "in general", and the
>> product is pure Python.  I think this is the most important one
>> for Distutils to address, and compilation isn't a part of it.
>> So far, though, what Gordon is doing seems more appropriate
>> than what Distutils has been up to.  I hope his work gets folded
>> into this.

> I'm not sure what stuff by which Gordon you're referring to.

You guessed right!

> I am only familiar with his installer, which I thought is win32
> only (but I may be mistaken) and is an installer for a whole
> application, not just a bunch of modules.  Please correct me if
> I'm wrong.

If it can install a whole app, what makes you suspect it couldn't install
just a bunch of modules <0.5 wink>?

It started life as Windows-only, and I believe it's been virtually ignored
by non-Windows folk because of that.  Bad blind spot.  It supplies
already-working approaches to many of the issues that are still being
*talked* about on Distutils (at least archive formats, code to manipulate
same, manifest files (how do you tell the tool which files to package?), and
transparently bundling a Python interpreter when needed).

> But this reminds me of a different issue, which Jim Ahlstrom has
> been hammering about before: there's a completely separate set of
> cases where what you are distributing is a stand-alone application,
> and the target consists of end users who are entirely uninterested
> in whether it's written in Python, C or Elvish.

I include part of that in my case #4 above, where the app happens to be
written in Pure Python -- but the user doesn't have to know that.  Gordon is
addressing at least that part of it.  AFAIK he can't deal with transparently
compiling C or exorcising Elvish on the target platform, but if you're just
distributing the binaries I expect his work is directly usable already.

> (And then there's still the distinction between Win32, Unix or
> both.)

I vote "both".  The world really doesn't need another Win32-only (or
Unix-only) installer, archive format, compression format, or distribution
model.

Jim seems mostly interested in Win32-only to me, and his concerns haven't
been about the mechanics of distribution but about how-- regardless of
tool --to create a bulletproof Python installation by hook or by crook.
Last time we went thru this, it was concluded that one couldn't without
patching the Python Windows binary with a resource editor (to point to its
own infernal <0.5 wink> registry entries).

Distutils hasn't talked about that at all (that I've seen, anyway); if there
were a less radical approach to that, I suspect Jim would be delighted to
use one of the commercial Win32 installation pkgs (and if that's what his
customers expect, delighted or not that's what he'll do).

> The current distutil tools don't deal with this at all.

That's why I said I thought what Gordon is doing seems more appropriate to
case #4 than what Distutils has been doing.

> I think it should though,

Ditto.

> and I think its framework is powerful enough to be able to
> add this, e.g. as a new "appdist" command.

I cordially invite (since Gordon will uncordially browbeat <wink>) people to
look seriously at what he's done.  Best I can tell, for apps that don't need
compilation "on the other end", it's mostly "there" already!

give-the-man-a-hand-ly y'rs  - tim


From tim_one at email.msn.com  Thu Dec  9 06:52:23 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 9 Dec 1999 00:52:23 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <1267453215-32281635@hypernet.com>
Message-ID: <000601bf4209$90a90c80$36a2143f@tim>

> http://starship.python.org/crew/gmcm/installer.html

Eh?  Doesn't work for me.  This does:

    http://starship.python.net/crew/gmcm/distribute.html


From tim_one at email.msn.com  Thu Dec  9 07:38:54 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 9 Dec 1999 01:38:54 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <199912081707.MAA04242@eric.cnri.reston.va.us>
Message-ID: <000701bf4210$10925a40$36a2143f@tim>

[Gordon]
>> Much as I love Mark's stuff (and hesitate to criticize crazy
>> Aussies), I wish there weren't so much special casing here for
>> Windows.

[Guido]
> It's not Mark's fault, it's Microsoft's fault.  If you don't do
> things the way MS wants you to, experienced Windows users will
> gripe, misunderstand what you do, etc.

Something just occurred to me:  MS's guidelines aren't arbitrary, they
actually have very good reasons.  In the case of putting all an app's
crucial info in the Registry, it's the only way to allow a site
administrator to set policy and site options remotely (an admin can fiddle
other machines' registries remotely).  This works very well indeed when
there's only "one copy" of an app on a machine (or at most one copy "per
user").

What just occurred to me is that JimA is concerned with *not* letting any
info from a previously-installed Python affect the app he's installing.
Similarly, Gordon's Win32 "standalone installer" modifies python.exe and
pythonw.exe to use a PYTHONPATH he forces, leaving the registry out of it.
Similarly, the woes I've had in trying to sell Python as a general Win32
scripting tool at work mostly boil down to that there's no effortless way to
do it that doesn't risk picking up info from-- or forcing info
onto --pre-existing or future distinct Python installations (in contrast,
Perl "just works" in this respect).

IOW, the three of us find getting path info out of the registry intolerable
because we are in fact trying to do the opposite of what the registry
mechanism was *designed* for:  we want perfect isolation, not perfect
sharing.

This has come up on Python-Help a few times too, in the guise of someone
installing a product that in turn installs an older version of Python, which
in turn confuses another product that relies on features in a newer version
of Python.

So while the traditional Windows .ini file (like Unix this-or-that.rc file)
model was replaced by the registry for excellent reasons, those reasons
don't apply to the way we're using Python!  The .ini file model was exactly
right for what most of us seem to want to do, and the registry model is
exactly wrong.
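
A minimal sketch of the .ini-style model, for concreteness.  The file name
app.ini, the [python] section and the "path" option are all hypothetical;
the only point is that every setting lives inside the app's own directory,
so two installations cannot see each other's configuration.

```python
import configparser
import os
import tempfile

def load_app_paths(app_dir):
    """Read module search paths from <app_dir>/app.ini and nothing else."""
    parser = configparser.ConfigParser()
    parser.read(os.path.join(app_dir, "app.ini"))
    raw = parser.get("python", "path", fallback="")
    # Entries are resolved relative to the application directory, so the
    # result never depends on machine-wide or per-user state.
    return [os.path.join(app_dir, entry.strip())
            for entry in raw.split(";") if entry.strip()]

# Usage: a throwaway "installation" carrying its own private settings file.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "app.ini"), "w") as f:
    f.write("[python]\npath = lib;extensions\n")
demo_paths = load_app_paths(demo_dir)
```

A second copy of the app in another directory would read its own app.ini
and get its own paths, which is the isolation described above.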

just-thought-i'd-cheer-you-up<wink>-ly y'rs  - tim


From skip at mojam.com  Thu Dec  9 08:38:36 1999
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 9 Dec 1999 01:38:36 -0600 (CST)
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <000701bf4210$10925a40$36a2143f@tim>
References: <199912081707.MAA04242@eric.cnri.reston.va.us>
	<000701bf4210$10925a40$36a2143f@tim>
Message-ID: <14415.23676.775163.786028@dolphin.mojam.com>

    Tim> So while the traditional Windows .ini file (like Unix
    Tim> this-or-that.rc file) model was replaced by the registry for
    Tim> excellent reasons, those reasons don't apply to the way we're using
    Tim> Python!  The .ini file model was exactly right for what most of us
    Tim> seem to want to do, and the registry model is exactly wrong.

Alright!  Now I understand what all the hubbub is about!  My eyes have
mostly been glazing over trying to follow all this Windows registry/path/ini
stuff.  MS believes that Python is the application.  Those of us writing
Python programs view those programs as the applications, not the Python
interpreter per se.  Is there some way that people writing applications in
Python can set up registry entries that are specific to their application
(e.g. tabnanny.py) instead of only specific to the Python interpreter?

Skip Montanaro | http://www.mojam.com/
skip at mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...


From gmcm at hypernet.com  Thu Dec  9 15:17:27 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Thu, 9 Dec 1999 09:17:27 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <000701bf4210$10925a40$36a2143f@tim>
References: <199912081707.MAA04242@eric.cnri.reston.va.us>
Message-ID: <1267374045-37047016@hypernet.com>

[Guido]
> > It's not Mark's fault, it's Microsoft's fault.  If you don't do
> > things the way MS wants you to, experienced Windows users will
> > gripe, misunderstand what you do, etc.
[Tim] 
> Something just occurred to me:  MS's guidelines aren't arbitrary,
> they actually have very good reasons.  In the case of putting all
> an app's crucial info in the Registry, it's the only way to allow
> a site administrator to set policy and site options remotely (an
> admin can fiddle other machines' registries remotely).  This
> works very well indeed when there's only "one copy" of an app on
> a machine (or at most one copy "per user").

And actually, the business about separate subtrees for the 
machine's configuration and the user's configuration is pretty 
clever. MS doesn't explain it well, and it gets misused, but 
when done right, it's a lot simpler than the maze of .xxxrc files 
you sometimes find in other OSes.
 
> What just occurred to me is that JimA is concerned with *not*
> letting any info from a previously-installed Python affect the
> app he's installing. Similarly, Gordon's Win32 "standalone
> installer" modifies python.exe and pythonw.exe to use a
> PYTHONPATH he forces, leaving the registry out of it. Similarly,
> the woes I've had in trying to sell Python as a general Win32
> scripting tool at work mostly boil down to that there's no
> effortless way to do it that doesn't risk picking up info from--
> or forcing info onto --pre-existing or future distinct Python
> installations (in contrast, Perl "just works" in this respect).

In my Linux version, I went to the heart of the matter - 
getpath.c. It occurs to me that getpath.c might do better to 
follow a normal bootstrap process - ie,  create the absolute 
minimal sys.path required to go to the next step. Then the 
rest of what goes on in getpath.c could be written in Python. 
Maybe that Python code needs to get frozen in (to prevent 
bozos from destroying an installation by stepping on 
getpath.py), but it would make it a lot easier to create 
independent installations, and also reduce the variations 
between platforms at the C level. (Then again, I've never heard 
of anyone stepping on exceptions.py.)
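
A rough sketch of that two-stage idea, written entirely in Python for
illustration.  The function names, the lib/site-packages layout and the
"isolated" switch are assumptions made up for this example, not what
getpath.c does.

```python
import os

def bootstrap_path(executable):
    """Stage 1 (would stay in C): just enough path to locate the stage 2 code."""
    return [os.path.dirname(os.path.abspath(executable))]

def compute_full_path(executable, environ, isolated=True):
    """Stage 2 (could be Python): build the real module search path."""
    home = bootstrap_path(executable)[0]
    path = [os.path.join(home, "lib"),
            os.path.join(home, "lib", "site-packages")]
    if not isolated:
        # Only a non-isolated installation honours the global PYTHONPATH.
        extra = environ.get("PYTHONPATH", "")
        path[:0] = [entry for entry in extra.split(os.pathsep) if entry]
    return path

# Usage: the same executable location, with and without isolation.
demo_exe = os.path.join(os.sep, "opt", "myapp", "python")
isolated_path = compute_full_path(demo_exe, {"PYTHONPATH": "/elsewhere"})
shared_path = compute_full_path(demo_exe, {"PYTHONPATH": "/elsewhere"},
                                isolated=False)
```

Because stage 2 is ordinary Python, an independent installation only has
to swap in a different policy function rather than patch C code.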

If some registry manipulation primitives were exposed (say, 
through ntpath) that would mean that Windows developers 
could (if they wanted) play by the MS rules with at least the 
option of not stepping on each other.
 

- Gordon


From jim at interet.com  Thu Dec  9 16:02:18 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 09 Dec 1999 10:02:18 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim>
Message-ID: <384FC47A.BB4DA517@interet.com>

Tim Peters wrote:

> Jim seems mostly interested in Win32-only to me, and his concerns haven't
> been about the mechanics of distribution but about how-- regardless of
> tool --to create a bulletproof Python installation by hook or by crook.

Not exactly.  I am interested in how to create a bullet-proof
installation.  But I am equally interested in Unix (especially Linux)
and dislike the current dichotomy in the code base.

Lately I have been more active in distribution via archive files.
Part of the solution is an archive file format which is identical on
Unix and Windows, and which can hold the Python library and packages
as single files.  For my own efforts on this see:

    ftp://ftp.interet.com/pub/pylib.html

This is an archive file format similar to Gordon's format, although
Gordon's work goes well beyond just file formats.  I currently have
fifth generation code for this format, and am adding features as
suggested by Fredrik Lundh.  I hope it gets considered as a candidate
for a Python standard format.

> Distutils hasn't talked about that at all (that I've seen, anyway);

Gordon, Greg Stein and I have discussed file formats before.  I think
it was on distutils.  Anyway that was months ago.

JimA


From guido at CNRI.Reston.VA.US  Thu Dec  9 17:17:18 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 11:17:18 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Thu, 09 Dec 1999 09:17:27 EST."
             <1267374045-37047016@hypernet.com> 
References: <199912081707.MAA04242@eric.cnri.reston.va.us>  
            <1267374045-37047016@hypernet.com> 
Message-ID: <199912091617.LAA05742@eric.cnri.reston.va.us>

> [Guido]
> > > It's not Mark's fault, it's Microsoft's fault.  If you don't do
> > > things the way MS wants you to, experienced Windows users will
> > > gripe, misunderstand what you do, etc.
> [Tim] 
> > Something just occurred to me:  MS's guidelines aren't arbitrary,
> > they actually have very good reasons.  In the case of putting all
> > an app's crucial info in the Registry, it's the only way to allow
> > a site administrator to set policy and site options remotely (an
> > admin can fiddle other machines' registries remotely).  This
> > works very well indeed when there's only "one copy" of an app on
> > a machine (or at most one copy "per user").
[Gordon]
> And actually, the business about separate subtrees for the 
> machine's configuration and the user's configuration is pretty 
> clever. MS doesn't explain it well, and it gets misused, but 
> when done right, it's a lot simpler than the maze of .xxxrc files 
> you sometimes find in other OSes.

I agree.  And I am guilty of not even trying to find MS' explanation -- I
just looked in the registry at what other apps did and tried to mimic
that (plus what Mark had already done), without really knowing what I
was doing.  I now know a little better -- see the end of this message.

> In my Linux version, I went to the heart of the matter - 
> getpath.c. It occurs to me that getpath.c might do better to 
> follow a normal bootstrap process - ie,  create the absolute 
> minimal sys.path required to go to the next step. Then the 
> rest of what goes on in getpath.c could be written in Python. 
> Maybe that Python code needs to get frozen in (to prevent 
> bozos from destroying an installation by stepping on 
> getpath.py), but it would make it a lot easier to create 
> independent installations, and also reduce the variations 
> between platforms at the C level. (Then again, I've never heard 
> of anyone stepping on exceptions.py.)

Yes, this is exactly what was proposed in the thread on the Big Import
Rewrite.

> If some registry manipulation primitives were exposed (say, 
> through ntpath) that would mean that Windows developers 
> could (if they wanted) play by the MS rules with at least the 
> option of not stepping on each other.

That's a good idea.  These functions are already available through
Mark's win32api extension -- much of which will eventually (I hope
before 1.6 is out!) become part of the core distribution.

In the mean time, I've been thinking a bit more about how Python
should be using the Windows registry.  (It's clear to me that Python
should use the registry -- those who disagree can go build their own
Python distribution.)

The basic ideas of Python's current registry usage are sound: there's
a resource built into the DLL which is part of the key into the
registry used for all information.

The problem lies in which key is used.  All versions of Python 1.5.x
(1.5, 1.5.1, 1.5.2) use the same key!  This is a main cause of
trouble, because it means that different versions cannot peacefully
live together even if the user installs them into different
directories -- they will all use the registry keys of the last version
installed.  This, in turn, means that someone who writes a Python
application that has a dependency on a particular Python version (and
which application worth distributing doesn't :-) cannot trust that if
a Python installation is present, it is the right one.  But they also
cannot simply bundle the standard installer for the correct Python
version with their program, because its installation would overwrite
an existing Python installation, thus breaking some *other* Python apps
that the user might already have installed.

(There's a solution for app builders who are willing to do a lot of
work -- you can change the registry key resource in the DLL.  For
example, Alice comes with its own version of Python 1.5.1 and it uses
"1.5.1-alice" as its registry key.  The Alice installer installs
Python in a subdirectory of the Alice installation directory and
points the 1.5.1-alice registry entries there.  The problem is that
this is a lot of work for the average app builder.)
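
To illustrate the key collision, here is a small sketch.  It only builds
key names and does not touch a real registry; the base path
Software\Python\PythonCore and the helper name registry_key are
assumptions made for the example.

```python
def registry_key(version, vendor_tag=None):
    """Return the registry key name an installation would read its paths from."""
    base = r"Software\Python\PythonCore"
    # A vendor tag such as "alice" gives a bundled interpreter a private key.
    name = version if vendor_tag is None else "%s-%s" % (version, vendor_tag)
    return base + "\\" + name

# One shared key for every 1.5.x release: the last installer wins.
shared = {registry_key("1.5") for _release in ("1.5", "1.5.1", "1.5.2")}

# One key per exact release, plus a private key for a bundled copy.
separate = {registry_key(v) for v in ("1.5", "1.5.1", "1.5.2")}
bundled = registry_key("1.5.1", vendor_tag="alice")
```

With the shared scheme the three releases collapse onto a single key;
with per-release names each installation keeps its own entries.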

I thought a bit about how VB solves this.  I think that when you wrap
up a VB app, all the support code (mostly a big DLL) is wrapped
with it.  When the user runs the installer, the DLL is installed
(probably in the WINDOWS directory).  If a user installs several VB
apps built with the same VB version, they all attempt to install the
exact same DLL; of course the installers notice this and optimize it
away, keeping a reference count.  (Ignoring for now the fact that
those reference counts don't always work!)  If an app built with a
different VB version is installed, it has a DLL with a different name,
and that is installed separately.  Other support files, I presume, are
dealt with in much the same way.  Voila, there's the theory.
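
The reference-counting theory can be sketched in a few lines.  This is a
toy model with made-up names (SharedFiles, install, uninstall), not how
any real installer stores its counts.

```python
class SharedFiles:
    """Toy model of an installer's reference counts for shared support files."""

    def __init__(self):
        self.refcounts = {}

    def install(self, name):
        """Register one more user of a file; True means it must be copied."""
        first_user = name not in self.refcounts
        self.refcounts[name] = self.refcounts.get(name, 0) + 1
        return first_user

    def uninstall(self, name):
        """Drop one user of a file; True means the file may now be deleted."""
        self.refcounts[name] -= 1
        if self.refcounts[name] == 0:
            del self.refcounts[name]
            return True
        return False

# Usage: two apps built with the same runtime, one built with a newer one.
system = SharedFiles()
copied = [system.install("runtime5.dll"),   # first app: copy the DLL
          system.install("runtime5.dll"),   # second app: already there
          system.install("runtime6.dll")]   # other version: separate file
removed_after_one_uninstall = system.uninstall("runtime5.dll")
```

The scheme relies on the file name encoding the version, which is the
property a shared Python DLL and library archive would need as well.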

How can we do something similar for Python?

An app written in Python should need to install only three or four
files:

- a driver EXE to start the app
- a copy of the Python DLL
- the Python library in an archive
- the app code in an archive

The latter two could be combined into a single archive, but I propose
that we use two archives so that the DLL and the Python library
archive can be shared between installations of independent Python apps
as long as they use the exact same Python version and don't need
additional 3rd party packages.  (I believe that Jim A's proposal
combines the archives with the EXE and the DLL, reducing the number of
files to two.  That's fine too.)
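
A sketch of what the driver would do with the two archives.  It assumes
an interpreter that can import directly from zip files on sys.path
(later Python versions can); the archive and module names are invented
for the example, and the archives hold .py source rather than PYC files
to keep it short.

```python
import importlib
import os
import sys
import tempfile
import zipfile

def make_archive(path, modules):
    """Write {module_name: source} into a zip archive at path."""
    with zipfile.ZipFile(path, "w") as archive:
        for name, source in modules.items():
            archive.writestr(name + ".py", source)

def run_app(lib_archive, app_archive, main_module):
    """Driver logic: put both archives on sys.path, then import the app."""
    sys.path[:0] = [app_archive, lib_archive]
    importlib.invalidate_caches()
    return importlib.import_module(main_module)

# Usage: a shareable "library" archive and a separate per-app archive.
work = tempfile.mkdtemp()
lib_zip = os.path.join(work, "pylib-demo.zip")
app_zip = os.path.join(work, "app-demo.zip")
make_archive(lib_zip, {
    "demo_shared_lib": "def greet():\n    return 'hello from the library archive'\n",
})
make_archive(app_zip, {
    "demo_app_main": "import demo_shared_lib\nmessage = demo_shared_lib.greet()\n",
})
app = run_app(lib_zip, app_zip, "demo_app_main")
```

Keeping the library archive separate means a second app built against
the same Python version could point at the same pylib file.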

Is there a use for the registry here at all?  Maybe not.  (I notice
that VB seems to have a single registry entry, pointing to a DLL; all
other VB files also seem to live there.)

Complications:

- Some apps may need a custom extension module, which has to be
  installed as a PYD file.  So it seems that there needs to be a
  directory per app, and perhaps per version of the app (if the app
  distributor cares).

- Some apps need other, non-pyc files (e.g. data tables or help
  files); it would be handy if these could be stored in the archives as
  well.

- Some standard extension modules are in their own PYD files; these
  also need to be installed.  They aren't typically marked with a
  version, so perhaps a path directory per version of Python (if not per
  installed app) is wise.

- How to distribute an app that needs 3rd party stuff, e.g. Tcl/Tk, or
  PIL, or NumPy?  Their Python code can easily be wrapped up in another
  archive with a standard name incorporating a version number; but the
  required PYD and DLL files are a separate story.  (E.g. for Tkinter,
  you need _tkinter.pyd which links against tcl80.dll.)  Basically the
  same solution as for standard PYD files can work; the needed DLL files
  can be installed either systemwide (if they have a reliable version
  number in their name, like tcl80.dll) or in the per-app or per-package
  directory (like NumPy).

- Presumably, the archives will contain PYC files only.  This means
  that tracebacks will not show source code, only line numbers.  For Jim
  A, this is probably exactly what he wants (if the user gets a
  traceback, his "robust app" has miserably failed, and he takes it in
  pride that this doesn't happen).  But for some others, access to the
  sources could be essential.

  For example, I might want to distribute IDLE using this mechanism;
  users of IDLE who are curious about the standard library (or about
  IDLE itself) should be able to open the source for an arbitrary module
  (and maybe even edit it, although that's not a priority and perhaps
  should even be discouraged).  Library source access is an important
  feature of the IDLE debugger as well.

  A way out for IDLE is to install a classic distribution of the Python
  library sources, into the filesystem at an IDLE specific location.
  Other apps, with only the need for source code in tracebacks, might
  choose to have the PY files in the archives sitting next to the PYC
  files, and somehow the traceback mechanism should be accessing the
  archive to get a hold of the source.
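
As a sketch of the last point: if the PY files sit in the archive, the
source text can be fetched back out by module name.  This uses the
zipimport module from later Python versions; the archive and module
names are hypothetical.

```python
import os
import tempfile
import zipfile
import zipimport

module_source = "def fail():\n    raise ValueError('boom')\n"

# Build an archive that carries the source of one module.
archive_path = os.path.join(tempfile.mkdtemp(), "app-src-demo.zip")
with zipfile.ZipFile(archive_path, "w") as archive:
    archive.writestr("demo_traced.py", module_source)

# Ask the archive importer for the source text of that module.
importer = zipimport.zipimporter(archive_path)
recovered = importer.get_source("demo_traced")

# The line a traceback would want to display for "line 2".
line_2 = recovered.splitlines()[1]
```

A traceback printer that knows which importer loaded a module can use
the same call to show source lines instead of bare line numbers.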

And yes, I realize that Jim A's latest offering solves most of these
problems to a large extent -- well done.  (Jim, would you care to
comment on the issues that you don't address?  Will you address them
in a future version?)

Final notes:

There are two different problems here.  One is how to distribute
Python apps robustly to end users who don't particularly care about
Python.  This is Jim A's problem (and he has a solution that works for
him).  In general the solutions here try to isolate the installed app
from other Python installations.  I'm proposing that at least the DLL
and the Python library archive can probably be shared between apps
without reducing robustness if we keep track more carefully of version
numbers.

The other problem is how to distribute packages of Python and
extension modules for use by Python users.  These typically need to
drop into some existing Python installation.  This is Paul Dubois'
problem with NumPy (amongst others) and is the current focus of the
distutil SIG.

However I believe that there could be a lot of common infrastructure
that would help us create better solutions for both problems.  For
package distribution, common infrastructure (a.k.a. standards) is
essential.  For app distribution, common infrastructure isn't so
important (since the solutions strive for total isolation, there's no
problem if different apps use different solutions).  However, this changes when
app creators want to distribute robust self-sufficient apps that use
3rd party packages -- then the 3rd party packages must allow being
packaged up using the app distribution creator of choice.

Solving this compound problem (creating package distributions that can
be redistributed easily as part of robust Python app distributions)
should be an important goal for the infrastructure we're building
here.  The Big Import Rewrite ought to add this to its list of
objectives if it isn't already on it.  My guess is that the solution
for this compound problem will increase the dependency of app
distribution tools on the package distribution infrastructure; which
to me seems like a Good Thing because it would lead to more code
sharing.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim at interet.com  Thu Dec  9 17:24:40 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 09 Dec 1999 11:24:40 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000701bf4210$10925a40$36a2143f@tim>
Message-ID: <384FD7C8.12832BF1@interet.com>

Tim Peters wrote:

> Something just occurred to me:  MS's guidelines aren't arbitrary, they
> actually have very good reasons.  In the case of putting all an app's
> crucial info in the Registry, it's the only way to allow a site
> administrator to set policy and site options remotely (an admin can fiddle
> other machines' registries remotely).  This works very well indeed when
> there's only "one copy" of an app on a machine (or at most one copy "per
> user").

The registry is still a bad idea because it lumps critical and app data
into single files and brings up the ugly problem of protecting
individual registry entries instead of just files.  Microsoft
should have put all app config into the app directory and provided
for remote admin of that.  But that is not really your point (just
ranting about the registry again).

> IOW, the three of us find getting path info out of the registry intolerable
> because we are in fact trying to do the opposite of what the registry
> mechanism was *designed* for:  we want perfect isolation, not perfect
> sharing.
> 
> This has come up on Python-Help a few times too, in the guise of someone
> installing a product that in turn installs an older version of Python, which
> in turn confuses another product that relies on features in a newer version
> of Python.

Or, in other words, no isolation is possible if critical info
depends on global data like PYTHONPATH or a _common_ registry
entry.  We could have different registry entries, but this is
confusing and not documented.

I think we can solve this with archive files in a way compatible
with Unix without going off on a Windows-only wavelength.  If the
archive file contains everything, and it is in the dir of the app,
and the app looks there and finds it, then it Just Works.

See also my reply to Skip.

JimA


From akuchlin at mems-exchange.org  Thu Dec  9 17:32:08 1999
From: akuchlin at mems-exchange.org (Andrew M. Kuchling)
Date: Thu, 9 Dec 1999 11:32:08 -0500 (EST)
Subject: [Python-Dev] Missing POSIX functions: the list
Message-ID: <199912091632.LAA09236@amarok.cnri.reston.va.us>

After poking around in the O'Reilly POSIX book, here's a list of POSIX
functions that don't seem to be available in Python.  Not all of them
seem worth supporting.   Ironically, Greg Ward's daemonize() Perl
subroutine, which started me on this, doesn't actually seem to need
anything that Python doesn't have.

I'm looking for corrections to the list; are there other POSIX
functions I've missed, or are some of them actually in Python?

I think implementing most of these functions is straightforward, with
the exception of opendir/readdir/closedir.

Worth adding?
=============
opendir(), readdir(), closedir() -- 
	   most of their functionality is available through
	   os.listdir(), but it might be useful to have a direct
	   interface.  Downside is that this would require a new
	   extension type for the C DIR struct.  My (lazy) inclination
	   is to not bother.
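
For comparison, a sketch of the iterator-style interface that later
Python versions provide as os.scandir() (3.6 or newer is assumed here
for the "with" form).  The helper name list_with_types is made up.

```python
import os
import tempfile

def list_with_types(path):
    """Return sorted (name, is_directory) pairs for the entries in path."""
    # The scandir iterator plays the role of opendir/readdir/closedir:
    # entries are read one at a time and the handle is closed on exit.
    with os.scandir(path) as entries:
        return sorted((entry.name, entry.is_dir()) for entry in entries)

# Usage: a scratch directory holding one file and one subdirectory.
scratch = tempfile.mkdtemp()
os.mkdir(os.path.join(scratch, "subdir"))
with open(os.path.join(scratch, "notes.txt"), "w") as f:
    f.write("hello\n")
listing = list_with_types(scratch)
```

os.listdir() would return only the names, so the direct interface mainly
pays off when the caller also needs per-entry type information.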

Worth adding:
=============

abort() -- used in Py_FatalError(), but not accessible to Python code

ctermid(), ctermid_r() -- returns the terminal pathname 
	   -- probably just add ctermid(), but use ctermid_r() for
	   thread-safety
            
fpathconf(fd, name) -- Get configuration limit for a file
	    -- would need constants from unistd.h

getlogin() -- returns user's login name
	 -- could do something similar with pwd.getpwuid( os.getuid() )[0], but
	 getlogin() apparently looks in utmp

getgroups(gidsetsize, grouplist) -- Gets supplementary group IDs

pathconf(path, name) -- Gets config variables for a path
	    -- would need constants from unistd.h

sysconf(int name) -- Gets system configuration information
	    -- would need constants from unistd.h
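
A sketch of how several of these calls look from Python code.  It
assumes a Unix system and a later Python version in which the os module
exposes them with string names for the unistd.h constants; the helper
names login_name and posix_report are invented for the example.

```python
import os
import pwd

def login_name():
    """Prefer getlogin(); fall back to the password database entry."""
    try:
        return os.getlogin()           # may fail without a controlling terminal
    except OSError:
        pass
    try:
        return pwd.getpwuid(os.getuid())[0]
    except KeyError:
        return None                    # uid has no passwd entry

def posix_report(path="/"):
    """Collect a few values from the calls listed above."""
    return {
        "terminal": os.ctermid(),                        # e.g. /dev/tty
        "groups": os.getgroups(),                        # supplementary group IDs
        "name_max": os.pathconf(path, "PC_NAME_MAX"),    # longest file name
        "page_size": os.sysconf("SC_PAGE_SIZE"),         # bytes per memory page
        "login": login_name(),
    }

report = posix_report()
```

Passing the configuration names as strings avoids having to export every
unistd.h constant as a module-level integer.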

Not worth adding:
=================
clearerr() -- looks like fileobjects call clearerr() before raising errors

cuserid() -- returns user's login name
	  -- ORA book says "Do not use this function" -- removed in 1990 POSIX

difftime
	  -- seems only required in C "because no addition properties
	  are defined for time_t" (Solaris man page)

tmpfile(), tmpnam() -- Create temp file, generate temp filename
		    -- Similar functionality available in tempfile.py

mblen(), mbstowcs(), mbtowc(), wcstombs(),  wctomb()
	 -- Multi-byte character functions: 
	 -- Don't bother; wait for the Unicode type.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
I'm sorry I became abusive just now ... calling you worms... I was just
speaking relatively, you understand.
    -- Dekko, in ZOT! #3


From jcw at equi4.com  Thu Dec  9 17:38:13 1999
From: jcw at equi4.com (Jean-Claude Wippler)
Date: Thu, 09 Dec 1999 17:38:13 +0100
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com>
Message-ID: <384FDAF5.C25C447C@equi4.com>

"James C. Ahlstrom" wrote:

[...]
>     ftp://ftp.interet.com/pub/pylib.html

Ouch - what's wrong with zip archives?

There are utilities to convert to/from zip, to re-pack, to mount zip
transparently so its entries look like regular files, FTP servers, etc.

Both Java (jar) and Tcl (Jan Nijtman's "Wrap") have adopted this format.

Zips would seem natural with JPython.  And suppose that scripting ever
starts to consolidate to a common scripting kernel (yah, well), do you
really want a system which is closing all doors to cross-fertilization?

Zip has an advantage over .tar.gz in that its table of contents is
available without having to decompress the whole kaboodle.

Your format has no checksum, which for deployment and long-term storage
can be important.
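
Both points can be seen with the zipfile module from later Python
versions.  A small sketch; the member name lib/hello.py is made up.

```python
import io
import zipfile
import zlib

payload = b"print('hello')\n" * 100

# Build a compressed archive in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("lib/hello.py", payload)

with zipfile.ZipFile(buffer) as archive:
    # The table of contents comes from the central directory at the end
    # of the file; no member data is decompressed to produce it.
    toc = [(info.filename, info.file_size, info.CRC)
           for info in archive.infolist()]
    # testzip() reads every member and checks its stored CRC-32;
    # it returns None when all members are intact.
    bad_member = archive.testzip()
```

The stored CRC is the plain CRC-32 of the uncompressed data, so it can
also be checked independently with zlib.crc32().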

If you want a marshalled TOC, then why not add a manifest entry for it,
sort of like what ranlib does with ar?

You designed the format so archives can be concatenated without any tool
(other than "cat"), but this works just as well with zip files, as the
Tcl Wrap approach demonstrates.
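
A sketch of why concatenation works: a zip reader locates the archive
from the end of the file, so arbitrary bytes (for instance a driver
executable) can be placed in front.  The stub bytes and member name
below are invented, and Python's zipfile module stands in for the
reader.

```python
import io
import zipfile

# An ordinary archive holding the application code.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("app_main.py", "print('started')\n")

# Simulate "cat driver archive > combined" with a fake driver stub.
stub = b"#!fake-driver-executable\n" * 10
combined = io.BytesIO(stub + buffer.getvalue())

# The combined file still opens as a zip archive.
with zipfile.ZipFile(combined) as archive:
    names = archive.namelist()
    source = archive.read("app_main.py")
```

This is the property that lets a single file serve as both the
executable and the archive it reads its code from.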

Allow me to very, very loosely paraphrase Guido here: sure, everyone can
design an archive format, but they are likely to make the same mistakes
all over again - so why not adopt a format which is tried and tested?

With all due respect - I sincerely hope you will reconsider and alter
your code to work with zip files.  It's probably a small adjustment?

Unless your *intent* is to create a diverging standard, of course...

-- Jean-Claude


From jim at interet.com  Thu Dec  9 17:46:35 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 09 Dec 1999 11:46:35 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <199912081707.MAA04242@eric.cnri.reston.va.us>
		<000701bf4210$10925a40$36a2143f@tim> <14415.23676.775163.786028@dolphin.mojam.com>
Message-ID: <384FDCEB.2226C1C1@interet.com>

Skip Montanaro wrote:

> MS believes that Python is the application.  Those of us writing
> Python programs view those programs as the applications, not the Python
> interpreter per se.

I think this is a good point.  Windows app programmers (mostly)
view Python as part of their app and try to install it in their
app directory.  Unix installs Python as a system app in multiple
versions and users use PATH to pick a version.  Unix users view
the Python interpreter as a system service which is needed for
running their app.

I think this is because a Windows app is a visual program,
and the Python release compiles to a console app (not really
a visual program).  So all (?most) Windows Python apps are custom
mains with Python as a component, but the stock python.exe is not
the main.
This makes it difficult to document a way to install Python
in the Unix fashion, since all apps need their own binary main
and python15.dll is the only thing in common.

IMHO archive files can solve this a lot more simply.

JimA


From guido at CNRI.Reston.VA.US  Thu Dec  9 17:55:40 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 11:55:40 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Thu, 09 Dec 1999 17:38:13 +0100."
             <384FDAF5.C25C447C@equi4.com> 
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com>  
            <384FDAF5.C25C447C@equi4.com> 
Message-ID: <199912091655.LAA05928@eric.cnri.reston.va.us>

> "James C. Ahlstrom" wrote:
> 
> [...]
> >     ftp://ftp.interet.com/pub/pylib.html

Jean-Claude Wippler replied:

> Ouch - what's wrong with zip archives?
> 
> There are utilities to convert to/from zip, to re-pack, to mount zip
> transparently so its entries look like regular files, FTP servers, etc.
> 
> Both Java (jar) and Tcl (Jan Nijtman's "Wrap") have adopted this format.
> 
> Zips would seem natural with JPython.  And suppose that scripting ever
> starts to consolidate to a common scripting kernel (yah, well), do you
> really want a system which is closing all doors to cross-fertilization?
> 
> Zip has an advantage over .tar.gz in that its table of contents is
> available without having to decompress the whole kaboodle.
> 
> Your format has no checksum, which for deployment and long-term storage
> can be important.
> 
> If you want a marshalled TOC, then why not add a manifest entry for it,
> sort of like what ranlib does with ar?
> 
> You designed the format so archives can be concatenated without any tool
> (other than "cat"), but this works just as well with zip files, as the
> Tcl Wrap approach demonstrates.
> 
> Allow me to very, very loosely paraphrase Guido here: sure, everyone can
> design an archive format, but they are likely to make the same mistakes
> all over again - so why not adopt a format which is tried and tested?
> 
> With all due respect - I sincerely hope you will reconsider and alter
> your code to work with zip files.  It's probably a small adjustment?
> 
> Unless your *intent* is to create a diverging standard, of course...

Exactly my sentiments.  We have rough Python code to deal with zip
files; it's very rough because we got kind of carried away adding
features and ended up with spaghetti code :-(  But it's working code
nevertheless and we're offering it up for anyone in this group to
clean up (we could do that ourselves but it's not high on our current
priority list).

I don't know anything about Tcl Wrap.  I do know a great deal about
the ZIP format, but apparently I missed the concatenation feature.
How does this work?  Does that work for all zip tools, or just for the
ZIP reader in Wrap?  (I looked up how Jim A does it -- his central
directory at the end of the file contains the total size of the data
covered by that directory, so he seeks back to the beginning of it and
sees if another magic number precedes it; and so on.  Very simple.)

I quickly looked at the Wrap page; it shows how to access data files
stored in the archive.  Question: does the wrap::open code go out to
the regular filesystem if it finds there's no wrap archive?  That
would be handy so you can test the code in its unwrapped form without
change.  Python needs this too.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gward at cnri.reston.va.us  Thu Dec  9 18:12:00 1999
From: gward at cnri.reston.va.us (Greg Ward)
Date: Thu, 9 Dec 1999 12:12:00 -0500
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Thu, Dec 09, 1999 at 11:32:08AM -0500
References: <199912091632.LAA09236@amarok.cnri.reston.va.us>
Message-ID: <19991209121159.B20179@cnri.reston.va.us>

On 09 December 1999, Andrew M. Kuchling said:
> After poking around in the O'Reilly POSIX book, here's a list of POSIX
> functions that don't seem to be available in Python.  Not all of them
> seem worth supporting.   Ironically, Greg Ward's daemonize() Perl
> subroutine, which started me on this, doesn't actually seem to need
> anything that Python doesn't have.

I think I already pointed this your way, but don't forget the man page
for Perl's POSIX module: "perldoc POSIX".  I suspect POSIX functions
that don't make sense in Perl also don't make sense in Python.

I agree with all your assessments about what's worth adding and what's
not, and that {close,read,open}dir() are questionable and probably not
worth the bother.  Random thoughts:

> abort() -- used in Py_FatalError(), but not accessible to Python code

Would this do the same as in C, ie. terminate the process and dump core?

> getlogin() -- returns user's login name
> 	 -- could do something similar with pwd.getpwuid( os.getuid() )[0], but
> 	 getlogin() apparently looks in utmp

With a documentation proviso that utmp is very old-fashioned, and you
really should do the getuid() thing unless you definitely want to get
the login ID from utmp.  Perhaps an alternate "getlogin" (different
name?) that does the getuid() thing could be provided.

        Greg


From guido at CNRI.Reston.VA.US  Thu Dec  9 18:16:03 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 12:16:03 -0500
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: Your message of "Thu, 09 Dec 1999 12:12:00 EST."
             <19991209121159.B20179@cnri.reston.va.us> 
References: <199912091632.LAA09236@amarok.cnri.reston.va.us>  
            <19991209121159.B20179@cnri.reston.va.us> 
Message-ID: <199912091716.MAA06063@eric.cnri.reston.va.us>

> > getlogin() -- returns user's login name
> > 	 -- could do something similar with pwd.getpwuid( os.getuid() )[0], but
> > 	 getlogin() apparently looks in utmp
> 
> With a documentation proviso that utmp is very old-fashioned, and you
> really should do the getuid() thing unless you definitely want to get
> the login ID from utmp.  Perhaps an alternate "getlogin" (different
> name?) that does the getuid() thing could be provided.

There's the getpass module which has a getuser() function that looks
in various env vars and if all else fails uses getuid() and pwd.

If the goal is to get the user ID without being fooled, using
os.getuid() or os.geteuid() directly seems to be the right thing to
do; I don't see the need for a shorthand for
pwd.getpwuid(os.getuid())[0] (which is what getuser() uses).
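
A minimal sketch of the two approaches discussed above, assuming a Unix
system where the pwd module is available.  The function names are
hypothetical and chosen only for illustration.

```python
import getpass
import os
import pwd  # Unix only


def login_from_uid():
    """Map the real user ID to a login name via the password database.

    Returns None if the UID has no password-database entry.
    """
    try:
        return pwd.getpwuid(os.getuid())[0]
    except KeyError:
        return None


def login_from_environment():
    """getpass.getuser() checks environment variables such as LOGNAME first
    and only falls back to the password database when none is set, so the
    result can be influenced by the caller's environment."""
    return getpass.getuser()
```

Because the environment is consulted first, login_from_environment() can
be steered by setting LOGNAME, whereas login_from_uid() depends only on
the process's real UID.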

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido at CNRI.Reston.VA.US  Thu Dec  9 18:18:10 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 12:18:10 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Thu, 09 Dec 1999 10:02:18 EST."
             <384FC47A.BB4DA517@interet.com> 
References: <000301bf4206$b39e5b80$36a2143f@tim>  
            <384FC47A.BB4DA517@interet.com> 
Message-ID: <199912091718.MAA06087@eric.cnri.reston.va.us>

[Jim A]
> Lately I have been more active in distribution via archive files.
> Part of the solution is an archive file format which is identical on
> Unix and Windows, and which can hold the Python library and packages
> as single files.  For my own efforts on this see:
> 
>     ftp://ftp.interet.com/pub/pylib.html

Apart from agreeing with Jean-Claude's rant about inventing a new
archive format, I think this is a good proposal because it is very
clear about the problem it tries to solve and doesn't get distracted
by other issues.  I also commend Jim for building upon Greg Stein's
imputil (like Gordon did).  I wish I could present a solution this
simple as The Standard Way, but (as explained in my long post earlier
today) there just are so many wrinkles that I'd rather hold out for
the Right Solution...  But I've taken good notice of Jim's solution.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From beazley at cs.uchicago.edu  Thu Dec  9 18:16:57 1999
From: beazley at cs.uchicago.edu (David Beazley)
Date: Thu, 9 Dec 1999 11:16:57 -0600 (CST)
Subject: [Python-Dev] Missing POSIX functions: the list
References: <199912091632.LAA09236@amarok.cnri.reston.va.us>
	<19991209121159.B20179@cnri.reston.va.us>
Message-ID: <199912091716.LAA15624@gargoyle.cs.uchicago.edu>

Greg Ward writes:
> 
> I think I already pointed this your way, but don't forget the man page
> for Perl's POSIX module: "perldoc POSIX".  I suspect POSIX functions
> that don't make sense in Perl also don't make sense in Python.
> 
> I agree with all your assessments about what's worth adding and what's
> not, and that {close,read,open}dir() are questionable and probably not
> worth the bother.  Random thoughts:
> 

I disagree.  I think that the POSIX module should strive to be as
complete as possible--even if certain functions are closely related to
other functionality in the library (tmpfile for instance).  I suspect
that this sort of thing is probably the cause of the missing
functionality in the current library (as in, "why would anyone want to
do that?" when in fact there may be a perfectly good reason in certain
situations).  

> > abort() -- used in Py_FatalError(), but not accessible to Python code
> 
> Would this do the same as in C, ie. terminate the process and dump core?
> 

Sure, why not?  This might be a useful thing to do every so
often---when trying to figure out what's wrong with a C extension
module for instance.

Cheers,

Dave


From jim at interet.com  Thu Dec  9 18:43:57 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 09 Dec 1999 12:43:57 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com>
Message-ID: <384FEA5D.A07F23EC@interet.com>

Jean-Claude Wippler wrote:

> Ouch - what's wrong with zip archives?

Thanks very much for looking over the format.

In general Zip archives store whole branches of a file
system.  A Python ./Lib zip archive would contain:

  N:/python/Python-1.5.2/Lib/string.pyc
  N:/python/Python-1.5.2/Lib/os.pyc
  N:/python/Python-1.5.2/Lib/copy.pyc
  N:/python/Python-1.5.2/Lib/test/testall.pyc

Zip archives are isomorphic to branches of a file system.
That means there must be a sys.path for each zip archive file.
How would this be specified?

The archive format stores modules as dotted names, just as they
appear in the import statement.  The search path is "." in every
archive file by definition.  The import statement "import foo"
just results in a dictionary lookup for key "foo", not a search
through a zip directory along a local search path for "foo.something"
where "something" can be pyc, pyo, py, etc.

The intent was to link the archives to the import statement, not
re-create a directory tree.  It borrowed this feature from
the archive formats of Greg and Gordon.
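
The dictionary-lookup idea can be sketched roughly as follows.  This is
not the pylib code; the `archive` dictionary and the load_from_archive()
helper are hypothetical stand-ins for a table of contents keyed by
dotted module name.

```python
import marshal
import types

# Hypothetical in-memory "archive": dotted module names map to
# marshalled code objects, much as a .pyc file stores them.
archive = {
    "foo": marshal.dumps(compile("ANSWER = 42", "<archive:foo>", "exec")),
    "pkg.bar": marshal.dumps(compile("NAME = 'bar'", "<archive:pkg.bar>", "exec")),
}


def load_from_archive(dotted_name):
    """Return a module built from the archive, or None if it is not there."""
    blob = archive.get(dotted_name)   # one dictionary lookup, no path search
    if blob is None:
        return None                   # "not mine": another importer may try
    module = types.ModuleType(dotted_name)
    exec(marshal.loads(blob), module.__dict__)
    return module
```

The key is exactly the name used in the import statement, so no file
extension or directory has to be guessed.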

> There are utilities to convert to/from zip, to re-pack, to mount zip
> transparently so its entries look like regular files, FTP servers, etc.

Basic operations (to, from, repack) are easy in Python.

> Both Java (jar) and Tcl (Jan Nijtman's "Wrap") have adopted this format.

Hmmm....
 
> Your format has no checksum, which for deployment and long-term storage
> can be important.

Actually the pylib.py "dir()" method reads all *.pyc with marshal,
and I am depending on marshal to object to bad data and also
out-of-date magic numbers.  But this is a good point.

> If you want a marshalled TOC, then why not add a manifest entry for it,
> sort of like what ranlib does with ar?

Sorry, I don't understand.  Please explain.

> You designed the format so archives can be concatenated without any tool
> (other than "cat"), but this works just as well with zip files, as the
> Tcl Wrap approach demonstrates.

Are you saying that cat zip1.zip zip2.zip > myzip.zip works?

An important feature is the ability to concatenate to a binary:
  cat python.exe zip1.zip > myapp.exe
Searching for this isn't fast unless magic numbers are at the
end.  Are zip files recognizable from the end (I don't know)?
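
For what it is worth, a zip file ends with an "end of central directory"
record whose signature is the four bytes PK\x05\x06, so it can be
located from the end.  The sketch below uses today's zipfile module
(which did not exist when this was written); the stub bytes merely stand
in for python.exe, and find_eocd() is a hypothetical helper.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"  # end-of-central-directory record
EOCD_SIZE = 22                  # fixed part; an archive comment may follow
MAX_COMMENT = 65535             # the comment length is a 16-bit field


def make_zip(files):
    """Return the bytes of a zip archive holding the given name->text map."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def find_eocd(blob):
    """Search backwards from the end for the EOCD record; -1 if absent."""
    start = max(0, len(blob) - EOCD_SIZE - MAX_COMMENT)
    return blob.rfind(EOCD_SIGNATURE, start)


# Equivalent of: cat python.exe zip1.zip > myapp.exe
stub = b"\x00pretend this is python.exe\x00" * 100
combined = stub + make_zip({"string.py": "x = 1\n"})

eocd = find_eocd(combined)
# The last two bytes of the fixed record hold the comment length.
(comment_len,) = struct.unpack("<H", combined[eocd + 20:eocd + 22])

# zipfile tolerates the prepended stub and still finds the members.
names = zipfile.ZipFile(io.BytesIO(combined)).namelist()
```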

> Allow me to very, very loosely paraphrase Guido here: sure, everyone can
> design an archive format, but they are likely to make the same mistakes
> all over again - so why not adopt a format which is tried and tested?
> 
> With all due respect - I sincerely hope you will reconsider and alter
> your code to work with zip files.  It's probably a small adjustment?
> 
> Unless your *intent* is to create a diverging standard, of course...

The intent is to create a standard but not a diverging standard.

Are there any zip experts out there?  Can zip files satisfy all the
design requirements I listed in pylib.html?  Is there zip code
available?  All my code is in Python.

JimA


From jcw at equi4.com  Thu Dec  9 18:57:33 1999
From: jcw at equi4.com (Jean-Claude Wippler)
Date: Thu, 09 Dec 1999 18:57:33 +0100
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com>  
	            <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us>
Message-ID: <384FED8D.3C535D38@equi4.com>

Guido van Rossum wrote:
> 
> [... my not-really-meant-as-rant about adopting zip as format ...]
>
[zip concatenation feature]

> How does this work?  Does that work for all zip tools, or just for the
> ZIP reader in Wrap?  (I looked up how Jim A does it -- his central
> directory at the end of the file contains the total size of the data
> covered by that directory, so he seeks back to the beginning of it and
> sees if another magic number precedes it; and so on.  Very simple.)

Same for Wrap.  Standard tools would not see the preceding ZIP groups.

In terms of maintenance, I'd avoid this trick.  I merely wanted to point
out that zip archives can be stacked, if the reader is set up to it.

> Question: does the wrap::open code go out to the regular filesystem
> if it finds there's no wrap archive?  That would be handy so you can
> test the code in its unwrapped form without change.

IIRC, Wrap overrides "open" for embedded entries as "file.zip/abc.py".
There's more being developed in this area: a "virtual file system" which
lets you mount archives and such (VFS by Matt Newman, mentioned with his
permission), so that the file-system model can be extended to navigate
into a lot more things than real file systems.

Andrew Kuchling's post hints at another tangent: opendir/readdir is of
course simply an enumeration.  There's a lot of "genericity" lurking in
scanning across file systems, trees, networks, and resources in general.

<minirant> The filesystem <-> OO dichotomy needs a review. </minirant>

> Python needs this too.

<voice location=in-the-desert level=timid>
Concepts like these have a lot to offer - and would make even more sense
if they were done in a way which benefits multiple scripting languages.
Feel free to reply by email if you ever want to further discuss this.
</voice>

-- Jean-Claude


From fdrake at acm.org  Thu Dec  9 19:10:44 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 9 Dec 1999 13:10:44 -0500 (EST)
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us>
References: <199912091632.LAA09236@amarok.cnri.reston.va.us>
Message-ID: <14415.61604.415084.520092@weyr.cnri.reston.va.us>

Andrew M. Kuchling writes:
 > After poking around in the O'Reilly POSIX book, here's a list of POSIX
 > functions that don't seem to be available in Python.  Not all of them
 > seem worth supporting.   Ironically, Greg Ward's daemonize() Perl

  I think your assessment is reasonable.  I looked at posixmodule.c
and note also that the functions use PyArg_Parse() and PyArg_NoArgs()
instead of using PyArg_ParseTuple().  The advantage of
PyArg_ParseTuple() is that the name of the function can be specified
for inclusion in TypeError messages when the arguments are not of the
right type.
  I'm doing some work to correct this now.  I've also added ctermid(), 
and will try to add at least a few more before I check in the changes.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From bwarsaw at cnri.reston.va.us  Thu Dec  9 19:17:35 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Thu, 9 Dec 1999 13:17:35 -0500 (EST)
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim>
	<384FC47A.BB4DA517@interet.com>
	<384FDAF5.C25C447C@equi4.com>
	<199912091655.LAA05928@eric.cnri.reston.va.us>
	<384FED8D.3C535D38@equi4.com>
Message-ID: <14415.62015.856931.750279@anthem.cnri.reston.va.us>

>>>>> "JW" == Jean-Claude Wippler <jcw at equi4.com> writes:

    JW> Same for Wrap.  Standard tools would not see the preceding ZIP
    JW> groups.

    JW> In terms of maintenance, I'd avoid this trick.  I merely
    JW> wanted to point out that zip archives can be stacked, if the
    JW> reader is set up to it.

I agree.  I can't recall the details now, but I had a lot of problems
with zip concatenation in JPython.  I think at least some of the older
Java tools for grokking zips don't work with concatenation.

-Barry


From guido at CNRI.Reston.VA.US  Thu Dec  9 19:21:42 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 13:21:42 -0500
Subject: [Python-Dev] Virtual filesystem APIs
In-Reply-To: Your message of "Thu, 09 Dec 1999 18:57:33 +0100."
             <384FED8D.3C535D38@equi4.com> 
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us>  
            <384FED8D.3C535D38@equi4.com> 
Message-ID: <199912091821.NAA06209@eric.cnri.reston.va.us>

Jean-Claude Wippler:
> There's more being developed in this area: a "virtual file system" which
> lets you mount archives and such (VFS by Matt Newman, mentioned with his
> permission), so that the file-system model can be extended to navigate
> into a lot more things than real file systems.

I agree.  We have experimented with this a bunch in the Knowbot
software, where we have some code that wants to look at a "filesystem"
but could be talking to some kind of filesystem emulation across an
RPC connection or alternatively could be accessing a zip file.  Our
conclusion is that a convenient interface is modeled after (a subset
of) the os and os.path functionality.  In fact, the only thing you
would need to add to the os module would be a function to open a file
object; I've proposed to add os.fopen() as an alias for the built-in
open().
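
A rough sketch of what such an interface might look like, here backed
by a zip file.  This is not the Knowbot code; the ZipFilesystem class
and its method set are assumptions made for illustration, with fopen()
standing in for the proposed os.fopen().

```python
import io
import posixpath
import zipfile


class ZipFilesystem:
    """Hypothetical read-only filesystem view of a zip archive."""

    sep = "/"  # the separator belongs to the filesystem object

    def __init__(self, fileobj):
        self._zip = zipfile.ZipFile(fileobj)

    def join(self, *parts):
        return posixpath.join(*parts)

    def isdir(self, path):
        prefix = path.strip("/")
        if not prefix:
            return True  # the archive root
        return any(n.startswith(prefix + "/") for n in self._zip.namelist())

    def listdir(self, path=""):
        prefix = path.strip("/")
        prefix = prefix + "/" if prefix else ""
        entries = set()
        for name in self._zip.namelist():
            if name.startswith(prefix) and name != prefix:
                # Keep only the first path component below the prefix.
                entries.add(name[len(prefix):].split("/", 1)[0])
        return sorted(entries)

    def fopen(self, path):
        """Return a binary file object for one archive member."""
        return io.BytesIO(self._zip.read(path))


# Build a small archive in memory to exercise the interface.
_buf = io.BytesIO()
with zipfile.ZipFile(_buf, "w") as _zf:
    _zf.writestr("a.txt", "ay")
    _zf.writestr("pkg/b.txt", "bee")
fs = ZipFilesystem(_buf)
```

Code written against listdir(), isdir(), join() and fopen() could then
be handed either this object or a thin wrapper around the real os
module.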

The idea that you could mount one VFS inside another is nice, although
I'm not sure how practical it is.  For one thing, in our fs code,
os.path.sep and friends (e.g. os.path.normcase behavior) were set per
filesystem; what would happen if you mounted a Unix filesystem in an
NT tree?  Doing the translations is hard too; e.g. on a Mac fs, the
separator is ':' and a '/' can be part of a filename -- do you simply
swap them?  What if a Mac file has both '/' and '\'  and you mount it
on a Windows FS?  I'd rather stay away from this.

On the other hand the VFS concept could be used as a totally different
solution to the sys.importers vs. sys.path 

> Andrew Kuchling's post hints at another tangent: opendir/readdir is of
> course simply an enumeration.  There's a lot of "genericity" lurking in
> scanning across file systems, trees, networks, and resources in general.

I'd still rather see listdir() (which our sample virtual FS API
supported).  I don't think it necessarily makes sense to do this on a
more generic basis -- other trees and graphs have sufficiently
different semantics that using a FS like API doesn't necessarily cut
it.  Take for example the Windows registry -- looks a lot like a
filesystem, doesn't it?  Yet it has one fundamental property that a
typical FS doesn't: directory nodes can have data *and* children...

I've written a tree widget and found that it's remarkably hard to come
up with a workable API to talk to trees *in general*.  Trees are a
universal concept, but code sharing is still elusive...  Perhaps
because the concept is so simple?

> <minirant> The filesystem <-> OO dichotomy needs a review. </minirant>

I think that my proposal above should cover this.  (We looked briefly
at doing a similar thing for Java, and found that it's actually harder
there -- they have all these nice objects representing paths, but it's
not easily subclassable to represent paths in some virtual
filesystem.)

> Concepts like these have a lot to offer - and would make even more sense
> if they were done in a way which benefits multiple scripting languages.
> Feel free to reply by email if you ever want to further discuss this.

I see only very little hope for this point of view, but I will refrain
from commenting further.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm at hypernet.com  Thu Dec  9 19:23:14 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Thu, 9 Dec 1999 13:23:14 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <384FEA5D.A07F23EC@interet.com>
Message-ID: <1267359311-37934097@hypernet.com>

James C. Ahlstrom wrote:

> Jean-Claude Wippler wrote:
> 
> > Ouch - what's wrong with zip archives?

> In general Zip archives store whole branches of a file
> system.  

> The archive format stores modules as dotted names, just as they
> appear in the import statement.  The search path is "." in every
> archive file by definition.  The import statement "import foo"
> just results in a dictionary lookup for key "foo", not a search
> through a zip directory along a local search path for
> "foo.something" where "something" can be pyc, pyo, py, etc.
> 
> The intent was to link the archives to the import statement, not
> re-create a directory tree.  It borrowed this feature from the
> archive formats of Greg and Gordon.

As I've stated before, I have 2 archive formats. This may seem 
a needless complication, but my suspicion is that sooner or 
later, people will want 2 different kinds.

One is a .pyz format, which corresponds closely to Jim's .pyl 
format (with a number of minor differences: it's compressed, 
the archive as a whole has the Python magic number, instead 
of each entry, and it's not designed for concatenation).
 
The other is like a zip, and probably should be zip format.  It's 
designed to hold _anything_, and can be manipulated from C 
and from Python. It can be concatenated and / or embedded 
(and the inner one opened without extraction). Its table of 
contents is more file-system like. Importing from one is 
slower, but that's not really what it's for. It's for packaging up 
arbitrary resources. Like .pyz's, or Tcl/Tk for Tkinter apps, or 
configuration files.

Jim is correct that a good importer (which can say "No, it's not 
mine" as quickly as possible) is better satisfied by a simple 
dictionary lookup than fooling with file extensions and 
directories (virtual or real).

> > If you want a marshalled TOC, then why not add a manifest entry
> > for it, sort of like what ranlib does with ar?
> 
> Sorry, I don't understand.  Please explain.

The table of contents is just another entry.
 
> An important feature is the ability to concatenate to a binary:
>   cat python.exe zip1.zip > myapp.exe
> Searching for this isn't fast unless magic numbers are at the
> end.  Are zip files recognizable from the end (I don't know)?

Where do you think we got this idea?

> Are there any zip experts out there?  Can zip files satisfy all
> the design requirements I listed in pylib.html?  Is there zip
> code available?  All my code is in Python.

Hmm. My bookmark appears to be dead (I was there not long 
ago):
http://www.cubic.org/source/archive/fileform/packers/appnote.txt

There have been several references on this list to Guido et al 
having some Python / zip code.


- Gordon


From guido at CNRI.Reston.VA.US  Thu Dec  9 19:23:27 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 13:23:27 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Thu, 09 Dec 1999 13:17:35 EST."
             <14415.62015.856931.750279@anthem.cnri.reston.va.us> 
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us> <384FED8D.3C535D38@equi4.com>  
            <14415.62015.856931.750279@anthem.cnri.reston.va.us> 
Message-ID: <199912091823.NAA06243@eric.cnri.reston.va.us>

> I agree.  I can't recall the details now, but I had a lot of problems
> with zip concatenation in JPython.  I think at least some of the older
> Java tools for grokking zips don't work with concatenation.

The Java "jar" tool mostly ignores the central directory -- it seems
to read the archive from the front, using the local header records,
and ignoring the central directory (of course it writes one when it
creates an archive).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido at CNRI.Reston.VA.US  Thu Dec  9 19:32:15 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 13:32:15 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Thu, 09 Dec 1999 12:43:57 EST."
             <384FEA5D.A07F23EC@interet.com> 
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com>  
            <384FEA5D.A07F23EC@interet.com> 
Message-ID: <199912091832.NAA06287@eric.cnri.reston.va.us>

> In general Zip archives store whole branches of a file
> system.  A Python ./Lib zip archive would contain:
> 
>   N:/python/Python-1.5.2/Lib/string.pyc
>   N:/python/Python-1.5.2/Lib/os.pyc
>   N:/python/Python-1.5.2/Lib/copy.pyc
>   N:/python/Python-1.5.2/Lib/test/testall.pyc
> 
> Zip archives are isomorphic to branches of a file system.
> That means there must be a sys.path for each zip archive file.
> How would this be specified?

Not true.  It's easy (using the proper Zip tools) to create an archive
containing this instead:

  string.pyc
  os.pyc
  copy.pyc
  testall.pyc

Thus the entire archive is considered the directory.  The Java "jar"
tool uses this approach.  It's also easy to have packages in there
(again this is what Java does):

  test/
  test/__init__.pyc
  test/pystone.pyc
  test_support.pyc
  (etc.)

> The archive format stores modules as dotted names, just as they
> appear in the import statement.  The search path is "." in every
> archive file by definition.  The import statement "import foo"
> just results in a dictionary lookup for key "foo", not a search
> through a zip directory along a local search path for "foo.something"
> where "something" can be pyc, pyo, py, etc.
> 
> The intent was to link the archives to the import statement, not
> re-create a directory tree.  It borrowed this feature from
> the archive formats of Greg and Gordon.

Maybe you've gone overboard.  The time it takes to translate the dots
into slashes really isn't the big deal.
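
To illustrate the dots-to-slashes translation, here is a small sketch
using today's zipfile module.  The find_member() helper and its suffix
order are assumptions, not anyone's actual importer.

```python
import io
import zipfile


def build_archive():
    """Create an in-memory archive laid out like the listing above."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("string.py", "NAME = 'string'\n")
        zf.writestr("test/__init__.py", "")
        zf.writestr("test/pystone.py", "LOOPS = 50000\n")
    return zipfile.ZipFile(buf)


def find_member(zf, dotted_name, suffixes=(".pyc", ".pyo", ".py")):
    """Map a dotted module name to an archive member name, or None."""
    base = dotted_name.replace(".", "/")      # the whole "translation"
    names = set(zf.namelist())
    candidates = [base + s for s in suffixes]
    candidates += [base + "/__init__" + s for s in suffixes]  # packages
    for candidate in candidates:
        if candidate in names:
            return candidate
    return None
```

The set of member names plays the role of the dictionary, so the lookup
cost stays small even with several suffixes to try.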

> Are there any zip experts out there?  Can zip files satisfy all the
> design requirements I listed in pylib.html?  Is there zip code
> available?  All my code is in Python.

Yes (all of us here at CNRI), yes, yes (we have the spaghetti code).
While zip files support compression, they support uncompressed files
as well and we could go either way.  Their most popular compression
format is gzip compatible and can be read and written with the zlib
module, which is in the standard Python distribution (even on Windows)
-- though to build it you need the zlib C library which is of course
external (but solid open source).
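
To make the zlib point concrete: a deflated zip member is a raw deflate
stream (the same algorithm gzip uses, without the gzip or zlib header),
so zlib can decode it when given a negative window size.  The sketch
below writes the archive with today's zipfile module and then reads one
member back by hand; the member name and contents are made up.

```python
import io
import struct
import zipfile
import zlib

text = b"spam and eggs " * 50
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("spam.txt", text)
raw = buf.getvalue()

# Use the central directory only to learn where the member starts.
info = zipfile.ZipFile(io.BytesIO(raw)).getinfo("spam.txt")

# Local file header: 30 fixed bytes, then the file name and extra field.
header = raw[info.header_offset:info.header_offset + 30]
name_len, extra_len = struct.unpack("<HH", header[26:30])
start = info.header_offset + 30 + name_len + extra_len
compressed = raw[start:start + info.compress_size]

# Negative wbits tells zlib to expect a headerless (raw) deflate stream.
recovered = zlib.decompressobj(-zlib.MAX_WBITS).decompress(compressed)
```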

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake at acm.org  Thu Dec  9 19:41:22 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 9 Dec 1999 13:41:22 -0500 (EST)
Subject: [Python-Dev] Virtual filesystem APIs
In-Reply-To: <199912091821.NAA06209@eric.cnri.reston.va.us>
References: <000301bf4206$b39e5b80$36a2143f@tim>
	<384FC47A.BB4DA517@interet.com>
	<384FDAF5.C25C447C@equi4.com>
	<199912091655.LAA05928@eric.cnri.reston.va.us>
	<384FED8D.3C535D38@equi4.com>
	<199912091821.NAA06209@eric.cnri.reston.va.us>
Message-ID: <14415.63442.92911.748132@weyr.cnri.reston.va.us>

Guido van Rossum writes:
 > os.path.sep and friends (e.g. os.path.normcase behavior) were set per

  Hah!  Caught you in public!  "sep" & friends are defined in the os
module; this is where the separation breaks down.
  I think these should be located in os.path, and os can just pick
them up from there to be backward compatible.
  os.pathsep is a problem, somewhat; it is related to os.sep, but is
very different in many ways.  I don't think there's a good way to deal 
with it.

 > filesystem; what would happen if you mounted a Unix filesystem in an
 > NT tree?  Doing the translations is hard too; e.g. on a Mac fs, the
 > separator is ':' and a '/' can be part of a filename -- do you simply
 > swap them?  What if a Mac file has both '/' and '\'  and you mount it
 > on a Windows FS?  I'd rather stay away from this.

  And this is tightly related to the sep/pathsep problem as well.  I
agree, we should stay away from it.

 > I think that my proposal above should cover this.  (We looked briefly
 > at doing a similar thing for Java, and found that it's actually harder
 > there -- they have all these nice objects representing paths, but it's
 > not easily subclassable to represent paths in some virtual

  But it was easy to create a set of interfaces with a reasonable API; 
getting back to the "typical" Java classes was what really changed the 
most.
  For those of us not working on the KOE:  I set up Filesystem and
FSFile interfaces; the Filesystem represented the entire filesystem
and the FSFile was very similar to the java.io.File class, but had
additional methods to get input and output stream objects (of the
standard Java flavor); all the buffering and such could be wrapped on
top of that just like any other Java I/O.
  The specific application was to provide access to an isolated
directory structure which untrusted code "owned", but ensured that
parent directories were unreachable.  Additional security checks can
be worked into such a structure as applicable.
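
The confinement idea can be sketched in Python as well.  This is not the
KOE's Java code; ConfinedFilesystem and its methods are hypothetical,
and the check shown is only the basic "never resolve outside the root"
rule.

```python
import os
import tempfile


class ConfinedFilesystem:
    """Hypothetical filesystem object that never reaches above its root."""

    def __init__(self, root):
        self._root = os.path.realpath(root)

    def _resolve(self, path):
        # Resolve "..", symlinks, etc., then verify we are still inside.
        full = os.path.realpath(os.path.join(self._root, path.lstrip("/")))
        if full != self._root and not full.startswith(self._root + os.sep):
            raise PermissionError("path escapes the confined root: %r" % path)
        return full

    def listdir(self, path=""):
        return sorted(os.listdir(self._resolve(path)))

    def open(self, path, mode="r"):
        return open(self._resolve(path), mode)


# Exercise it against a scratch directory.
scratch = tempfile.mkdtemp()
confined = ConfinedFilesystem(scratch)
with confined.open("hello.txt", "w") as f:
    f.write("hi")
```

Further security checks would go into _resolve() or the individual
methods.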


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From fdrake at acm.org  Thu Dec  9 20:06:32 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 9 Dec 1999 14:06:32 -0500 (EST)
Subject: [Python-Dev] posix module test suite
Message-ID: <14415.64952.780974.8124@weyr.cnri.reston.va.us>

  There's not a test for the posix or os modules; if anyone would like 
to contribute one, this would be a good time!  ;-)


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From jcw at equi4.com  Thu Dec  9 21:51:11 1999
From: jcw at equi4.com (Jean-Claude Wippler)
Date: Thu, 09 Dec 1999 21:51:11 +0100
Subject: [Python-Dev] Virtual filesystem APIs
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us>  
	            <384FED8D.3C535D38@equi4.com> <199912091821.NAA06209@eric.cnri.reston.va.us>
Message-ID: <3850163F.80BDCB75@equi4.com>

Guido van Rossum wrote:
>
[... horrors of cross-OS mounts and ":\/" separators ...]

I agree, this has some very hairy sides to it.  But VFS is really more
about mounting non-FS things in a "root" FS (presumably the real one).

> On the other hand the VFS concept could be used as a totally different
> solution to the sys.importers vs. sys.path

Heck, I'll be the "enfant terrible" once more: yes, and this stuff could
well be implemented generically across scripting languages.  Of course
the act of "importing" is a very Pythonic issue - but FS/VFS traversal
and the actual shared library load need not be.  Anyway, enough of that.

> Take for example the Windows registry -- looks a lot like a 
> filesystem, doesn't it?  Yet it has one fundamental property that a
> typical FS doesn't: directory nodes can have data *and* children...

What you're saying is that dir = set-of-subdirs + set-of-files, and that
this is a more general requirement than plain FS's.  Doesn't that simply
mean that the more general model is needed as basis to handle both?
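
A minimal sketch of that more general model, in which any node may carry
data and children at once.  The Node class and its method names are
hypothetical, not an existing API; a plain directory is then just a node
with no data, and a plain file a node with no children.

```python
class Node:
    """A tree node that may hold both data and named children."""

    def __init__(self, data=None):
        self.data = data
        self.children = {}

    def add(self, name, node):
        """Attach a child under the given name and return it."""
        self.children[name] = node
        return node

    def lookup(self, path):
        """Resolve a '/'-separated path relative to this node."""
        node = self
        for part in path.split("/"):
            if part:
                node = node.children[part]
        return node


root = Node()                                  # directory-like: children only
key = root.add("Software", Node(b"default"))   # registry-like: data and children
key.add("Python", Node(b"1.5.2"))              # file-like: data only
```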

> Trees are a universal concept, but code sharing is still elusive...

Ah, but think of the implications: archives, networks, XML, the world!

-- Jean-Claude


From fdrake at acm.org  Thu Dec  9 22:16:00 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 9 Dec 1999 16:16:00 -0500 (EST)
Subject: [Python-Dev] forwarded message from Fred L. Drake
Message-ID: <14416.7184.255000.342231@weyr.cnri.reston.va.us>


  OK, I've checked in some changes to the posix module to add support
for a few of the POSIX interfaces Andrew expressed interest in seeing
(and some he said weren't such a good idea, or at least not necessary,
but about which I decided I disagreed after all).
  For those of you who aren't on the checkins list (??), I've attached 
the message so you'll know what functions were added.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


-------------- next part --------------
An embedded message was scrubbed...
From: "Fred L. Drake" <fdrake at weyr.cnri.reston.va.us>
Subject: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.115,2.116
Date: Thu, 9 Dec 1999 16:13:10 -0500 (EST)
Size: 3800
URL: <http://mail.python.org/pipermail/python-dev/attachments/19991209/ed5f3b37/attachment-0001.eml>

From guido at CNRI.Reston.VA.US  Thu Dec  9 22:19:57 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 16:19:57 -0500
Subject: [Python-Dev] forwarded message from Fred L. Drake
In-Reply-To: Your message of "Thu, 09 Dec 1999 16:16:00 EST."
             <14416.7184.255000.342231@weyr.cnri.reston.va.us> 
References: <14416.7184.255000.342231@weyr.cnri.reston.va.us> 
Message-ID: <199912092119.QAA06731@eric.cnri.reston.va.us>

>   OK, I've checked in some changes to the posix module to add support
> for a few of the POSIX interfaces Andrew expressed interest in seeing
> (and some he said weren't such a good idea, or at least not necessary,
> but about which I decided I disagreed after all).

I wish you'd made your disagreement public before checking it in...
But it's not too late...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin at mems-exchange.org  Thu Dec  9 22:32:26 1999
From: akuchlin at mems-exchange.org (Andrew M. Kuchling)
Date: Thu, 9 Dec 1999 16:32:26 -0500 (EST)
Subject: [Python-Dev] forwarded message from Fred L. Drake
In-Reply-To: <14416.7184.255000.342231@weyr.cnri.reston.va.us>
References: <14416.7184.255000.342231@weyr.cnri.reston.va.us>
Message-ID: <14416.8170.18298.33796@amarok.cnri.reston.va.us>

Fred L. Drake, Jr. writes (in a CVS checkin):
>Added support for abort(), ctermid(), tmpfile(), tempnam(), tmpnam(),
>and TMP_MAX.

For those of you following along, the tmpfile(), tempnam(), tmpnam()
functions were ones I listed as probably not worth adding.  On the
other hand, David Beazley wrote:

>  I think that the POSIX module should strive to be as
>complete as possible--even if certain functions are closely related to
>other functionality in the library (tmpfile for instance).  I suspect

... and that's a good point, too.  The POSIX functions may provide
adaptability that a Python analog doesn't; for example, you could read
/etc/passwd in pure Python, but that wouldn't handle NIS or shadow
passwords.  So I guess I'll vote for completeness over lack of
overlap; leave tmpfile() & friends in.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
This supports reflection, which is the 90s way of writing self-modifying code.
    -- John Aycock at IPC7, during his parsing talk


From guido at CNRI.Reston.VA.US  Thu Dec  9 22:38:42 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 16:38:42 -0500
Subject: [Python-Dev] forwarded message from Fred L. Drake
In-Reply-To: Your message of "Thu, 09 Dec 1999 16:32:26 EST."
             <14416.8170.18298.33796@amarok.cnri.reston.va.us> 
References: <14416.7184.255000.342231@weyr.cnri.reston.va.us>  
            <14416.8170.18298.33796@amarok.cnri.reston.va.us> 
Message-ID: <199912092138.QAA06790@eric.cnri.reston.va.us>

> ... and that's a good point, too.  The POSIX functions may provide
> adaptability that a Python analog doesn't; for example, you could read
> /etc/passwd in pure Python, but that wouldn't handle NIS or shadow
> passwords.  So I guess I'll vote for completeness over lack of
> overlap; leave tmpfile() & friends in.

OK, I agree now.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake at acm.org  Thu Dec  9 23:30:52 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 9 Dec 1999 17:30:52 -0500 (EST)
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us>
References: <199912091632.LAA09236@amarok.cnri.reston.va.us>
Message-ID: <14416.11676.888918.511932@weyr.cnri.reston.va.us>

Andrew M. Kuchling writes:
 > After poking around in the O'Reilly POSIX book, here's a list of POSIX

  Ok, here are my comments on the remainder of these.

 > Worth adding?
 > =============
 > opendir(), readdir(), closedir() -- 
 > 	   most of their functionality is available through
 > 	   os.listdir(), but it might be useful to have a direct
 > 	   interface.  Downside is that this would require a new
 > 	   extension type for the C DIR struct.  My (lazy) inclination
 > 	   is to not bother.

  [rewinddir() and seekdir() should be considered as well, where
supported.]

  There's more tedium than anything in implementing a new C type.  I'm 
a little concerned that there might not be any real value here, but
it's hard to be sure about that.  Is there any real reason not to use
os.listdir()?

 > Worth adding:
 > =============
...
 > fpathconf(fd, name) -- Get configuration limit for a file
 > 	    -- would need constants from unistd.h

  This is mostly a matter of setting up the constants; not hard, just
more distracting than I want to deal with right now.

 > getlogin() -- returns user's login name
 > 	 -- could do something similar with pwd.getpwuid( os.getuid() )[0], but
 > 	 getlogin() apparently looks in utmp

  Per Guido's comments, I'm not sure how valuable it is.  It may make
sense strictly for completeness, but I've never heard of utmp being
considered reliable in any way.  Maybe I'm too new at all this.
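
  For comparison, the two approaches can be sketched side by side.  This
assumes a Unix system and a Python that exposes os.getlogin(), which the
interpreter under discussion does not yet do; the helper name login_name
is hypothetical.  getlogin() can fail when the process has no controlling
terminal, so the sketch falls back to the password database.

```python
import os
import pwd


def login_name():
    """Return the login name, preferring getlogin() when it works."""
    try:
        return os.getlogin()
    except OSError:
        # No controlling terminal (cron job, daemon, ...): fall back to
        # the password database entry for the real user ID.
        try:
            return pwd.getpwuid(os.getuid())[0]
        except KeyError:
            # The user ID has no password database entry at all.
            return None
```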

 > getgroups(gidsetsize, grouplist) -- Gets supplementary group IDs

  This should be easy enough.

 > pathconf(path, name) -- Gets config variables for a path
 > 	    -- would need constants from unistd.h

  (Same as for fpathconf().)
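
  A short sketch of what such an interface looks like in use, assuming a
Unix platform and a Python in which these calls are available as
os.pathconf() and os.fpathconf() with symbolic names taken from
unistd.h.  The choice of "/" and of PC_NAME_MAX is only for illustration.

```python
import os

# Query the same limit once by path and once by open file descriptor.
name_max_by_path = os.pathconf("/", "PC_NAME_MAX")

fd = os.open("/", os.O_RDONLY)
try:
    name_max_by_fd = os.fpathconf(fd, "PC_NAME_MAX")
finally:
    os.close(fd)

# The symbolic names known on this platform are listed in a mapping.
known_names = sorted(os.pathconf_names)
```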

 > sysconf(int name) -- Gets system configuration information
 > 	    -- would need constants from unistd.h
 > 
 > Not worth adding:
 > =================

  Aside from the ones I've already added, I agree.  ;)


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From jim at digicool.com  Fri Dec 10 00:31:40 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 09 Dec 1999 18:31:40 -0500
Subject: [Python-Dev] Thankyou for fsync :)
Message-ID: <38503BDC.CB91FB29@digicool.com>

I found recently that I needed fsync and was pleasantly surprised 
to find that it is provided in the posix module, where available.

Can I count on it staying in the posix module, when available, 
for the foreseeable future?

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission.  Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.


From gstein at lyra.org  Fri Dec 10 01:32:33 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 9 Dec 1999 16:32:33 -0800 (PST)
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: <14416.11676.888918.511932@weyr.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912091630540.10472-100000@nebula.lyra.org>

On Thu, 9 Dec 1999, Fred L. Drake, Jr. wrote:
> Andrew M. Kuchling writes:
>...
>  > opendir(), readdir(), closedir() -- 
>  > 	   most of their functionality is available through
>  > 	   os.listdir(), but it might be useful to have a direct
>  > 	   interface.  Downside is that this would require a new
>  > 	   extension type for the C DIR struct.  My (lazy) inclination
>  > 	   is to not bother.
> 
>   [rewinddir() and seekdir() should be considered as well, where
> supported.]
> 
>   There's more tedium than anything in implementing a new C type.  I'm 
> a little concerned that there might not be any real value here, but
> it's hard to be sure about that.  Is there any real reason not to use
> os.listdir()?

No need to do a new type. Just wrap the DIR* into a PyCObject. Add a magic
number if you're worried about mixing CObjects.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From guido at CNRI.Reston.VA.US  Fri Dec 10 03:03:04 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 21:03:04 -0500
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: Your message of "Thu, 09 Dec 1999 18:31:40 EST."
             <38503BDC.CB91FB29@digicool.com> 
References: <38503BDC.CB91FB29@digicool.com> 
Message-ID: <199912100203.VAA07410@eric.cnri.reston.va.us>

> I found recently that I needed fsync and was pleasantly surprised 
> to find that it is provided in the posix module, where available.
> 
> Can I count on it staying in the posix module, when available, 
> for the foreseeable future?

Since we seem to be on an adding spree, I don't see why not -- as long
as POSIX keeps it available :)
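
A minimal sketch of the usual way to call it, written for a current
Python.  The helper name write_durably and the file name are
hypothetical; the hasattr() check mirrors the "where available" caveat
above.

```python
import os
import tempfile


def write_durably(path, payload):
    """Write payload to path and ask the OS to push it to the device."""
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()                  # move Python's buffer to the OS
        if hasattr(os, "fsync"):   # fsync is only present where supported
            os.fsync(f.fileno())   # move the OS buffer to stable storage


path = os.path.join(tempfile.mkdtemp(), "journal.dat")
write_durably(path, b"committed\n")
```

The flush() call matters: fsync() only sees data that has already left
the Python-level buffer.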

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip at mojam.com  Fri Dec 10 07:28:56 1999
From: skip at mojam.com (Skip Montanaro)
Date: Fri, 10 Dec 1999 00:28:56 -0600 (CST)
Subject: [Python-Dev] posix module test suite
In-Reply-To: <14415.64952.780974.8124@weyr.cnri.reston.va.us>
References: <14415.64952.780974.8124@weyr.cnri.reston.va.us>
Message-ID: <14416.40360.611743.143624@dolphin.mojam.com>

    Fred> There's not a test for the posix or os modules; if anyone would
    Fred> like to contribute one, this would be a good time!  ;-)

Not having ever written any tests for the core Python modules, it seems
natural to ask if there are any guidelines for the construction of such
tests or the test equivalent of the Modules/xxmodule.c file.  Are there
standard behaviors expected for passing and failing a test?

Skip Montanaro | http://www.mojam.com/
skip at mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...


From tim_one at email.msn.com  Fri Dec 10 09:48:59 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Fri, 10 Dec 1999 03:48:59 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <14415.23676.775163.786028@dolphin.mojam.com>
Message-ID: <000501bf42eb$66529860$412d153f@tim>

[Skip Montanaro]
> Alright!  Now I understand what all the hubbub is about!  My eyes have
> mostly been glazing over trying to follow all this Windows
> registry/path/ini stuff.  MS believes that Python is the application.
> Those of us writing Python programs view those programs as the
> applications, not the Python interpreter per se.

Eww -- that's a helpful and insightful way to put it, Skip!  Now maybe *I*
can understand what the hubbub is about <wink>.

> Is there some way that people writing applications in Python can set
> up registry entries that are specific to their application (e.g.
> tabnanny.py) instead of only specific to the Python interpreter?

Yes, but they can't get Python to look at those before it's too late.  I
spent a whole evening a month or two ago just trying to figure out where all
the cruft in my Windows sys.path *came* from.  This is out-of-the-box; I
haven't added anything myself:

['',
 'D:\\Python\\win32',
 'D:\\Python\\win32\\lib',
 'D:\\Python',
 'D:\\Python\\Pythonwin',
 'D:\\Python\\Lib\\plat-win',
 'D:\\Python\\Lib',
 'D:\\Python\\DLLs',
 'D:\\Python\\Lib\\lib-tk',
 'D:\\PYTHON\\DLLs',
 'D:\\PYTHON\\lib',
 'D:\\PYTHON\\lib\\plat-win',
 'D:\\PYTHON\\lib\\lib-tk',
 'D:\\PYTHON']

That's bizarre on the face of it, and tracking it all down was draining.
I've forgotten the details.  I do remember concluding that it was impossible
to do what I wanted to do without changing the implementation, though, and
nobody on Python-Dev disputed that at the time.
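
One thing that can be checked mechanically is how many of those entries
differ only in letter case.  The sketch below is an illustration, not
the tracking-down described above: the helper name is hypothetical, and
ntpath.normcase() is used so the comparison follows Windows rules on any
platform.

```python
import ntpath

reported_path = [
    '',
    'D:\\Python\\win32',
    'D:\\Python\\win32\\lib',
    'D:\\Python',
    'D:\\Python\\Pythonwin',
    'D:\\Python\\Lib\\plat-win',
    'D:\\Python\\Lib',
    'D:\\Python\\DLLs',
    'D:\\Python\\Lib\\lib-tk',
    'D:\\PYTHON\\DLLs',
    'D:\\PYTHON\\lib',
    'D:\\PYTHON\\lib\\plat-win',
    'D:\\PYTHON\\lib\\lib-tk',
    'D:\\PYTHON',
]


def case_insensitive_duplicates(entries):
    """Return entries that repeat an earlier one once case is ignored."""
    seen = set()
    duplicates = []
    for entry in entries:
        key = ntpath.normcase(entry)
        if key in seen:
            duplicates.append(entry)
        else:
            seen.add(key)
    return duplicates


duplicates = case_insensitive_duplicates(reported_path)
```

For the list above, the last five entries all repeat an earlier one.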

In a pragmatic crunch, I wrote the little app I needed to distribute at the
time in Perl instead, meaning to come back to this.  I haven't had time.

IIRC, the ultimate problem wasn't really that Python looked at the registry
to get *some* path info, it was a combination of

A) It looked at the registry so early that it was impossible to stop it from
executing whatever site.py the registry pointed at (well, I could with
the -S option -- but then there was no way to get it to do the site.py that
was *wanted* instead).

B) No way to override what was in the registry; e.g., I was greatly
surprised to discover that setting a PYTHONPATH envar didn't override
anything, it simply plunked the PYTHONPATH entries into sys.path along with
everything else -- and too late to stop anything anyway.

In a long msg I haven't yet read all the way thru, Guido at least suggested
associating different registry path info with different Python versions.
That would address a number of otherwise currently intractable problems.

I suspect it still wouldn't help with the problem I was facing, though.
That is, I wanted to be able to tell people to run

\\dragres01\mrec\reduce\python \\dragres01\mrec\reduce\reduce.py

which is just a Windows way of saying "run a Python executable from a shared
network location".  When they tried that, though, the network Python looked
in *their* individual registries for its Python path info, and some of the
hackers with mondo customized Python setups on their own machines watched
things go down in flames.

This certainly can't be a common problem, but it speaks to an unforgiving
rigidity in the current approach.  There seemed to be nothing I could do to
guarantee this would work, short of telling users to edit their registries
before running this tool (that's a non-starter on Windows -- editing the
registry is dangerous) or putting a customized Python on the network
pointing to a bogus registry key (it was faster to write the app in Perl!
Perl doesn't *try* to be so infernally helpful <wink>, so doesn't get in the
way either).

I'm left wondering what purpose putting Python library path info into the
Windows registry serves.  Is there anyone on Windows who *doesn't* have
their Python Lib/ etc as direct subdirectories of the directory containing
python.exe?  Not that I've seen.  Python puts *those* in sys.path too -- but
only after it (in the normal case; see my sys.path above) pulls identically
redundant paths out of the registry first, or (in the cases we're griping
about) pulls irrelevant or downright harmful paths out of the registry first
(paths appropriate to the last Python you *installed*, not to the Python
that's *running*!).

Perhaps all this cruft is needed to support embedded Python, though
(something I've never done).

Regardless, I expect it would have been enough for me if PYTHONPATH simply
worked the way I mistakenly assumed it would (that is, this is sys.path, and
that's *it*; feel free to prepend the current directory when initialization
is complete, but before then looking at any file not reached from PYTHONPATH
is verboten).

the-cleverer-the-code-the-more-vital-that-there-be-a-way-to-
    short-circuit-it-ly y'rs  - tim


From jim at interet.com  Fri Dec 10 13:16:31 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 10 Dec 1999 07:16:31 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000501bf42eb$66529860$412d153f@tim>
Message-ID: <3850EF1F.158445B6@interet.com>

Tim Peters wrote:
> 
> [Skip Montanaro]
> > Is there some way that people writing applications in Python can set
> 
> Yes, but they can't get Python to look at those before it's too late.  I
> spent a whole evening a month or two ago just trying to figure out where all
> the cruft in my Windows sys.path *came* from.  This is out-of-the-box; I
> .....

Excellent discussion Tim!

> I suspect it still wouldn't help with the problem I was facing, though.
> That is, I wanted to be able to tell people to run
> 
> \\dragres01\mrec\reduce\python \\dragres01\mrec\reduce\reduce.py
> 
> which is just a Windows way of saying "run a Python executable from a shared
> network location".  When they tried that, though, the network Python looked
> in *their* individual registries for its Python path info, and some of the
> hackers with mondo customized Python setups on their own machines watched
> things go down in flames.

I think a sensible way to run little apps is to put everything
in an archive file including the main.py.  On Windows you
concatenate that to python.exe, and it Just Works.

> Windows registry serves.  Is there anyone on Windows who *doesn't* have
> their Python Lib/ etc as direct subdirectories of the directory containing
> python.exe?  Not that I've seen.

Point on the curve.  We don't.  We freeze everything except the main.py.

JimA


From jim at interet.com  Fri Dec 10 14:38:28 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 10 Dec 1999 08:38:28 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com>
Message-ID: <38510254.ED15D32B@interet.com>

Jean-Claude Wippler wrote:

> Ouch - what's wrong with zip archives?

> With all due respect - I sincerely hope you will reconsider and alter
> your code to work with zip files.  It's probably a small adjustment?

OK, you talked me into it.  Ya, small adjustment, no problem ;-)

JimA


From jack at oratrix.nl  Fri Dec 10 14:51:10 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Fri, 10 Dec 1999 14:51:10 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: Message by "James C. Ahlstrom" <jim@interet.com> ,
	     Thu, 09 Dec 1999 12:43:57 -0500 , <384FEA5D.A07F23EC@interet.com> 
Message-ID: <19991210135111.2F83C370CF2@snelboot.oratrix.nl>

Is it possible nowadays to have two files with the same name but different 
paths (i.e. foo/bar.py and foo/spam/bar.py) in the same archive?

That's the one thing that always struck me as very very silly about zipfiles.
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From gmcm at hypernet.com  Fri Dec 10 15:28:51 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Fri, 10 Dec 1999 09:28:51 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: <19991210135111.2F83C370CF2@snelboot.oratrix.nl>
References: Message by "James C. Ahlstrom" <jim@interet.com> ,	     Thu, 09 Dec 1999 12:43:57 -0500 , <384FEA5D.A07F23EC@interet.com> 
Message-ID: <1267287023-386248@hypernet.com>

Jack Jansen asks:

> Is it possible nowadays to have two files with the same name but
> different paths (i.e. foo/bar.py and foo/spam/bar.py) in the same
> archive?

Depends on how you do it.

If the user imports foo.spam.bar, an importer will be asked for:
  foo (return foo.__init__)
  foo.spam (return foo.spam.__init__)
  foo.spam.bar (return foo.spam.bar)

But the API allows lots of variations. This is another possible 
interaction:
  foo (return None)
  foo.__init__ (return foo.__init__)
  foo.spam (return None)
  foo.spam.__init__ (return foo.spam.__init__)
  foo.spam.bar (return foo.spam.bar)

Or, by looking at different args to get_code, you could look at 
the requests as:
  foo in context of None
  spam in context of foo
  bar in context of foo.spam
 
With another variation where the request for __init__ becomes 
explicit.

The first way seems the natural way for archives, and makes it 
easy to keep foo.spam.bar distinct from foo.bar.
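
The first interaction can be sketched as follows.  Both helper names are
hypothetical and this is not the importer API itself; it only assumes
that an archive stores modules under slash-separated member names, with
packages represented by an __init__ member.

```python
def importer_requests(dotted_name):
    """Names an importer is asked for, outermost package first."""
    parts = dotted_name.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def archive_candidates(dotted_name):
    """Archive member names that could satisfy a single request."""
    base = dotted_name.replace(".", "/")
    # A package is found through its __init__, a plain module directly.
    return [base + "/__init__.py", base + ".py"]


requests = importer_requests("foo.spam.bar")
```

Because each request carries the full dotted name, the members for
foo.bar and foo.spam.bar never collide.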

> That's the one thing that always struck me as very very silly
> about zipfiles.

Huh?

- Gordon


From guido at CNRI.Reston.VA.US  Fri Dec 10 15:51:39 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 10 Dec 1999 09:51:39 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: Your message of "Fri, 10 Dec 1999 14:51:10 +0100."
             <19991210135111.2F83C370CF2@snelboot.oratrix.nl> 
References: <19991210135111.2F83C370CF2@snelboot.oratrix.nl> 
Message-ID: <199912101451.JAA07786@eric.cnri.reston.va.us>

> Is it possible nowadays to have two files with the same name but different 
> paths (i.e. foo/bar.py and foo/spam/bar.py) in the same archive?
> 
> That's the one thing that always struck me as very very silly about zipfiles.

Zip files contain the full path, there's no problem with that.  Was
there ever?
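
A quick way to confirm this, sketched with the zipfile module found in
the standard library of later Python releases.  The member contents are
made up for the demonstration.

```python
import io
import zipfile

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    # Same base name, different paths, one archive.
    archive.writestr("foo/bar.py", "WHO = 'foo.bar'\n")
    archive.writestr("foo/spam/bar.py", "WHO = 'foo.spam.bar'\n")

with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()
    top = archive.read("foo/bar.py")
    nested = archive.read("foo/spam/bar.py")
```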

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jack at oratrix.nl  Fri Dec 10 15:52:26 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Fri, 10 Dec 1999 15:52:26 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy 
 )
In-Reply-To: Message by "Gordon McMillan" <gmcm@hypernet.com> ,
	     Fri, 10 Dec 1999 09:28:51 -0500 , <1267287023-386248@hypernet.com> 
Message-ID: <19991210145227.01F99370CF2@snelboot.oratrix.nl>

> Jack Jansen asks:
> 
> > Is it possible nowadays to have two files with the same name but
> > different paths (i.e. foo/bar.py and foo/spam/bar.py) in the same
> > archive?
> 
> Depends on how you do it.

Apparently I mis-phrased my question, I'll try again.

When people suggested to use zip format as the standard Python archive format 
I was a bit worried, because I've had it happen to me various times that I was 
unable to create a ZIP archive with two files with the same name but different 
paths (i.e. create an archive of a directory that contains both a foo/bar.py 
and a foo/spam/bar.py).

So, my question was: has this happened to me because the winzip I used was 
braindead, or is there possibly a problem with the ZIP file format that 
disallows two files with the same name in one archive? Most zip programs I've 
seen also seem to present filenames as the primary metaphor, with full 
pathnames somewhat "tacked on".

If the latter is the case I wonder whether zip is the right format to use...
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From guido at CNRI.Reston.VA.US  Fri Dec 10 16:00:51 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 10 Dec 1999 10:00:51 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: Your message of "Fri, 10 Dec 1999 15:52:26 +0100."
             <19991210145227.01F99370CF2@snelboot.oratrix.nl> 
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> 
Message-ID: <199912101500.KAA07863@eric.cnri.reston.va.us>

Again, the zip format does not have this problem.  Some zip tools may
-- then we don't use those.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake at acm.org  Fri Dec 10 16:40:21 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 10 Dec 1999 10:40:21 -0500 (EST)
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: <Pine.LNX.4.10.9912091630540.10472-100000@nebula.lyra.org>
References: <14416.11676.888918.511932@weyr.cnri.reston.va.us>
	<Pine.LNX.4.10.9912091630540.10472-100000@nebula.lyra.org>
Message-ID: <14417.7909.511437.230915@weyr.cnri.reston.va.us>

Greg Stein writes:
 > No need to do a new type. Just wrap the DIR* into a PyCObject. Add a magic
 > number if you're worried about mixing CObjects.

  That's certainly one option, but I would have made readdir(),
seekdir(), rewinddir() and closedir() into the methods read(), seek(), 
rewind() and close().  So it's a question of what interface you
prefer; functions with magically interpreted token parameters (kind of 
like file descriptors, hey!), or something that is more recognizably
object-oriented.
  I know my preference.  ;-)
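
  A rough sketch of the object-style interface, in pure Python.  The
class name Directory is hypothetical, and a sorted os.listdir() snapshot
stands in for the C DIR* stream, so this shows only the shape of the
interface and not an actual wrapper around opendir().

```python
import os
import tempfile


class Directory:
    """Directory stream with read(), seek(), rewind() and close()."""

    def __init__(self, path):
        # Assumption: a snapshot of the listing replaces the DIR* handle.
        self._entries = sorted(os.listdir(path))
        self._pos = 0
        self.closed = False

    def _check_open(self):
        if self.closed:
            raise ValueError("operation on closed directory")

    def read(self):
        """Return the next entry name, or None at the end."""
        self._check_open()
        if self._pos >= len(self._entries):
            return None
        name = self._entries[self._pos]
        self._pos += 1
        return name

    def seek(self, pos):
        self._check_open()
        self._pos = pos

    def rewind(self):
        self.seek(0)

    def close(self):
        self.closed = True


demo_dir = tempfile.mkdtemp()
for name in ("a.txt", "b.txt"):
    open(os.path.join(demo_dir, name), "w").close()
```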


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From mal at lemburg.com  Fri Dec 10 16:55:02 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 10 Dec 1999 16:55:02 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl>
Message-ID: <38512256.F9287E24@lemburg.com>

Jack Jansen wrote:
> 
> > Jack Jansen asks:
> >
> > > Is it possible nowadays to have two files with the same name but
> > > different paths (i.e. foo/bar.py and foo/spam/bar.py) in the same
> > > archive?
> >
> > Depends on how you do it.
> 
> Apparently I mis-phrased my question, I'll try again.
> 
> When people suggested to use zip format as the standard Python archive format
> I was a bit worried, because I've had it happen to me various times that I was
> unable to create a ZIP archive with two files with the same name but different
> paths (i.e. create an archive of a directory that contains both a foo/bar.py
> and a foo/spam/bar.py).
> 
> So, my question was: has this happened to me because the winzip I used was
> braindead, or is there possibly a problem with the ZIP file format that
> disallows two files with the same name in one archive? Most zip programs I've
> seen also seem to present filenames as the primary metaphor, with full
> pathnames somewhat "tacked on".
> 
> If the latter is the case I wonder whether zip is the right format to use...

Hmm, I've been doing the above for years now... never had a problem
with it (I use Info-ZIP's tools, BTW), e.g.

/home/lemburg> unzip -l projects/distribution/mxODBC-1.1.1.zip 
Archive:  projects/distribution/mxODBC-1.1.1.zip
  Length     Date   Time    Name
 --------    ----   ----    ----
   131316  06-09-99 14:10   ODBC/EasySoft/mxODBC.c
   131316  06-09-99 14:10   ODBC/Informix/mxODBC.c
   ...

Would be cool if I could use my packages as ZIP files :-) So
here's another vote for using the ZIP format.
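
For illustration, this is roughly how a package can be imported straight
from a ZIP file in later Python releases, which accept archives as
sys.path entries.  It does not reflect anything available at the time of
this thread, and the package name zipdemo is made up.

```python
import importlib
import os
import sys
import tempfile
import zipfile

archive_path = os.path.join(tempfile.mkdtemp(), "zipdemo.zip")
with zipfile.ZipFile(archive_path, "w") as archive:
    # Two modules named bar.py, told apart by their package paths.
    archive.writestr("zipdemo/__init__.py", "")
    archive.writestr("zipdemo/bar.py", "WHO = 'zipdemo.bar'\n")
    archive.writestr("zipdemo/spam/__init__.py", "")
    archive.writestr("zipdemo/spam/bar.py", "WHO = 'zipdemo.spam.bar'\n")

# The archive itself goes on the module search path.
sys.path.insert(0, archive_path)
outer = importlib.import_module("zipdemo.bar")
inner = importlib.import_module("zipdemo.spam.bar")
```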

BTW, wouldn't it make sense to include the zlib code
in the core distribution much like the pcre stuff is now ?
AFAIK, it is public domain and including it would remedy many of the
compatibility issues with the different zlib versions around.

Ok, that was my wish for Xmas :-) The rest is up to Mr. Clause...

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    21 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido at CNRI.Reston.VA.US  Fri Dec 10 17:04:24 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 10 Dec 1999 11:04:24 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: Your message of "Fri, 10 Dec 1999 16:55:02 +0100."
             <38512256.F9287E24@lemburg.com> 
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl>  
            <38512256.F9287E24@lemburg.com> 
Message-ID: <199912101604.LAA14100@eric.cnri.reston.va.us>

> BTW, wouldn't it make sense to include the zlib code
> in the core distribution much like the pcre stuff is now ?
> AFAIK, it is public domain and including it would remedy many of the
> compatibility issues with the different zlib versions around.

What compatibility issues?  Note that the Win32 distri already comes
with zlib statically linked into zlib.pyd.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal at lemburg.com  Fri Dec 10 17:15:48 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 10 Dec 1999 17:15:48 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl>  
	            <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us>
Message-ID: <38512734.CF6E4489@lemburg.com>

Guido van Rossum wrote:
> 
> > BTW, wouldn't it make sense to include the zlib code
> > in the core distribution much like the pcre stuff is now ?
> > AFAIK, it is public domain and including it would remedy many of the
> > compatibility issues with the different zlib versions around.
> 
> What compatibility issues?  Note that the Win32 distri already comes
> with zlib statically linked into zlib.pyd.

There were issues with zlib 1.0.4 and later ones. Also, many
Linux distributions don't have the zlib header files installed.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    21 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido at CNRI.Reston.VA.US  Fri Dec 10 17:19:47 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 10 Dec 1999 11:19:47 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: Your message of "Fri, 10 Dec 1999 17:15:48 +0100."
             <38512734.CF6E4489@lemburg.com> 
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us>  
            <38512734.CF6E4489@lemburg.com> 
Message-ID: <199912101619.LAA14174@eric.cnri.reston.va.us>

> There were issues with zlib 1.0.4 and later ones. Also, many
> Linux distributions don't have the zlib header files installed.

Hm.  I don't recall having any problems reported to me.  I'd rather
not include the entire zlib distri in the Python distri -- zlib
is rather big.  Adding only the Unix source would be cheating.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido at CNRI.Reston.VA.US  Fri Dec 10 17:25:23 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 10 Dec 1999 11:25:23 -0500
Subject: [Python-Dev] dbm clone with serious specs wanted
Message-ID: <199912101625.LAA14216@eric.cnri.reston.va.us>

Someone has asked me for a dbm clone that can store 16M keys of 350
bytes each, and runs on Linux, HPUX, and NT.  That's 5.6 Gigabyte in
keys alone!  I presume most classic approaches won't cut it since
total file size is typically limited by the seek system call, internal
data structures and/or file index format to 2Gb (signed longs) or 4Gb
(unsigned longs).
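
The arithmetic behind those figures, spelled out.  "16M" is read here as
16 million; the variable names are only for the example.

```python
keys = 16 * 10**6            # "16M keys", taken as 16 million
key_size = 350               # bytes per key
total = keys * key_size      # bytes needed for the keys alone

signed_limit = 2**31 - 1     # largest offset in a signed 32-bit long
unsigned_limit = 2**32 - 1   # largest offset in an unsigned 32-bit long

fits_signed = total <= signed_limit
fits_unsigned = total <= unsigned_limit
```

The keys alone exceed both limits, before any values or index overhead
are counted.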

Does anyone have an idea where to start looking?  Would a Python
extension already exist?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From petrilli at amber.org  Fri Dec 10 17:29:27 1999
From: petrilli at amber.org (Christopher Petrilli)
Date: Fri, 10 Dec 1999 11:29:27 -0500
Subject: [Python-Dev] dbm clone with serious specs wanted
In-Reply-To: <199912101625.LAA14216@eric.cnri.reston.va.us>; from guido@CNRI.Reston.VA.US on Fri, Dec 10, 1999 at 11:25:23AM -0500
References: <199912101625.LAA14216@eric.cnri.reston.va.us>
Message-ID: <19991210112927.A14102@trump.amber.org>

Guido van Rossum [guido at CNRI.Reston.VA.US] wrote:
> Someone has asked me for a dbm clone that can store 16M keys of 350
> bytes each, and runs on Linux, HPUX, and NT.  That's 5.6 Gigabyte in
> keys alone!  I presume most classic approaches won't cut it since
> total file size is typically limited by the seek system call, internal
> data structures and/or file index format to 2Gb (signed longs) or 4Gb
> (unsigned longs).
> 
> Does anyone have an idea where to start looking?  Would a Python
> extension already exist?

Assuming you mean an interface to a dbm-style store, you could easily
use Berkeley DB; I believe it is limited in the 4TB range...

Chris
-- 
| Christopher Petrilli
| petrilli at amber.org


From mal at lemburg.com  Fri Dec 10 17:26:10 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 10 Dec 1999 17:26:10 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us>  
	            <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us>
Message-ID: <385129A2.6FAF4E81@lemburg.com>

Guido van Rossum wrote:
> 
> > There were issues with zlib 1.0.4 and later ones. Also, many
> > Linux distributions don't have the zlib header files installed.
> 
> Hm.  I don't recall having any problems reported to me.  I'd rather
> not include the entire zlib distri in the Python distri -- zlib
> is rather big.  Adding only the Unix source would be cheating.

How about only adding those parts which would be needed to
at least deflate the ZIP archive contents ?

If the ZIP archive format becomes the standard for Python, we'd
have to ensure that all Python users can read them. Well, at
least that's what I would expect from a standard format :-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    21 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido at CNRI.Reston.VA.US  Fri Dec 10 17:29:36 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 10 Dec 1999 11:29:36 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: Your message of "Fri, 10 Dec 1999 17:26:10 +0100."
             <385129A2.6FAF4E81@lemburg.com> 
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us>  
            <385129A2.6FAF4E81@lemburg.com> 
Message-ID: <199912101629.LAA14274@eric.cnri.reston.va.us>

> How about only adding those parts which would be needed to
> at least deflate the ZIP archive contents ?

Ditto -- still lots of portability issues I bet.

> If the ZIP archive format becomes the standard for Python, we'd
> have to ensure that all Python users can read them. Well, at
> least that's what I would expect from a standard format :-)

There's a simple solution: don't use compression.  With current disk
prices it's really not worth it.  Let the installer do the
decompression (installers travel across networks where compression
*is* worth it).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin at mems-exchange.org  Fri Dec 10 17:34:09 1999
From: akuchlin at mems-exchange.org (Andrew M. Kuchling)
Date: Fri, 10 Dec 1999 11:34:09 -0500 (EST)
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: <38512734.CF6E4489@lemburg.com>
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl>
	<38512256.F9287E24@lemburg.com>
	<199912101604.LAA14100@eric.cnri.reston.va.us>
	<38512734.CF6E4489@lemburg.com>
Message-ID: <14417.11137.562474.99270@amarok.cnri.reston.va.us>

M.-A. Lemburg writes:
>There were issues with zlib 1.0.4 and later ones. Also, many
>Linux distributions don't have the zlib header files installed.

For example, on RH6.0, zlib.h and zlib.a are in zlib-devel.XXX.rpm,
and zlib.XXX.rpm only contains libz.so.  On the other hand, anyone
who's compiling Python should really have the various -devel RPMs
installed.  I'd argue against including it, because it might cause odd
versioning problems.  For example, what if I have PIL compiled against
zlib1.1.2 (zlib is used for writing PNGs) and the Python binary
includes zlib1.1.3?  There might be hard-to-debug problems
caused by calling the wrong symbol.

PCRE is a special case, because we've actually hacked the code a lot;
it's not the PCRE code as Philip Hazel distributes it.

Just received Guido's email suggesting skipping compression in
archives; not a bad idea.  You'd use less CPU, but might do
more I/O because you're reading more sectors off disk.  There
probably isn't much need for compression when the archive is on-disk;
Java needed it because of applets.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
The NSA response was, "Well, that was interesting, but there aren't any
ciphers like that."
    -- Gus Simmons, "The History of Subliminal Channels"


From petrilli at amber.org  Fri Dec 10 17:39:44 1999
From: petrilli at amber.org (Christopher Petrilli)
Date: Fri, 10 Dec 1999 11:39:44 -0500
Subject: [Python-Dev] dbm clone with serious specs wanted
In-Reply-To: <19991210112927.A14102@trump.amber.org>; from petrilli@amber.org on Fri, Dec 10, 1999 at 11:29:27AM -0500
References: <199912101625.LAA14216@eric.cnri.reston.va.us> <19991210112927.A14102@trump.amber.org>
Message-ID: <19991210113944.B14102@trump.amber.org>

Christopher Petrilli [petrilli at amber.org] wrote:
> Guido van Rossum [guido at CNRI.Reston.VA.US] wrote:
> > Does anyone have an idea where to start looking?  Would a Python
> > extension already exist?
> 
> Assuming you mean an interface to a dbm-style store, you could easily
> use Berkeley DB; I believe it is limited in the 4TB range...

I just did some checking... first Robin Dunn has an interface, but it's not
currently compatible with BerkeleyDB 3.x, which just came out... it shouldn't
be hard to retrofit.  Anyway, the limits are based on page size...

	512b page:	2TB
	64K page:	256TB

It uses 32-bit numbers for pages, so I assume that also bounds the
number of keys allowed, given that I believe one key must use a minimum
of one page.
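
The two limits in the table are consistent with a simple model, sketched
here under the assumption that the ceiling is just the number of
addressable pages times the page size (the helper name is made up for
illustration, and this is not Berkeley DB code):

```python
# Assumption: max size = (pages addressable with 32 bits) * (page size).
MAX_PAGES = 2**32
TIB = 2**40

def max_db_size_tib(page_size_bytes):
    """Return the size ceiling in TiB for a given page size in bytes."""
    return MAX_PAGES * page_size_bytes / TIB

print(max_db_size_tib(512))        # 2.0
print(max_db_size_tib(64 * 1024))  # 256.0
```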

I know that I've pushed earlier releases to around 50Gb without trouble,
but you might see issues related to the number of keys.  I'd ask Sleepycat
directly, as they're amazingly responsive.

Chris
-- 
| Christopher Petrilli
| petrilli at amber.org


From mal at lemburg.com  Fri Dec 10 17:37:30 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 10 Dec 1999 17:37:30 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us>  
	            <385129A2.6FAF4E81@lemburg.com> <199912101629.LAA14274@eric.cnri.reston.va.us>
Message-ID: <38512C4A.ADB63C2B@lemburg.com>

Guido van Rossum wrote:
> 
> > How about only adding those parts which would be needed to
> > at least deflate the ZIP archive contents ?
> 
> Ditto -- still lots of portability issues I bet.

Hmm, not sure: zlib is pretty portable. It's the interface
changes that can break code, not so much the zlib portability.
 
> > If the ZIP archive format becomes the standard for Python, we'd
> > have to ensure that all Python users can read them. Well, at
> > least that's what I would expect from a standard format :-)
> 
> There's a simple solution: don't use compression.  With current disk
> prices it's really not worth it.  Let the installer do the
> decompression (installers travel across networks where compression
> *is* worth it).

That's a possibility, right. It would still let us use the many
ZIP tools while not adding complexity to the core.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    21 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal at lemburg.com  Fri Dec 10 17:43:11 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 10 Dec 1999 17:43:11 +0100
Subject: [Python-Dev] dbm clone with serious specs wanted
References: <199912101625.LAA14216@eric.cnri.reston.va.us>
Message-ID: <38512D9F.2AE9DC8B@lemburg.com>

Guido van Rossum wrote:
> 
> Someone has asked me for a dbm clone that can store 16M keys of 350
> bytes each, and runs on Linux, HPUX, and NT.  That's 5.6 Gigabyte in
> keys alone!  I presume most classic approaches won't cut it since
> total file size is typically limited by the seek system call, internal
> data structures and/or file index format to 2Gb (signed longs) or 4Gb
> (unsigned longs).
> 
> Does anyone have an idea where to start looking?  Would a Python
> extension already exist?

I'd suggest using a dbm style wrapper around the DB-API and then
trying out the many cross-platform databases. IBM DB2 comes to
mind... it can certainly handle these sizes given the right
hardware.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    21 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From fdrake at acm.org  Fri Dec 10 18:35:01 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 10 Dec 1999 12:35:01 -0500 (EST)
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: <199912100203.VAA07410@eric.cnri.reston.va.us>
References: <38503BDC.CB91FB29@digicool.com>
	<199912100203.VAA07410@eric.cnri.reston.va.us>
Message-ID: <14417.14789.306365.439782@weyr.cnri.reston.va.us>

Guido van Rossum writes:
 > Since we seem to be on an adding spree, I don't see why not -- as long
 > as POSIX keeps it available :)

  fsync() isn't listed in O'Reilly's POSIX book, so it's probably not
in the POSIX spec.  Neither is the tempnam() function I added in
yesterday's spree, though tmpfile() and tmpnam() are.
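
For reference, a small sketch of how fsync() is typically used from
Python, assuming an os module that exposes os.fsync() and a recent
Python 3; write_durably is a hypothetical helper name, not an existing
function:

```python
import os
import tempfile

def write_durably(path, text):
    """Write text to path and ask the OS to flush it to the device."""
    with open(path, "w") as f:
        f.write(text)
        f.flush()                 # empty Python's own buffer first
        if hasattr(os, "fsync"):  # not every platform provides it
            os.fsync(f.fileno())

# Usage: write into a scratch directory, then read the data back.
with tempfile.TemporaryDirectory() as scratch:
    target = os.path.join(scratch, "journal.txt")
    write_durably(target, "committed\n")
    with open(target) as f:
        print(f.read(), end="")
```

The flush() call matters: fsync() only sees what has already been handed
to the operating system.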


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From jim at digicool.com  Fri Dec 10 19:37:53 1999
From: jim at digicool.com (Jim Fulton)
Date: Fri, 10 Dec 1999 18:37:53 +0000
Subject: [Python-Dev] Thankyou for fsync :)
References: <38503BDC.CB91FB29@digicool.com>
		<199912100203.VAA07410@eric.cnri.reston.va.us> <14417.14789.306365.439782@weyr.cnri.reston.va.us>
Message-ID: <38514881.5C124E36@digicool.com>

"Fred L. Drake, Jr." wrote:
> 
> Guido van Rossum writes:
>  > Since we seem to be on an adding spree, I don't see why not -- as long
>  > as POSIX keeps it available :)
> 
>   fsync() isn't listed in O'Reilly's POSIX book, so it's probably not
> in the POSIX spec. 

It's not, it's in XPG3 (sp?), but I wasn't going to bring that up. ;)

I'd still like it to stay, where available. :)

Jim

--
Jim Fulton           mailto:jim at digicool.com
Technical Director   (888) 344-4332              Python Powered!
Digital Creations    http://www.digicool.com     http://www.python.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission.  Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.


From fdrake at acm.org  Fri Dec 10 19:36:44 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 10 Dec 1999 13:36:44 -0500 (EST)
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: <38514881.5C124E36@digicool.com>
References: <38503BDC.CB91FB29@digicool.com>
	<199912100203.VAA07410@eric.cnri.reston.va.us>
	<14417.14789.306365.439782@weyr.cnri.reston.va.us>
	<38514881.5C124E36@digicool.com>
Message-ID: <14417.18492.932392.608912@weyr.cnri.reston.va.us>

Jim Fulton writes:
 > It's not, it's in XPG3 (sp?), but I wasn't going to bring that up. ;)

  I don't have that one, but I certainly don't have any plans on
ripping out fsync().  Not today, at any rate.  ;)


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From jim at interet.com  Fri Dec 10 19:37:50 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 10 Dec 1999 13:37:50 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210135111.2F83C370CF2@snelboot.oratrix.nl>
Message-ID: <3851487E.F610BE17@interet.com>

Jack Jansen wrote:
> 
> Is it possible nowadays to have two files with the same name but different
> paths (i.e. foo/bar.py and foo/spam/bar.py) in the same archive?

Yes, I just made one with WinZip.

JimA


From gmcm at hypernet.com  Fri Dec 10 19:41:56 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Fri, 10 Dec 1999 13:41:56 -0500
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: <38514881.5C124E36@digicool.com>
Message-ID: <1267271840-1299809@hypernet.com>

Fred L. Drake, Jr. wrote:
> 
> Guido van Rossum writes:
>  > Since we seem to be on an adding spree, I don't see why not
>  > -- as long as POSIX keeps it available :)
> 
>   fsync() isn't listed in O'Reilly's POSIX book, so it's
>   probably not
> in the POSIX spec. 
> 

It's in the other O'Reilly POSIX book, p 348 of POSIX.4.

- Gordon


From fdrake at acm.org  Fri Dec 10 19:43:56 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 10 Dec 1999 13:43:56 -0500 (EST)
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: <1267271840-1299809@hypernet.com>
References: <38514881.5C124E36@digicool.com>
	<1267271840-1299809@hypernet.com>
Message-ID: <14417.18924.461115.906914@weyr.cnri.reston.va.us>

Gordon McMillan writes:
 > It's in the other O'Reilly POSIX book, p 348 of POSIX.4.

  Ah, I don't have that either.  I thought POSIX.4 was real-time
stuff.
  (If anyone wants to send a copy along, I'd be glad to consider
adding reasonable interfaces for Python. ;)


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From jim at interet.com  Fri Dec 10 19:43:18 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 10 Dec 1999 13:43:18 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl>
Message-ID: <385149C6.DF942F36@interet.com>

Jack Jansen wrote:

> When people suggested to use zip format as the standard Python archive format
> I was a bit worried, because I've had it happen to me various times that I was
> unable to create a ZIP archive with two files with the same name but different
> paths (i.e. create an archive of a directory that contains both a foo/bar.py
> and a foo/spam/bar.py).

No problem.

But most zip tools will create an archive with either no
path (file name is "bar.py") or the full path (file name is "foo/bar.py").
If the paths are different it is OK; I am not sure about duplicate bare names.
The difference is an option and has nothing to do with how the
file name is specified to the utility.
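
A quick way to check the full-path case, sketched with the zipfile
module (this assumes a Python that ships zipfile; the member contents
are made up for illustration):

```python
import io
import zipfile

# Build an archive in memory with two members that share a base name.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as archive:
    archive.writestr("foo/bar.py", "print('outer bar')\n")
    archive.writestr("foo/spam/bar.py", "print('inner bar')\n")

# Re-open it and list what was stored.
with zipfile.ZipFile(buf) as archive:
    names = archive.namelist()

print(names)  # ['foo/bar.py', 'foo/spam/bar.py']
```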

JimA


From jim at interet.com  Fri Dec 10 19:48:47 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 10 Dec 1999 13:48:47 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us>  
		            <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <385129A2.6FAF4E81@lemburg.com>
Message-ID: <38514B0F.84A546C6@interet.com>

"M.-A. Lemburg" wrote:

> How about only adding those parts which would be needed to
> at least deflate the ZIP archive contents ?
> 
> If the ZIP archive format becomes the standard for Python, we'd
> have to ensure that all Python users can read them. Well, at
> least that's what I would expect from a standard format :-)

I think that for now we will need to create archives with
compression method zero: no compression.  That is a valid
compression method all ZIP utilities support.  The point is that
zlib just isn't part of Python.
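
As a sketch of what "method zero" looks like in practice, assuming a
Python that ships the zipfile module (the member name and contents are
illustrative):

```python
import io
import zipfile

source = "def hello():\n    return 'hello'\n" * 50

# ZIP_STORED is compression method 0: bytes are copied in unchanged.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as archive:
    archive.writestr("pkg/mod.py", source)

with zipfile.ZipFile(buf) as archive:
    info = archive.getinfo("pkg/mod.py")
    data = archive.read("pkg/mod.py")

print(info.compress_type == zipfile.ZIP_STORED)  # True
print(info.compress_size == info.file_size)      # True
```

Reading such an archive needs no decompressor at all, which is the whole
point of the proposal.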

Jim


From jcw at equi4.com  Fri Dec 10 19:57:00 1999
From: jcw at equi4.com (Jean-Claude Wippler)
Date: Fri, 10 Dec 1999 19:57:00 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us>  
			            <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <385129A2.6FAF4E81@lemburg.com> <38514B0F.84A546C6@interet.com>
Message-ID: <38514CFC.47C8A8E0@equi4.com>

"James C. Ahlstrom" wrote:
[...]
> I think that for now we will need to create archives with
> compression method zero: no compression.  That is a valid
> compression method all ZIP utilities support.

Sounds good.  This is also exactly how Java started out with jar.

-jcw


From gmcm at hypernet.com  Fri Dec 10 20:06:59 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Fri, 10 Dec 1999 14:06:59 -0500
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: <14417.18924.461115.906914@weyr.cnri.reston.va.us>
References: <1267271840-1299809@hypernet.com>
Message-ID: <1267270337-1390160@hypernet.com>

Fred wrote:
 
> Gordon McMillan writes:
>  > It's in the other O'Reilly POSIX book, p 348 of POSIX.4.
> 
>   Ah, I don't have that either.  I thought POSIX.4 was real-time
> stuff.

Well, it says it is, but having done some stuff with automated 
warehouses, I'm always amazed at how people will use the 
term "real-time". I'd say "pretty likely to be responsive" ;-).

>   (If anyone wants to send a copy along, I'd be glad to consider
> adding reasonable interfaces for Python. ;)

Only around 70 documented functions, but many of them 
appear to be tweaks, or redocumenting stuff in view of new 
kernel behaviors.

- Gordon


From fdrake at acm.org  Fri Dec 10 20:18:16 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 10 Dec 1999 14:18:16 -0500 (EST)
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: <1267270337-1390160@hypernet.com>
References: <1267271840-1299809@hypernet.com>
	<1267270337-1390160@hypernet.com>
Message-ID: <14417.20984.151867.630871@weyr.cnri.reston.va.us>

Gordon McMillan writes:
 > Well, it says it is, but having done some stuff with automated 
 > warehouses, I'm always amazed at how people will use the 
 > term "real-time". I'd say "pretty likely to be responsive" ;-).

  Oh, a manager's interpretation of real-time:  "I want this by close
of business next Wednesday!"

 > Only around 70 documented functions, but many of them 
 > appear to be tweaks, or redocumenting stuff in view of new 
 > kernel behaviors.

  Anything that should be added anywhere?  Failing all else, I can
probably read the man pages if I know what to look for.  ;-)


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From fdrake at acm.org  Fri Dec 10 22:40:29 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 10 Dec 1999 16:40:29 -0500 (EST)
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us>
References: <199912091632.LAA09236@amarok.cnri.reston.va.us>
Message-ID: <14417.29517.238124.767279@weyr.cnri.reston.va.us>

Andrew M. Kuchling writes:
 > fpathconf(fd, name) -- Get configuration limit for a file
...
 > pathconf(path, name) -- Gets config variables for a path
...
 > sysconf(int name) -- Gets system configuration information
 > 	    -- would need constants from unistd.h

  I'm almost done with these, and also confstr (from POSIX.2).  I
don't have time to finish them today; I'll check them in next week.
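
A sketch of how such wrappers can be called, assuming they are exposed
as os.sysconf() and os.pathconf() with name tables os.sysconf_names and
os.pathconf_names (Unix only); the query_* helpers are hypothetical
names for illustration:

```python
import os

def query_sysconf(name):
    """Return a system limit, or None where sysconf or the name is missing."""
    if not hasattr(os, "sysconf") or name not in os.sysconf_names:
        return None
    return os.sysconf(name)

def query_pathconf(path, name):
    """Return a per-path limit, or None where it cannot be queried."""
    if not hasattr(os, "pathconf") or name not in os.pathconf_names:
        return None
    try:
        return os.pathconf(path, name)
    except OSError:
        return None

print("page size:", query_sysconf("SC_PAGE_SIZE"))
print("max file name length here:", query_pathconf(".", "PC_NAME_MAX"))
```

Returning None rather than raising keeps the calling code portable to
platforms that lack these functions.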


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From skip at mojam.com  Sat Dec 11 00:20:21 1999
From: skip at mojam.com (Skip Montanaro)
Date: Fri, 10 Dec 1999 17:20:21 -0600 (CST)
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: <14417.18924.461115.906914@weyr.cnri.reston.va.us>
References: <38514881.5C124E36@digicool.com>
	<1267271840-1299809@hypernet.com>
	<14417.18924.461115.906914@weyr.cnri.reston.va.us>
Message-ID: <14417.35509.284749.924066@dolphin.mojam.com>

    Fred> I thought POSIX.4 was real-time stuff.

This all seems to be happening in real-time to me... ;-)

Skip Montanaro | http://www.mojam.com/
skip at mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...


From andy at robanal.demon.co.uk  Sat Dec 11 01:11:28 1999
From: andy at robanal.demon.co.uk (Andy Robinson)
Date: Sat, 11 Dec 1999 00:11:28 GMT
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: <199912101619.LAA14174@eric.cnri.reston.va.us>
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us>   <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us>
Message-ID: <38519531.15439641@post.demon.co.uk>

On Fri, 10 Dec 1999 11:19:47 -0500, you wrote:

>> There were issues with zlib 1.0.4 and later ones. Also, many
>> Linux distributions don't have the zlib header files installed.
>
>Hm.  I don't recall having any problems reported to me.  I'd rather
>not include the entire zlib distri in the Python distri -- zlib
>is rather big.  Adding only the Unix source would be cheating.
>
Minor data point on the importance of zlib.  I spent a long time
figuring out what Adobe PDF's "flate filter" was before I discovered
it was the inverse of "deflate" (yes, there were loud sounds of
head-slapping when I clicked) and discovered that zlib.compress() was
EXACTLY what you need to create compressed streams in PDF documents.
Being a Windows person, I naively assumed zlib was in the standard
distribution everywhere, and subsequently discovered Mac and Unix
users were not so happy.  So if you want to make PDFs, having zlib
around is very useful indeed...
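
A minimal sketch of the round trip, assuming the zlib module is
available; the content stream below is an illustrative fragment, not a
complete PDF object:

```python
import zlib

# An uncompressed page content stream (illustrative only), repeated so
# there is something worth compressing.
content = b"BT /F1 12 Tf 72 720 Td (Hello, world) Tj ET\n" * 20

compressed = zlib.compress(content)      # what a Flate-encoded stream holds
restored = zlib.decompress(compressed)   # what a PDF reader recovers

print(restored == content)               # True
print(len(compressed) < len(content))    # True
```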

- Andy


From akuchlin at mems-exchange.org  Sat Dec 11 01:35:58 1999
From: akuchlin at mems-exchange.org (Andrew M. Kuchling)
Date: Fri, 10 Dec 1999 19:35:58 -0500 (EST)
Subject: [Python-Dev] Enabling more modules by default
In-Reply-To: <38519531.15439641@post.demon.co.uk>
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl>
	<38512256.F9287E24@lemburg.com>
	<199912101604.LAA14100@eric.cnri.reston.va.us>
	<38512734.CF6E4489@lemburg.com>
	<199912101619.LAA14174@eric.cnri.reston.va.us>
	<38519531.15439641@post.demon.co.uk>
Message-ID: <14417.40046.850655.491684@amarok.cnri.reston.va.us>

Andy Robinson writes:
>...  So if you want to make PDFs, having zlib
>around is very useful indeed...

This raises a good point, though I still dislike the idea of including
the zlib library.  It would be nice if Setup.in would be autogenerated
to compile all the modules it can -- bsddb if it finds libdb, zlib if
it finds libz.a.  I vaguely recall once working on a Python script that
would generate a customized Setup.in file, though I can't find it at
the moment.  Given that someone has already suggested automatically
enabling threads on those platforms that support it, why not go all
the way?

(But a Python script that generates a Setup.in isn't going to work,
unless we compile a minipython first and then create a more complete
Setup file.)
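
A rough sketch of the detection idea, assuming a working Python with
ctypes is already available (which is exactly the bootstrap problem
noted above). The module table and the output lines are illustrative
and do not claim to match the real Setup.in syntax:

```python
from ctypes.util import find_library

# Hypothetical mapping from optional module to the library it links with.
OPTIONAL_MODULES = {
    "zlib": "z",
    "bsddb": "db",
}

def enabled_modules(probe=find_library):
    """Return the optional modules whose library the probe can find."""
    return sorted(mod for mod, lib in OPTIONAL_MODULES.items() if probe(lib))

def setup_lines(modules):
    """Render one illustrative Setup-style line per enabled module."""
    return [f"{mod} {mod}module.c -l{OPTIONAL_MODULES[mod]}" for mod in modules]

# Usage with a fake probe that pretends only libz is installed.
fake_probe = lambda lib: lib == "z"
print(setup_lines(enabled_modules(fake_probe)))  # ['zlib zlibmodule.c -lz']
```

Passing the probe as a parameter keeps the logic testable; calling
enabled_modules() with no argument would consult the real system.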

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
The most merciful thing in the world... is the inability of the human mind to
correlate all its contents.
    -- H.P. Lovecraft


From petrilli at amber.org  Sat Dec 11 06:54:41 1999
From: petrilli at amber.org (Christopher Petrilli)
Date: Sat, 11 Dec 1999 00:54:41 -0500
Subject: [Python-Dev] Enabling more modules by default
In-Reply-To: <14417.40046.850655.491684@amarok.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Fri, Dec 10, 1999 at 07:35:58PM -0500
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <38519531.15439641@post.demon.co.uk> <14417.40046.850655.491684@amarok.cnri.reston.va.us>
Message-ID: <19991211005441.A20923@trump.amber.org>

Andrew M. Kuchling [akuchlin at mems-exchange.org] wrote:
> Andy Robinson writes:
> >...  So if you want to make PDFs, having zlib
> >around is very useful indeed...
> 
> This raises a good point, though I still dislike the idea of including
> the zlib library.  It would be nice if Setup.in would be autogenerated
> to compile all the modules it can -- bsddb if it finds libdb, zlib if
> it finds libz.a.  I vaguely recall once working on a Python script that
> would generate a customized Setup.in file, though I can't find it at
> the moment.  Given that someone has already suggested automatically
> enabling threads on those platforms that support it, why not go all
> the way?

Well, one warning about BSDdb is that it comes in 3 incarnations that 
all might be -ldb :-):

	1.85
	2.x
	3.x

and they are NOT compatible with each other.  1.85 has serious brain damage,
and honestly I'd love to see Robin Dunn's 2.x work rolled in to replace it,
but I'm not sure how viable that is: people might actually want the 1.85 breakage.

Chris
-- 
| Christopher Petrilli
| petrilli at amber.org


From gstein at lyra.org  Sat Dec 11 12:23:30 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 11 Dec 1999 03:23:30 -0800 (PST)
Subject: [Python-Dev] Zip format
In-Reply-To: <1267287023-386248@hypernet.com>
Message-ID: <Pine.LNX.4.10.9912110321010.16305-100000@nebula.lyra.org>

On Fri, 10 Dec 1999, Gordon McMillan wrote:
>...
> If the user imports foo.spam.bar, an importer will be asked for:
>   foo (return foo.__init__)
>   foo.spam (return foo.bar.__init__)

                         ^^^ foo.spam.__init__

>   foo.spam.bar (return foo.spam.bar)

The above sequence is what currently happens.

> But the API allows lots of variations. This is another possible 
> interaction:
>   foo (return None)
>   foo.__init__ (return foo.__init__)
>   foo.spam (return None)
>   foo.bar.__init__ (return foo.bar.__init__)
>   foo.spam.bar (return foo.spam.bar)

The core of imputil has no knowledge of the __init__ thingy. That is
specific to the filesystem-based stuff. So in this sense, "possible" means
"imputil could be changed to do this". I would argue against the change,
however :-)

> Or, by looking at different args to get_code, you could look at 
> the requests as:
>   foo in context of None
>   spam in context of foo
>   bar in context of foo.spam

Bing!
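
That last view can be sketched as a generator of (context, name) pairs.
This only illustrates the sequence of requests; import_requests is a
hypothetical name and is not imputil's get_code() signature:

```python
def import_requests(dotted_name):
    """Yield (context, part) pairs for each step of a dotted import."""
    context = None
    for part in dotted_name.split("."):
        yield context, part
        # The module just requested becomes the context for the next part.
        context = part if context is None else f"{context}.{part}"

print(list(import_requests("foo.spam.bar")))
# [(None, 'foo'), ('foo', 'spam'), ('foo.spam', 'bar')]
```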

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein at lyra.org  Sat Dec 11 12:26:59 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 11 Dec 1999 03:26:59 -0800 (PST)
Subject: [Python-Dev] Zip format
In-Reply-To: <14417.11137.562474.99270@amarok.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912110323510.16305-100000@nebula.lyra.org>

On Fri, 10 Dec 1999, Andrew M. Kuchling wrote:
> M.-A. Lemburg writes:
> >There were issues with zlib 1.0.4 and later ones. Also, many
> >Linux distributions don't have the zlib header files installed.
> 
> For example, on RH6.0, zlib.h and zlib.a are in zlib-devel.XXX.rpm,
> and zlib.XXX.rpm only contains libz.so.  On the other hand, anyone
> who's compiling Python should really have the various -devel RPMs

Exactly. The distro's *have* the headers -- it all depends on what you
installed. I happen to have the headers on my system (because I installed
zlib-devel, as AMK mentions).

> installed.  I'd argue against including it, because it might cause odd
> versioning problems.  For example, what if I have PIL compiled against
> zlib1.1.2 (zlib is used for writing PNGs) and the Python binary
> includes zlib1.1.3?  There might be hard-to-debug problems
> caused by calling the wrong symbol.

I totally agree.

>...
> Just received Guido's email suggesting skipping compression in
> archives; not a bad idea.  You'd use less CPU, but might do
> more I/O because you're reading more sectors off disk.  There
> probably isn't much need for compression when the archive is on-disk;
> Java needed it because of applets.

There are all kinds of things that we can do here. Consider mmap'ing the
archive into a shared memory segment, used by all the Python processes on
the system... woo! :-)

IMO, the standard distro can use zip files, and just bail if they are
compressed and Python cannot load zlib. Obvious failure with an obvious
remedy. No big deal.
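
A sketch of that "bail" behaviour, assuming a Python that ships the
zipfile module; read_member and its have_zlib parameter are hypothetical
names used only for this illustration:

```python
import io
import zipfile

try:
    import zlib
except ImportError:
    zlib = None

def read_member(archive, name, have_zlib=zlib is not None):
    """Read one member, failing loudly if it needs zlib and we lack it."""
    info = archive.getinfo(name)
    if info.compress_type != zipfile.ZIP_STORED and not have_zlib:
        raise ImportError(f"{name!r} is compressed, but zlib is not available")
    return archive.read(name)

# Usage: a stored (uncompressed) member is readable even without zlib.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as archive:
    archive.writestr("foo/__init__.py", "")

with zipfile.ZipFile(buf) as archive:
    print(read_member(archive, "foo/__init__.py", have_zlib=False))  # b''
```

The error message names both the problem and the remedy, which is what
makes the failure obvious.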

As Guido also mentions, an installer can just bring along zlib if they
want to use a compressed archive. i.e. their choice.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein at lyra.org  Sat Dec 11 12:33:47 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 11 Dec 1999 03:33:47 -0800 (PST)
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: <14417.7909.511437.230915@weyr.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912110332360.16305-100000@nebula.lyra.org>

On Fri, 10 Dec 1999, Fred L. Drake, Jr. wrote:
> Greg Stein writes:
>  > No need to do a new type. Just wrap the DIR* into a PyCObject. Add a magic
>  > number if you're worried about mixing CObjects.
> 
>   That's certainly one option, but I would have made readdir(),
> seekdir(), rewinddir() and closedir() into the methods read(), seek(), 
> rewind() and close().  So it's a question of what interface you
> prefer; functions with magically interpreted token parameters (kind of 
> like file descriptors, hey!), or something that is more recognizably
> object-oriented.
>   I know my preference.  ;-)

Well, I know my preference of those two alternatives, too :-), but if
we're going with the Pythonic minimalism, then I'd think you would expose
the functions "as close as possible."

Would I argue if you went with a method-based approach? No :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From fredrik at pythonware.com  Sat Dec 11 14:07:08 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat, 11 Dec 1999 14:07:08 +0100
Subject: [Python-Dev] Zip format
References: <Pine.LNX.4.10.9912110323510.16305-100000@nebula.lyra.org>
Message-ID: <005f01bf43d8$a22b0af0$f29b12c2@secret.pythonware.com>

Greg Stein <gstein at lyra.org> wrote:
> There are all kinds of things that we can do here. Consider mmap'ing the
> archive into a shared memory segment, used by all the Python processes on
> the system... woo! :-)

it doesn't really look like this, but I hope we're defining
interfaces here, and not just "one true solution".  I'd be
very annoyed if it turned out that we couldn't use works'
archives with the new standard importer...

> As Guido also mentions, an installer can just bring along zlib if they
> want to use a compressed archive. i.e. their choice.

in the pythonworks universe, the installer and the
application is the same thing...

</F>


From fredrik at pythonware.com  Sat Dec 11 14:12:12 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat, 11 Dec 1999 14:12:12 +0100
Subject: [Python-Dev] Thankyou for fsync :)
References: <38503BDC.CB91FB29@digicool.com><199912100203.VAA07410@eric.cnri.reston.va.us> <14417.14789.306365.439782@weyr.cnri.reston.va.us>
Message-ID: <006c01bf43d9$57bc0f90$f29b12c2@secret.pythonware.com>

Fred L. Drake, Jr. <fdrake at acm.org> wrote:
>   fsync() isn't listed in O'Reilly's POSIX book, so it's probably not
> in the POSIX spec.  Neither is the tempnam() function I added in
> yesterday's spree, though tmpfile() and tmpnam() are.

instead of guessing, you can get a complete
list from:

http://www.unix-systems.org/apis.html

reading up on the "single unix specification"
should also help:

http://www.unix-systems.org/online.html

(registration required; contains complete man
pages for all functions covered by the UNIX95
and UNIX98 specification)

</F>


From gstein at lyra.org  Sat Dec 11 14:10:00 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 11 Dec 1999 05:10:00 -0800 (PST)
Subject: [Python-Dev] Zip format
In-Reply-To: <005f01bf43d8$a22b0af0$f29b12c2@secret.pythonware.com>
Message-ID: <Pine.LNX.4.10.9912110505580.16305-100000@nebula.lyra.org>

On Sat, 11 Dec 1999, Fredrik Lundh wrote:
> Greg Stein <gstein at lyra.org> wrote:
> > There are all kinds of things that we can do here. Consider mmap'ing the
> > archive into a shared memory segment, used by all the Python processes on
> > the system... woo! :-)
> 
> it doesn't really look like this, but I hope we're defining
> interfaces here, and not just "one true solution".  I'd be

Oh, I was just having fun there :-). I don't see "one true solution" at
all. Just some standards.

> very annoyed if it turned out that we couldn't use works'
> archives with the new standard importer...

get_code() and its processing is not going anywhere. Some stuff will
change under the covers, and we'll be using sys.path (typically) rather
than chaining (although chaining will still exist!).

I would think that your Importer subclass would be directly usable, but
the installation could/would be a bit different. Heck, worst case, nothing
is going to invalidate your archive format -- feel free to berate me if I
ever break that!

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From jim at interet.com  Mon Dec 13 15:50:11 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 13 Dec 1999 09:50:11 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <38510254.ED15D32B@interet.com>
Message-ID: <385507A3.9F6AAF0F@interet.com>

> Jean-Claude Wippler wrote:
> 
> > Ouch - what's wrong with zip archives?
> 
> > With all due respect - I sincerely hope you will reconsider and alter
> > your code to work with zip files.  It's probably a small adjustment?

OK, I now have a new module "zipfile" which reads and
writes ZIP files.  It is written in Python and has been tested
on Windows and Linux.  I tested it with WinZip and found that
the files it creates are read OK with WinZip, and WinZip
files are read OK with zipfile.  So I am withdrawing my
Python archive file format, and re-writing all my stuff
using zipfile.  It should all be done in a week.

Basically everything works fine.  But there are some problems.

Python seems to lack a CRC-32 function, so I wrote one
in Python.  It is slow.  We need to add a CRC-32 function
to some Python built-in module that is always present, like
md5 or binascii.  The zlib module is not necessarily present.

I can't seem to get WinZip to record a partial path.  That is,
I want the ./Lib/test package to have these ZIP paths:
  test/__init__.pyc
  test/testall.pyc
  ...
but WinZip creates files with either no path at all or the
fully specified path.  Am I missing something?  Do all
other ZIP tools do this too?
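
A minimal sketch of recording such partial paths from Python, assuming
the interface of the zipfile module as found in today's standard library
(ZipFile.write() with an arcname argument), which may differ from the
draft module described in this message.  The helper name zip_package and
the throwaway files are made up for illustration.

```python
import os
import tempfile
import zipfile


def zip_package(lib_dir, package, archive_path):
    """Store every .pyc file of lib_dir/package under package/<name>."""
    pkg_dir = os.path.join(lib_dir, package)
    with zipfile.ZipFile(archive_path, "w") as zf:
        for name in sorted(os.listdir(pkg_dir)):
            if name.endswith(".pyc"):
                # arcname is the path recorded inside the archive;
                # ZIP member names always use "/" as the separator.
                zf.write(os.path.join(pkg_dir, name),
                         arcname=package + "/" + name)
    with zipfile.ZipFile(archive_path) as zf:
        return zf.namelist()


# Demonstration with placeholder files standing in for ./Lib/test.
with tempfile.TemporaryDirectory() as tmp:
    lib = os.path.join(tmp, "Lib")
    os.makedirs(os.path.join(lib, "test"))
    for fname in ("__init__.pyc", "testall.pyc"):
        with open(os.path.join(lib, "test", fname), "wb") as f:
            f.write(b"placeholder")
    names = zip_package(lib, "test", os.path.join(tmp, "lib.zip"))
    print(names)
```

The recorded names are relative to the Lib directory, so neither the
full path nor a bare file name ends up in the archive.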

JimA


From guido at CNRI.Reston.VA.US  Mon Dec 13 16:22:12 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 13 Dec 1999 10:22:12 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Mon, 13 Dec 1999 09:50:11 EST."
             <385507A3.9F6AAF0F@interet.com> 
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <38510254.ED15D32B@interet.com>  
            <385507A3.9F6AAF0F@interet.com> 
Message-ID: <199912131522.KAA18858@eric.cnri.reston.va.us>

> OK, I now have a new module "zipfile" which reads and
> writes ZIP files.  It is written in Python and has been tested
> on Windows and Linux.  I tested it with WinZip and found that
> the files it creates are read OK with WinZip, and WinZip
> files are read OK with zipfile.  So I am withdrawing my
> Python archive file format, and re-writing all my stuff
> using zipfile.  It should all be done in a week.

Ah, good!  (This saves me the trouble of cleaning up our own zip code :-)

> Basically everything works fine.  But there are some problems.
> 
> Python seems to lack a CRC-32 function, so I wrote one
> in Python.  It is slow.  We need to add a CRC-32 function
> to some Python built-in module that is always present, like
> md5 or binascii.  The zlib module is not necessarily present.
> 
> I can't seem to get WinZip to record a partial path.  That is,
> I want the ./Lib/test package to have these ZIP paths:
>   test/__init__.pyc
>   test/testall.pyc
>   ...
> but WinZip creates files with either no path at all or the
> fully specified path.  Am I missing something?  Do all
> other ZIP tools do this too?

Unclick the "Save Extra Folder Info" and then drag the *parent* folder
into the archive.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake at acm.org  Mon Dec 13 18:00:26 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 13 Dec 1999 12:00:26 -0500 (EST)
Subject: [Python-Dev] confstr(), fpathconf(), pathconf(), sysconf()
Message-ID: <14421.9770.623399.673010@weyr.cnri.reston.va.us>

  I've just checked in bindings for these POSIX.1 and POSIX.2
functions, and thought I'd explain the interfaces for those who don't
want to read the diffs.  ;)
  These functions expect a "name" parameter (that's how it's described 
in the man pages and the O'Reilly book).  The value for "name" is an
integer that's defined in the system headers.  The constants all have
the form

    _XX_SOME_NAME

where XX is PC for fpathconf()- and pathconf()-related names, SC for
sysconf()-related names, and CS for confstr()-related names.  Some
names are defined by the standards, but additional names are defined
by implementations (there are a *lot* of sysconf() names under
Solaris!).
  We don't want to expose enormous numbers of constants in the
module's interface, however, as there are already a lot of names in
the posix module.  That would also slow down module initialization.
We also don't want to force callers to use magic numbers in code that 
uses these functions, especially since the values may be
system-specific.
  The best way to call these functions, then, is to use a *string*
that corresponds to the name of the C #define symbol with the leading 
underscore stripped off.  For example, to get the maximum length of the
arguments to exec(), you could say:

    num_args = os.sysconf("SC_ARG_MAX")

  The string will be mapped to the appropriate numeric value defined
in an internal table.  If the name isn't defined for the platform, a
ValueError will be raised.

    >>> num_args = os.sysconf("FOO_BAR")
    Traceback (innermost last):
      File "<stdin>", line 1, in ?
    ValueError: unrecognized configuration name

  To allow retrieval of platform-dependent configuration information, 
integers can also be passed in.  On Solaris, this is equivalent to
using "SC_ARG_MAX":

    num_args = os.sysconf(1)

(Ignoring the portability and readability issues, ha!)
  There are three separate tables used for this; one for confstr(),
one for sysconf(), and one shared by fpathconf() and pathconf().  The
names used to build the tables come from Linux and Solaris; we can add 
other names as needed.  To add names, I'd need the names to add and
how to test for their existence at compile time (#ifdef, etc.).
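
  A short sketch of how calling code might cope with names that are
missing on a given platform.  It assumes a Unix build where os.sysconf()
is available; the wrapper name sysconf_or_none is made up for the
example, and os.sysconf_names is the name table as current Python
exposes it, which this message does not describe.

```python
import os


def sysconf_or_none(name):
    """Return os.sysconf(name), or None when the name cannot be used."""
    try:
        return os.sysconf(name)
    except ValueError:
        # The string is not in the internal name table at all.
        return None
    except OSError:
        # The name is known, but the system does not support it.
        return None


print(sysconf_or_none("SC_ARG_MAX"))     # platform-dependent integer
print(sysconf_or_none("FOO_BAR"))        # None: unrecognized name
print("SC_ARG_MAX" in os.sysconf_names)  # membership test on the table
```

  Checking the table first avoids the exception entirely; catching
ValueError keeps the call site short when a missing name is rare.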


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From fdrake at acm.org  Mon Dec 13 19:35:49 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 13 Dec 1999 13:35:49 -0500 (EST)
Subject: [Python-Dev] CVS: python/dist/src/Modules posixmodule.c,2.116,2.117
In-Reply-To: <Pine.LNX.4.10.9912131025480.16305-100000@nebula.lyra.org>
References: <199912131637.LAA17318@weyr.cnri.reston.va.us>
	<Pine.LNX.4.10.9912131025480.16305-100000@nebula.lyra.org>
Message-ID: <14421.15493.28263.387680@weyr.cnri.reston.va.us>

Greg Stein writes:
 > I'm not very familiar with these APIs, but should you let go of the
 > interpreter lock when you call them?
 > (and for the other new funcs)

  None of these should be doing any I/O as far as I can determine.
Whenever I get to getlogin() (which AMK & I decided should be
included, based on the specs that /F pointed us to), I will release
the interpreter lock for the getlogin_r() variant.  I'm not sure I
should release it for the non-reentrant getlogin(), however; the
specification for getlogin*() pretty much requires that it read from
utmp.  ;(


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From gstein at lyra.org  Mon Dec 13 21:31:22 1999
From: gstein at lyra.org (Greg Stein)
Date: Mon, 13 Dec 1999 12:31:22 -0800 (PST)
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <385507A3.9F6AAF0F@interet.com>
Message-ID: <Pine.LNX.4.10.9912131229590.16305-100000@nebula.lyra.org>

On Mon, 13 Dec 1999, James C. Ahlstrom wrote:
>...
> OK, I now have a new module "zipfile" which reads and
> writes ZIP files.  It is written in Python and has been tested
> on Windows and Linux.  I tested it with WinZip and found that
> the files it creates are read OK with WinZip, and WinZip
> files are read OK with zipfile.  So I am withdrawing my
> Python archive file format, and re-writing all my stuff
> using zipfile.  It should all be done in a week.

Can you post zipfile.py so that people can start reviewing that?

>...
> Python seems to lack a CRC-32 function, so I wrote one
> in Python.  It is slow.  We need to add a CRC-32 function
> to some Python built-in module that is always present, like
> md5 or binascii.  The zlib module is not necessarily present.

See zlib.crc32()

This is interesting, of course, because we have previously stated that
zlib (and its compression) is optional. But if we need the CRC-32
function...

hehe...

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From tim_one at email.msn.com  Mon Dec 13 23:11:33 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Mon, 13 Dec 1999 17:11:33 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <385507A3.9F6AAF0F@interet.com>
Message-ID: <000401bf45b7$04edfaa0$96a2143f@tim>

[James C. Ahlstrom]
> ...
> Python seems to lack a CRC-32 function, so I wrote one
> in Python.  It is slow.  We need to add a CRC-32 function
> to some Python built-in module that is always present, like
> md5 or binascii.  The zlib module is not necessarily present.

Unfortunately, there are many different CRC functions in common use.  None
belong in md5; if the intent is to support just zip's version, adding a
(say) zipcrc32 function to binascii would be ok; if we expect to support
others as well, a new parameterized crc module would be in order.
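
To make the "parameterized" option concrete, here is a rough sketch of
what such an interface could look like.  The factory name make_crc and
its parameters are hypothetical, not an existing module; it handles only
bit-reversed (LSB-first) polynomials and works bit by bit, so it is slow.
The comparison uses binascii.crc32 as found in current Python.

```python
import binascii


def make_crc(poly, width, init, xorout):
    """Build a CRC function for a bit-reversed (LSB-first) polynomial."""
    mask = (1 << width) - 1

    def crc(data):
        reg = init
        for byte in data:
            reg ^= byte
            for _ in range(8):
                # Shift right; apply the polynomial when a 1 bit falls off.
                if reg & 1:
                    reg = (reg >> 1) ^ poly
                else:
                    reg >>= 1
        return (reg ^ xorout) & mask

    return crc


# zip's CRC: all-ones initial register and all-ones final XOR.
zip_crc32 = make_crc(0xEDB88320, 32, 0xFFFFFFFF, 0xFFFFFFFF)
# A 16-bit variant (commonly called CRC-16/ARC): zero init, no final XOR.
crc16_arc = make_crc(0xA001, 16, 0x0000, 0x0000)

sample = b"123456789"
print(hex(zip_crc32(sample)))
print(hex(crc16_arc(sample)))
print(zip_crc32(sample) == binascii.crc32(sample) & 0xFFFFFFFF)
```

The same loop serves both widths; only the four parameters change, which
is the argument for one parameterized module over a pile of one-off
functions.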

> I can't seem to get WinZip to record a partial path.  That is,
> I want the ./Lib/test package to have these ZIP paths:
>   test/__init__.pyc
>   test/testall.pyc
>   ...
> but WinZip creates files with either no path at all or the
> fully specified path.  Am I missing something?  Do all
> other ZIP tools do this too?

No, it's a clumsiness unique to WinZip (damn GUIs <0.9 wink>).  In the Add
dialog box, you need to cd to the *Lib* directory, check the "Save extra
folder info" box, and then, e.g.,

1. Put
      test\*.pyc
   in the Add Files line, and click Add With Wildcards.
   Then all test\*.pyc files will be added, with paths test/__init__.pyc
   etc.

or

2. Put
      "test\__init__.pyc" "test\testall.pyc"
   (including the quotes!) in the Add Files line, and click Add.

Since #2 can be unbearable, other useful strategies include:

3. Use #1 (e.g. with dir\*.*) then delete the files you didn't really
   want.

4. Use #1 repeatedly, cleverly using a number of wildcard patterns that
   cover the files of interest.

5. Mixtures of #3 and #4.

6. Use a command-line zip tool instead (e.g., pkzip; I think WinZip has
   an "experimental" cmdline add-on too, but haven't tried it).


From jim at interet.com  Tue Dec 14 14:13:03 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Tue, 14 Dec 1999 08:13:03 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <Pine.LNX.4.10.9912131229590.16305-100000@nebula.lyra.org>
Message-ID: <3856425F.8C5E7A42@interet.com>

Greg Stein wrote:
> 

> Can you post zipfile.py so that people can start reviewing that?

Yes, it will be available by next Monday.  I just want to
get it really working and pretty, and with documentation.

JimA


From jim at interet.com  Tue Dec 14 14:26:50 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Tue, 14 Dec 1999 08:26:50 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000401bf45b7$04edfaa0$96a2143f@tim>
Message-ID: <3856459A.BF5A798A@interet.com>

Tim Peters wrote:
> 
> [James C. Ahlstrom]
> > ...
> > Python seems to lack a CRC-32 function, so I wrote one
>
> Unfortunately, there are many different CRC functions in common use.  None
> belong in md5; if the intent is to support just zip's version, adding a
> (say) zipcrc32 function to binascii would be ok; if we expect to support
> others as well, a new parameterized crc module would be in order.

OK, a CRC-32 in binascii it is.  The CRC-32 I
have comes with these comments which seem to indicate it is a
more "official standard" CRC-32 than average:

# *  Crc - 32 BIT ANSI X3.66 CRC checksum files
#*********************************************************************\
#*                                                                    *|
#* Demonstration program to compute the 32-bit CRC used as the frame  *|
#* check sequence in ADCCP (ANSI X3.66, also known as FIPS PUB 71     *|
#* and FED-STD-1003, the U.S. versions of CCITT's X.25 link-level     *|
#* protocol).  The 32-bit FCS was added via the Federal Register,     *|
#* 1 June 1982, p.23798.  I presume but don't know for certain that   *|
#* this polynomial is or will be included in CCITT V.41, which        *|
#* defines the 16-bit CRC (often called CRC-CCITT) polynomial.  FIPS  *|
#* PUB 78 says that the 32-bit FCS reduces otherwise undetected       *|
#* errors by a factor of 10^-5 over 16-bit FCS.                       *|
#*                                                                    *|
#*********************************************************************
#* Copyright (C) 1986 Gary S. Brown.  You may use this program, or
#* code or tables extracted from it, as desired without restriction.
 
I can submit this as a patch to binascii, or if the Copyright bothers
anyone, maybe it is better for Guido to use his CRC-32 from his ZIP
code.  Preference?

> > I can't seem to get WinZip to record a partial path.  That is,
>
> dialog box, you need to cd to the *Lib* directory, check the "Save extra
> folder info" box, and then, e.g.,

Thanks.  I knew there had to be some magic incantation to do it.
 
> 6. Use a command-line zip tool instead (e.g., pkzip; I think WinZip has
>    an "experimental" cmdline add-on too, but haven't tried it).

Actually pkzip 2.04g doesn't work because it writes names in upper case
and is limited to 8.3 names (I think).  My zipfile.py can be used as
a basis for a command line tool.  Actually I use makefiles with embedded
Python programs and find this easier than command line tools.

JimA


From guido at CNRI.Reston.VA.US  Tue Dec 14 15:53:04 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 14 Dec 1999 09:53:04 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Tue, 14 Dec 1999 08:26:50 EST."
             <3856459A.BF5A798A@interet.com> 
References: <000401bf45b7$04edfaa0$96a2143f@tim>  
            <3856459A.BF5A798A@interet.com> 
Message-ID: <199912141453.JAA23429@eric.cnri.reston.va.us>

> OK, a CRC-32 in binascii it is.  The CRC-32 I
> have comes with these comments which seem to indicate it is a
> more "official standard" CRC-32 than average:
> 
> # *  Crc - 32 BIT ANSI X3.66 CRC checksum files
> #*********************************************************************\
> #*                                                                    *|
> #* Demonstration program to compute the 32-bit CRC used as the frame  *|
> #* check sequence in ADCCP (ANSI X3.66, also known as FIPS PUB 71     *|
> #* and FED-STD-1003, the U.S. versions of CCITT's X.25 link-level     *|
> #* protocol).  The 32-bit FCS was added via the Federal Register,     *|
> #* 1 June 1982, p.23798.  I presume but don't know for certain that   *|
> #* this polynomial is or will be included in CCITT V.41, which        *|
> #* defines the 16-bit CRC (often called CRC-CCITT) polynomial.  FIPS  *|
> #* PUB 78 says that the 32-bit FCS reduces otherwise undetected       *|
> #* errors by a factor of 10^-5 over 16-bit FCS.                       *|
> #*                                                                    *|
> #*********************************************************************
> #* Copyright (C) 1986 Gary S. Brown.  You may use this program, or
> #* code or tables extracted from it, as desired without restriction.
>  
> I can submit this as a patch to binascii, or if the Copyright bothers
> anyone, maybe it is better for Guido to use his CRC-32 from his ZIP
> code.  Preference?

I looked, but "my" crc32 in the zlib module (which was actually
contributed by Andrew Kuchling) is just a wrapper around the crc32
function in zlib, which is copyrighted by Mark Adler and follows the
zlib rules.

I propose to use Gary Brown's code.  I'll defend this to CNRI's
lawyers if need be.

Jim, have you checked that this is the right CRC to use for zip's CRC?
(This in the light of Tim's assertion that there are many CRCs around.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim at interet.com  Tue Dec 14 16:22:56 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Tue, 14 Dec 1999 10:22:56 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000401bf45b7$04edfaa0$96a2143f@tim>  
	            <3856459A.BF5A798A@interet.com> <199912141453.JAA23429@eric.cnri.reston.va.us>
Message-ID: <385660D0.C6C0C7B9@interet.com>

Guido van Rossum wrote:

> I propose to use Gary Brown's code.  I'll defend this to CNRI's
> lawyers if need be.
> 
> Jim, have you checked that this is the right CRC to use for zip's CRC?
> (This in the light of Tim's assertion that there are many CRCs around.)

The CRC it calculates agrees with the CRC of WinZip for all
files I have tried.  The original Gary Brown code was much
longer and included file reading.  Here is the shortened version:

JimA


# *  Crc - 32 BIT ANSI X3.66 CRC checksum files
#*********************************************************************\
#*                                                                    *|
#* Demonstration program to compute the 32-bit CRC used as the frame  *|
#* check sequence in ADCCP (ANSI X3.66, also known as FIPS PUB 71     *|
#* and FED-STD-1003, the U.S. versions of CCITT's X.25 link-level     *|
#* protocol).  The 32-bit FCS was added via the Federal Register,     *|
#* 1 June 1982, p.23798.  I presume but don't know for certain that   *|
#* this polynomial is or will be included in CCITT V.41, which        *|
#* defines the 16-bit CRC (often called CRC-CCITT) polynomial.  FIPS  *|
#* PUB 78 says that the 32-bit FCS reduces otherwise undetected       *|
#* errors by a factor of 10^-5 over 16-bit FCS.                       *|
#*                                                                    *|
#*********************************************************************

#
#* Copyright (C) 1986 Gary S. Brown.  You may use this program, or
#* code or tables extracted from it, as desired without restriction.
 
# First, the polynomial itself and its table of feedback terms.  The  
# polynomial is                                                       
# X^32+X^26+X^23+X^22+X^16+X^12+X^11+X^10+X^8+X^7+X^5+X^4+X^2+X^1+X^0 
# Note that we take it "backwards" and put the highest-order term in  
# the lowest-order bit.  The X^32 term is "implied"; the LSB is the   
# X^31 term, etc.  The X^0 term (usually shown as "+1") results in    
# the MSB being 1.                                                    

# Note that the usual hardware shift register implementation, which   
# is what we're using (we're merely optimizing it by doing eight-bit  
# chunks at a time) shifts bits into the lowest-order term.  In our   
# implementation, that means shifting towards the right.  Why do we   
# do it this way?  Because the calculated CRC must be transmitted in  
# order from highest-order term to lowest-order term.  UARTs transmit 
# characters in order from LSB to MSB.  By storing the CRC this way,  
# we hand it to the UART in the order low-byte to high-byte; the UART 
# sends each low-bit to high-bit; and the result is transmission bit 
# by bit from highest- to lowest-order term without requiring any bit 
# shuffling on our part.  Reception works similarly.                  

# The feedback terms table consists of 256, 32-bit entries.  Notes:   
#                                                                     
#  1. The table can be generated at runtime if desired; code to do so 
#     is shown later.  It might not be obvious, but the feedback      
#     terms simply represent the results of eight shift/xor opera-    
#     tions for all combinations of data and CRC register values.     
#                                                                     
#  2. The CRC accumulation logic is the same for all CRC polynomials, 
#     be they sixteen or thirty-two bits wide.  You simply choose the 
#     appropriate table.  Alternatively, because the table can be     
#     generated at runtime, you can start by generating the table for 
#     the polynomial in question and use exactly the same "updcrc",   
#     if your application needn't simultaneously handle two CRC       
#     polynomials.  (Note, however, that XMODEM is strange.)          
#                                                                     
#  3. For 16-bit CRCs, the table entries need be only 16 bits wide;   
#     of course, 32-bit entries work OK if the high 16 bits are zero. 
#                                                                     
#  4. The values must be right-shifted by eight bits by the "updcrc"  
#     logic; the shift must be unsigned (bring in zeroes).  On some   
#     hardware you could probably optimize the shift in assembler by  
#     using byte-swap instructions.                                   

# Converted to Python by James C. Ahlstrom

crc_32_tab = [	# CRC polynomial 0xedb88320
0x00000000, 0x77073096, 0xee0e612c, 0x990951ba, 0x076dc419, 0x706af48f,
0xe963a535, 0x9e6495a3,
0x0edb8832, 0x79dcb8a4, 0xe0d5e91e, 0x97d2d988, 0x09b64c2b, 0x7eb17cbd,
0xe7b82d07, 0x90bf1d91,
0x1db71064, 0x6ab020f2, 0xf3b97148, 0x84be41de, 0x1adad47d, 0x6ddde4eb,
0xf4d4b551, 0x83d385c7,
0x136c9856, 0x646ba8c0, 0xfd62f97a, 0x8a65c9ec, 0x14015c4f, 0x63066cd9,
0xfa0f3d63, 0x8d080df5,
0x3b6e20c8, 0x4c69105e, 0xd56041e4, 0xa2677172, 0x3c03e4d1, 0x4b04d447,
0xd20d85fd, 0xa50ab56b,
0x35b5a8fa, 0x42b2986c, 0xdbbbc9d6, 0xacbcf940, 0x32d86ce3, 0x45df5c75,
0xdcd60dcf, 0xabd13d59,
0x26d930ac, 0x51de003a, 0xc8d75180, 0xbfd06116, 0x21b4f4b5, 0x56b3c423,
0xcfba9599, 0xb8bda50f,
0x2802b89e, 0x5f058808, 0xc60cd9b2, 0xb10be924, 0x2f6f7c87, 0x58684c11,
0xc1611dab, 0xb6662d3d,
0x76dc4190, 0x01db7106, 0x98d220bc, 0xefd5102a, 0x71b18589, 0x06b6b51f,
0x9fbfe4a5, 0xe8b8d433,
0x7807c9a2, 0x0f00f934, 0x9609a88e, 0xe10e9818, 0x7f6a0dbb, 0x086d3d2d,
0x91646c97, 0xe6635c01,
0x6b6b51f4, 0x1c6c6162, 0x856530d8, 0xf262004e, 0x6c0695ed, 0x1b01a57b,
0x8208f4c1, 0xf50fc457,
0x65b0d9c6, 0x12b7e950, 0x8bbeb8ea, 0xfcb9887c, 0x62dd1ddf, 0x15da2d49,
0x8cd37cf3, 0xfbd44c65,
0x4db26158, 0x3ab551ce, 0xa3bc0074, 0xd4bb30e2, 0x4adfa541, 0x3dd895d7,
0xa4d1c46d, 0xd3d6f4fb,
0x4369e96a, 0x346ed9fc, 0xad678846, 0xda60b8d0, 0x44042d73, 0x33031de5,
0xaa0a4c5f, 0xdd0d7cc9,
0x5005713c, 0x270241aa, 0xbe0b1010, 0xc90c2086, 0x5768b525, 0x206f85b3,
0xb966d409, 0xce61e49f,
0x5edef90e, 0x29d9c998, 0xb0d09822, 0xc7d7a8b4, 0x59b33d17, 0x2eb40d81,
0xb7bd5c3b, 0xc0ba6cad,
0xedb88320, 0x9abfb3b6, 0x03b6e20c, 0x74b1d29a, 0xead54739, 0x9dd277af,
0x04db2615, 0x73dc1683,
0xe3630b12, 0x94643b84, 0x0d6d6a3e, 0x7a6a5aa8, 0xe40ecf0b, 0x9309ff9d,
0x0a00ae27, 0x7d079eb1,
0xf00f9344, 0x8708a3d2, 0x1e01f268, 0x6906c2fe, 0xf762575d, 0x806567cb,
0x196c3671, 0x6e6b06e7,
0xfed41b76, 0x89d32be0, 0x10da7a5a, 0x67dd4acc, 0xf9b9df6f, 0x8ebeeff9,
0x17b7be43, 0x60b08ed5,
0xd6d6a3e8, 0xa1d1937e, 0x38d8c2c4, 0x4fdff252, 0xd1bb67f1, 0xa6bc5767,
0x3fb506dd, 0x48b2364b,
0xd80d2bda, 0xaf0a1b4c, 0x36034af6, 0x41047a60, 0xdf60efc3, 0xa867df55,
0x316e8eef, 0x4669be79,
0xcb61b38c, 0xbc66831a, 0x256fd2a0, 0x5268e236, 0xcc0c7795, 0xbb0b4703,
0x220216b9, 0x5505262f,
0xc5ba3bbe, 0xb2bd0b28, 0x2bb45a92, 0x5cb36a04, 0xc2d7ffa7, 0xb5d0cf31,
0x2cd99e8b, 0x5bdeae1d,
0x9b64c2b0, 0xec63f226, 0x756aa39c, 0x026d930a, 0x9c0906a9, 0xeb0e363f,
0x72076785, 0x05005713,
0x95bf4a82, 0xe2b87a14, 0x7bb12bae, 0x0cb61b38, 0x92d28e9b, 0xe5d5be0d,
0x7cdcefb7, 0x0bdbdf21,
0x86d3d2d4, 0xf1d4e242, 0x68ddb3f8, 0x1fda836e, 0x81be16cd, 0xf6b9265b,
0x6fb077e1, 0x18b74777,
0x88085ae6, 0xff0f6a70, 0x66063bca, 0x11010b5c, 0x8f659eff, 0xf862ae69,
0x616bffd3, 0x166ccf45,
0xa00ae278, 0xd70dd2ee, 0x4e048354, 0x3903b3c2, 0xa7672661, 0xd06016f7,
0x4969474d, 0x3e6e77db,
0xaed16a4a, 0xd9d65adc, 0x40df0b66, 0x37d83bf0, 0xa9bcae53, 0xdebb9ec5,
0x47b2cf7f, 0x30b5ffe9,
0xbdbdf21c, 0xcabac28a, 0x53b39330, 0x24b4a3a6, 0xbad03605, 0xcdd70693,
0x54de5729, 0x23d967bf,
0xb3667a2e, 0xc4614ab8, 0x5d681b02, 0x2a6f2b94, 0xb40bbe37, 0xc30c8ea1,
0x5a05df1b, 0x2d02ef8d
]


def crc32(string):
  crc = 0xFFFFFFFF
  for ch in string:
    crc = crc_32_tab[((crc) ^ ord(ch)) & 0xff] ^ (((crc) >> 8) & 0xFFFFFF)
  return ~crc
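
Note 1 in the comments says the table can be generated at runtime, but
the generating code was dropped from the shortened version.  The sketch
below is one way to do it, not Gary Brown's original routine: the names
make_crc_table and crc32_bytes are made up, the input is assumed to be a
bytes object (Python 3 style, so no ord() call), and the final comparison
relies on binascii.crc32 from current Python.

```python
import binascii


def make_crc_table(poly=0xEDB88320):
    """Return the 256 feedback terms for a bit-reversed polynomial."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            # One shift/xor step of the hardware shift register.
            if c & 1:
                c = (c >> 1) ^ poly
            else:
                c >>= 1
        table.append(c)
    return table


generated = make_crc_table()


def crc32_bytes(data, table=generated):
    """Table-driven CRC-32 over a bytes object, one byte per step."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = table[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


print(hex(generated[1]), hex(generated[128]), hex(generated[255]))
print(crc32_bytes(b"zipfile") == binascii.crc32(b"zipfile") & 0xFFFFFFFF)
```

Entries 1, 128 and 255 of the generated list can be compared by eye with
the second, 129th and last values of the literal table.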


From tim_one at email.msn.com  Tue Dec 14 18:06:36 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 14 Dec 1999 12:06:36 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <199912141453.JAA23429@eric.cnri.reston.va.us>
Message-ID: <000101bf4655$94e40840$3a2d153f@tim>

[Guido]
> I propose to use Gary Brown's code.  I'll defend this to CNRI's
> lawyers if need be.

If there's a hassle, I can do a clean-room implementation easily enough --
although I'd rather not.

> Jim, have you checked that this is the right CRC to use for zip's CRC?

If WinZip unzips Jim's files without griping, the odds that he's got the
wrong CRC are about 1 in 2**36 <wink>.

> (This in the light of Tim's assertion that there are many CRCs
> around.)

There are, and several others are hiding in assorted communications stds
(e.g., Ethernet uses a different 32-bit CRC); but the zip CRC is the one
you'll find most commonly described on the Web.

All the same, once Jim releases his code, I'll do an anal verification that
it's the right one.


From jim at interet.com  Tue Dec 14 18:54:35 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Tue, 14 Dec 1999 12:54:35 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000101bf4655$94e40840$3a2d153f@tim>
Message-ID: <3856845B.6C3C7330@interet.com>

Tim Peters wrote:

> If WinZip unzips Jim's files without griping, the odds that he's got the
> wrong CRC are about 1 in 2**36 <wink>.

You mean 2**32, right?  Oh, sorry, you must be
using a DEC-10  <wink again>.

JimA


From gstein at lyra.org  Tue Dec 14 20:23:36 1999
From: gstein at lyra.org (Greg Stein)
Date: Tue, 14 Dec 1999 11:23:36 -0800 (PST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <3856425F.8C5E7A42@interet.com>
Message-ID: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org>

On Tue, 14 Dec 1999, James C. Ahlstrom wrote:

> Greg Stein wrote:
> > 
> 
> > Can you post zipfile.py so that people can starting reviewing that?
> 
> Yes, it will be available by next Monday.  I just want to
> get it really working and pretty, and with documentation.

My point was that people could possibly use it *before* then. Not
everybody needs it to be pretty, needs doc, or needs it fully working.
Maybe people would like to provide feedback on the API. Maybe they'd like
to start their own modules that use your library.

This goes back to my years-old statement: release it now rather than later
-- people can always use it now, and there might not be a later.

Release early. Release often. :-)

People are too hesitant to release code. Why? Just send it out there. When
you update it, send out another. It doesn't hurt anybody to have more than
one release.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From tim_one at email.msn.com  Wed Dec 15 05:20:25 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 14 Dec 1999 23:20:25 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <3856845B.6C3C7330@interet.com>
Message-ID: <000501bf46b3$b6184f40$05a0143f@tim>

[Tim]
> If WinZip unzips Jim's files without griping, the odds that he's
> got the wrong CRC are about 1 in 2**36 <wink>.

[JimA]
> You mean 2**32, right?

Nope!  For each of the 2**32 polynomials you may have pulled out of thin
air, there are about a dozen common variations in the details of CRC
algorithms.  For example, a CRC used for hashing usually initializes "the
register" to 0, but a CRC used to protect against transmission errors
usually initializes to a block of 1 bits (since leading zeroes don't affect
the result, and a common transmission error is dropping a prefix of the
msg).  Similarly, algorithms vary in the order they scan the data; in
whether they use the raw data or its complement; and in whether they return
the actual remainder, the complement of the remainder, or a checksum
cleverly computed so that "the other end" always sees a fixed remainder
other than 0 (or ~0).
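
The point about the initial register value can be checked directly.  In
this sketch (the function name crc_raw is made up; it returns the raw
register with no final complement), a zero initial value cannot see
leading zero bytes, while an all-ones initial value can.

```python
def crc_raw(data, init, poly=0xEDB88320):
    """Bit-by-bit CRC register update; no final XOR is applied."""
    reg = init
    for byte in data:
        reg ^= byte
        for _ in range(8):
            if reg & 1:
                reg = (reg >> 1) ^ poly
            else:
                reg >>= 1
    return reg


msg = b"python-dev"
padded = b"\x00\x00\x00" + msg   # same message with a zero prefix

# With a zero register, zero bytes leave the register at zero.
zero_init_same = crc_raw(msg, 0) == crc_raw(padded, 0)
# With an all-ones register, the prefix changes the state.
ones_init_same = crc_raw(msg, 0xFFFFFFFF) == crc_raw(padded, 0xFFFFFFFF)

print(zero_init_same, ones_init_same)
```

So a dropped or added run of leading zeroes goes unnoticed only in the
zero-initialized variant.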

> Oh, sorry, you must be using a DEC-10  <wink again>.

I used a Univac 1108 in college, back when ASCII was in its infancy.  They
couldn't decide on the natural size for a character, so the 36-bit 1108
could be configured to treat each word as either 6 6-bit bytes or 4 9-bit
ones.  If they had been thinking ahead, they would have defined it as two
Unicode characters plus a 4-bit tag field for the Python implementation to
play with <wink>.

now-they-make-their-living-suing-.gif-bandits-ly y'rs  - tim


From tim_one at email.msn.com  Wed Dec 15 08:40:11 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 15 Dec 1999 02:40:11 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <385660D0.C6C0C7B9@interet.com>
Message-ID: <000b01bf46cf$9ebe27e0$05a0143f@tim>

[JimA posts his Python rendering of Gary Brown's code]

Yup!  That's the zip algorithm, right down to the absurdly bit-reversed
polynomial.

> def crc32(string):
>   crc = 0xFFFFFFFF
>   for ch in string:
>     crc = crc_32_tab[((crc) ^ ord(ch)) & 0xff] ^ (((crc) >> 8) & 0xFFFFFF)
>   return ~crc

Note that the last line is better (whether in Python or C!) as

    return crc ^ 0xffffffff

Else you'll get a surprising result in a 64-bit Python, and in some 64-bit C
implementations.
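
A small sketch of the difference, assuming a present-day Python where
integers are unbounded (so "~" never wraps at 32 bits).  The helper
name finalize_crc is made up for illustration.

```python
def finalize_crc(crc):
    """Complement exactly the low 32 bits, whatever the native int width."""
    return crc ^ 0xFFFFFFFF


crc = 0x12345678

# "~" flips every bit of the integer, not just 32 of them, so the
# result is negative here rather than a 32-bit value.
complemented = ~crc

# XOR with an explicit 32-bit mask stays within 32 bits.
masked = finalize_crc(crc)

# The two agree only after the "~" result is cut back down to 32 bits.
agree_after_masking = (complemented & 0xFFFFFFFF) == masked
```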

it's-a-32-bit-algorithm-not-an-"int"-or-"long"-one-ly y'rs  - tim


From fredrik at pythonware.com  Wed Dec 15 10:31:29 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 15 Dec 1999 10:31:29 +0100
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000101bf4655$94e40840$3a2d153f@tim>
Message-ID: <002601bf46e0$06e25ca0$f29b12c2@secret.pythonware.com>

> [Guido]
> > I propose to use Gary Brown's code.  I'll defend this to CNRI's
> > lawyers if need be.
> 
> If there's a hassle, I can do a clean-room implementation easily enough --
> although I'd rather not.

or you can grab the code from PIL, which already
comes with a Python compatible license...

(it's based on ISO 3307, but judging from the table
James posted, it's the same thing...)

</F>


From fredrik at pythonware.com  Wed Dec 15 10:39:19 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 15 Dec 1999 10:39:19 +0100
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000b01bf46cf$9ebe27e0$05a0143f@tim>
Message-ID: <003001bf46e0$43860b20$f29b12c2@secret.pythonware.com>

Tim Peters <tim_one at email.msn.com> wrote:
> Yup!  That's the zip algorithm, right down to the absurdly bit-reversed
> polynomial.

also known as ISO 3307, according to some
strange comments in PIL's sources...

</F>


From jim at interet.com  Wed Dec 15 16:53:34 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Wed, 15 Dec 1999 10:53:34 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org>
Message-ID: <3857B97E.3684224F@interet.com>

Greg Stein wrote:

> Release early. Release often. :-)

You are right of course.  OK, the zipfile.py code and docs are at:

  ftp://ftp.interet.com/pub/pylib.html

Despite the ftp URL, clicking on it should display the html.

Please don't panic if it seems to be slow.  It uses a Python CRC-32
which is slow.  You may want to hack it to use zlib.crc32() if you
have it.

I am testing with WinZip.  If you have another zip tool, it
would be interesting to see how compatible it is.

JimA


From guido at CNRI.Reston.VA.US  Wed Dec 15 17:38:47 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 15 Dec 1999 11:38:47 -0500
Subject: [Python-Dev] Writers wanted for Linux Journal Python special issue
Message-ID: <199912151638.LAA02522@eric.cnri.reston.va.us>

Linux Journal is preparing a special issue devoted to Python (actually
more like a pullout section or whatever I think).  They are looking
for writers, e.g. to write a piece about Python's history and/or an
introduction.  And probably anything else Python related.

If you're interested, please write to Marjorie Richardson
<mlr at ssc.com>, who is coordinating.  Also direct any questions to her.

This is for the June issue which will be on newsstands mid-May and
mailed to subscribers even earlier, I believe.  The deadline is
February 1st (magazine production takes forever!).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin at mems-exchange.org  Wed Dec 15 19:17:53 1999
From: akuchlin at mems-exchange.org (Andrew M. Kuchling)
Date: Wed, 15 Dec 1999 13:17:53 -0500 (EST)
Subject: [Python-Dev] fwd. from Paul Prescod
Message-ID: <14423.56145.877163.395736@amarok.cnri.reston.va.us>

This is a forwarded e-mail from the XML-SIG mailing list, in which
Paul makes some good points.  Some context: I've been arguing against
adding more XML stuff to the base Python distribution, because 1) it's
bloat for those people who don't care about XML, and 2) the Distutils is
supposed to fix this by making installing things easier.  Paul's
response, below, has shaken my conviction a bit (*only* a bit,
though).  If it's deemed valuable, perhaps the XML-SIG could
concentrate on the minimal set of parser + SAX + DOM that could be
included in 1.6.

Please join the XML-SIG to follow the specifics of this thread
further, as it relates only to XML.  As a more general philosophical
question for python-dev: do we want to add things to 1.6 following the
"batteries included" philosophy?  Or should we wave in the direction
of the distutils and say they'll fix the problem?  (In which case they
should be given high priority, as in "1.6 doesn't ship until they're
done".)

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
And after all, why should I go to bed every night? Sleep is only a habit.
    -- Cornelius Van Horne


Paul Prescod writes:
>"Andrew M. Kuchling" wrote:
>> 
>> Huh?  There's obviously a good deal of stuff in there, some of it
>> perhaps too esoteric, but I don't see where there's overlap.  
>
>Well, there are several parsers and parser wrappers. How is a user
>supposed to choose? And there is PyDOM, Minidom and qp_dom.
>
>> Or are
>> you talking about Python tools in general, where there are 3 DOM
>> implementations?  (PyDOM, 4DOM, and ZDOM hiding inside Zope.)
>
>That too.
>
>> I lean against shoveling more stuff into 1.6; better to get the
>> Distutils widely used, which makes it easier to install *all* Python
>> extensions.
>
>I don't think that XML is any more of an "add-on" to a modern scripting
>language than URL support or regular expression support. I'm in the
>"batteries included" camp for this and several other reasons: 
>
>	* standard Python libraries may soon need XML support. If WebDAV takes
>off then there should be a libWebDAV right alongside libftp and libhttp.
>And libWebDAV will require XML
>
>	* there is a difference between theory and practice. In theory,
>distutils will be done soon and everything will be easy. In practice, it
>is the end of 1999 and at every conference I have to install the XML sig
>package on the machines of several people who haven't been able to get
>it going themselves. In practice, we can't wait for distutils because
>people are choosing their XML tools now.
>
>> >Ideally we would have one (or at most two!) implementation of each of
>> >the major specs:
>> >XML    >SAX   >Unicode    >XPath    >XPointer   >XSLT    >DOM
>> 
>> Do you mean "one implementation of each in a single package", or "one
>> implementation existing for Python, distributed separately"?
>
>With the possible exception of XSLT, one implementation of each *in
>Python 1.6*.
>
>> We need to come up with a position paper for developer's day, stating
>> what needs to be discussed.  Suggestions?  I'd propose focusing on
>> getting the XML-SIG package to 1.0, but that's just an idea.
>
>I don't see how the XML-SIG package can ever get to 1.0. Anybody can
>contribute code at any time and thus far we've been totally flexible
>about putting it in. I think that's great. It just won't ever lead to a
>stable, carefully maintained, tightly interoperable package. Some of the
>maintainers of the individual pieces have probably lost interest and
>there is probably nobody that understands it all enough to integrate it
>nicely.
>
>-- 
> Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
>


From fdrake at acm.org  Wed Dec 15 20:47:01 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 15 Dec 1999 14:47:01 -0500 (EST)
Subject: [Python-Dev] posix module
Message-ID: <14423.61493.90107.433664@weyr.cnri.reston.va.us>

  Ok, I think I'm done with the posix module updates, modulo bugs and
additional symbols for the *conf*() tables.  That leaves us with the
following status for interfaces that Andrew brought up in the message
that started this spate of additions:

Worth adding?
=============
opendir(), readdir(), closedir() -- not added
           The only thing these give us that os.listdir() doesn't is
           the inode numbers.  Unless someone actually wants those,
           it's not worth having.

Worth adding:
=============

abort() -- added

ctermid(), ctermid_r() -- added
            
fpathconf(fd, name) -- added

getlogin() -- added

getgroups(gidsetsize, grouplist) -- added

pathconf(path, name) -- added

sysconf(int name) -- added; also added confstr(int name)

Not worth adding:
=================
clearerr() -- not added

cuserid() -- not added

difftime -- not added

tmpfile(), tmpnam() -- added, also tempnam()

mblen(), mbstowcs(), mbtowc(), wcstombs(),  wctomb() -- not added


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From jeremy at cnri.reston.va.us  Wed Dec 15 20:58:16 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Wed, 15 Dec 1999 14:58:16 -0500 (EST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org>
References: <3856425F.8C5E7A42@interet.com>
	<Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org>
Message-ID: <14423.62168.576273.719577@goon.cnri.reston.va.us>

>>>>> "GS" == Greg Stein <gstein at lyra.org> writes:

  GS> On Tue, 14 Dec 1999, James C. Ahlstrom wrote:
  >> Greg Stein wrote: >
  >> 
  >> > Can you post zipfile.py so that people can start reviewing
  >> that?
  >> 
  >> Yes, it will be available by next Monday.  I just want to get it
  >> really working and pretty, and with documentation.

  GS> My point was that people could possibly use it *before*
  GS> then. Not everybody needs it to be pretty, needs doc, or needs
  GS> it fully working.  Maybe people would like to provide feedback
  GS> on the API. Maybe they'd like to start their own modules that
  GS> use your library.

  GS> This goes back to my years-old statement: release it now rather
  GS> than later -- people can always use it now, and there might not
  GS> be a later.

Ok.  I think we need some kind of zip file support in the core so that
it can be used as a standard distribution format.  I'd be happy if
Jim's zipfile module ended up being it.  We've got some zip code that
we developed at CNRI; it's a bit of a mess, but it might be helpful to
see what we did.  Our code is at ftp://www.python.org/pub/tmp/zip.zip

Jeremy


From jim at interet.com  Thu Dec 16 16:41:56 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 16 Dec 1999 10:41:56 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com>
Message-ID: <38590844.769C3025@interet.com>

Did anyone look at this yet?

   ftp://ftp.interet.com/pub/pylib.html

   ftp://ftp.interet.com/pub/zipfile.py

JimA


From skip at mojam.com  Thu Dec 16 16:46:28 1999
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 16 Dec 1999 09:46:28 -0600 (CST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <38590844.769C3025@interet.com>
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org>
	<3857B97E.3684224F@interet.com>
	<38590844.769C3025@interet.com>
Message-ID: <14425.2388.529932.61119@dolphin.mojam.com>

    JA> Did anyone look at this yet?
    JA>    ftp://ftp.interet.com/pub/pylib.html
    JA>    ftp://ftp.interet.com/pub/zipfile.py

I thought it wasn't supposed to be out until Monday?  You're looking for,
perhaps, a time machine? ;-)

(More seriously, it won't have any effect on my "gotta have this done
yesterday" list, so I will let others comment...)

Skip


From jim at interet.com  Thu Dec 16 18:16:21 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 16 Dec 1999 12:16:21 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com>
Message-ID: <38591E65.4885A39D@interet.com>

"James C. Ahlstrom" wrote:
 
>    ftp://ftp.interet.com/pub/pylib.html

I just changed zipfile.py so that regular zip compression
works.  And if zlib is available,
its crc32() is used instead of the Python version.

I should mention that the current code rejects zip files which have
an archive comment added to the end.  Accepting them would require
a search, and I am not sure it is worth it.
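
For reference, the search in question could be sketched as follows.
This is an illustration in present-day Python, not a patch to
zipfile.py; find_eocd is a made-up name.  It relies on the zip format's
end-of-central-directory record: a "PK\x05\x06" signature, 22 fixed
bytes, and a 2-byte comment length at the end of the fixed part.

```python
import struct

EOCD_SIG = b"PK\x05\x06"   # end-of-central-directory signature
EOCD_SIZE = 22             # size of the fixed part of the record
MAX_COMMENT = 0xFFFF       # comment length is stored in 16 bits


def find_eocd(data):
    """Return the offset of the end-of-central-directory record, or -1.

    Scans backwards through the tail of the archive and accepts a
    candidate only if its comment-length field accounts for every byte
    that follows the record.
    """
    start = max(0, len(data) - EOCD_SIZE - MAX_COMMENT)
    pos = data.rfind(EOCD_SIG, start)
    while pos >= 0:
        if pos + EOCD_SIZE <= len(data):
            (comment_len,) = struct.unpack("<H", data[pos + 20:pos + 22])
            if pos + EOCD_SIZE + comment_len == len(data):
                return pos
        # Look for an earlier occurrence that starts before this one.
        pos = data.rfind(EOCD_SIG, start, pos + 3)
    return -1
```

With no comment the record is simply the last 22 bytes, so the search
succeeds on the first probe; the backward scan only costs anything when
a comment is present.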

JimA


From fdrake at acm.org  Thu Dec 16 18:19:23 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 16 Dec 1999 12:19:23 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121
In-Reply-To: <Pine.LNX.4.10.9912151910500.16305-100000@nebula.lyra.org>
References: <199912151831.NAA02685@weyr.cnri.reston.va.us>
	<Pine.LNX.4.10.9912151910500.16305-100000@nebula.lyra.org>
Message-ID: <14425.7963.347400.763562@weyr.cnri.reston.va.us>

[Note that Greg's message went to python-checkins since he responded
to a checkin message, but I suspect he meant to change the header to
point to python-dev.  ;)  If not, too bad!]

Greg Stein writes:
 > But this means that your tables no longer reside in "const" space. Yet More
 > Per-Process Memory...
 > 
 > It would be nice to have those tables marked as "const".

  Perhaps; as Guido points out, there haven't been a lot of complaints 
about this issue.
  I will note that only the tables aren't constant; the strings that
are pointed to are still constant.  I'm inclined to let the compiler/
linker care about this, and not change the code without a really clear 
need to do so.
  Here are the sizes of those tables and the strings they point to
(including terminating null bytes for the strings):

pathconf_names:  14 entries, 112 bytes,  176 string bytes
confstr_names:   25 entries, 200 bytes,  576 string bytes
sysconf_names:  108 entries, 864 bytes, 1774 string bytes

  Figures are for Solaris7.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From gstein at lyra.org  Thu Dec 16 19:10:14 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 16 Dec 1999 10:10:14 -0800 (PST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121
In-Reply-To: <14425.7963.347400.763562@weyr.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912161006011.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, Fred L. Drake, Jr. wrote:
> [Note that Greg's message went to python-checkins since he responded
> to a checkin message, but I suspect he meant to change the header to
> point to python-dev.  ;)  If not, too bad!]

I didn't really care too much where it went. I would actually suggest that
the Reply-To: on the checkin list is set to python-dev if that is where
replies are Supposed To Go.
[ I do this with mod_dav checkins; replies to dav-checkins mail go to
  dav-dev. ]

> Greg Stein writes:
>  > But this means that your tables no longer reside in "const" space. Yet More
>  > Per-Process Memory...
>  > 
>  > It would be nice to have those tables marked as "const".
> 
>   Perhaps; as Guido points out, there haven't been a lot of complaints 
> about this issue.
>   I will note that only the tables aren't constant; the strings that
> are pointed to are still constant.  I'm inclined to let the compiler/
> linker care about this, and not change the code without a really clear 
> need to do so.
>   Here are the sizes of those tables and the strings they point to
> (including terminating null bytes for the strings):
> 
> pathconf_names:  14 entries, 112 bytes,  176 string bytes
> confstr_names:   25 entries, 200 bytes,  576 string bytes
> sysconf_names:  108 entries, 864 bytes, 1774 string bytes
> 
>   Figures are for Solaris7.

Ah. I just replied to that. Guess that one went to python-checkins :-)

True, this is a small amount of memory. But they start to add up.
non-const globals also pain me when I start to work on free-threading
stuff (each must be examined to see if synchronization is needed), so
reducing the number there is important. Regarding the memory itself: as I
mentioned in the other note, I just want to ensure that Python's working
set remains low (reasons given in that email).

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From skip at mojam.com  Thu Dec 16 19:09:11 1999
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 16 Dec 1999 12:09:11 -0600 (CST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121
In-Reply-To: <Pine.LNX.4.10.9912161003010.16305-100000@nebula.lyra.org>
References: <199912161553.KAA08428@eric.cnri.reston.va.us>
	<Pine.LNX.4.10.9912161003010.16305-100000@nebula.lyra.org>
Message-ID: <14425.10951.169751.843764@dolphin.mojam.com>

>>>>> "Greg" == Greg Stein <gstein at lyra.org> writes:

    Greg> On Thu, 16 Dec 1999, Guido van Rossum wrote:
    >> I don't think there's much of a need to worry about this.  Why are
    >> you always bringing up this subject?  No-one else that I know has
    >> ever had this concern...

    Greg> Somebody has to :-)

    Greg> Keeping the working set low is more efficient from a system
    Greg> standpoint. 

Not to mention the not-all-that-occasional-anymore requests to have Python
on various itty-bitty things like Palm Pilots and WinCE devices.  It's one
thing to add size to modules people can live without for many applications,
but I think the posix module and its other platform-specific relations are
fairly heavily used.  (I realize this specific example isn't likely to apply
to PP/WinCE.)

Skip Montanaro | http://www.mojam.com/
skip at mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...


From gstein at lyra.org  Thu Dec 16 19:21:54 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 16 Dec 1999 10:21:54 -0800 (PST)
Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released
In-Reply-To: <199912161527.KAA08308@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912161011050.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, Guido van Rossum wrote:
>...
> I realize it's just a rant.  In this case (distutils) your advice is
> correct.  (I usually paraphrase it as "release early, release often".)

True. I prefer that phrase, too, but I used it on JimA earlier in the day
or the previous day. I didn't want to sound like a broken record :-). But
that is why I moved into <rant> mode... it seems like the mindset was
spreading :-) I've railed at AMK for it, too :-), when he was talking
about 0.5.1pre1 or whatever, rather than just releasing 0.5.1 and doing an
0.5.2 if there was a problem.

> However there are other situations, like core Python itself, where
> it's really useful to have stable releases -- if only for those users
> who won't touch anything with "beta" in its name.  I still hear from
> people who haven't upgraded to 1.5.2.

But this doesn't explain why there isn't a 1.5.3b1, 1.5.3b2, etc. Or
1.6.0a1 or whatever (maybe "d" or "r" for dev release, as opposed to
alpha).

There are some people who would like the releases rather than using CVS. Some
people can't even use CVS because of firewall issues. Of course, an
alternative is snapshot-tarballs of the CVS repository. But a snapshot
could *really* be broken; something like 1.6.0d1 says "well, it's a
development release, but I've hit a good point between some changes."

> I wonder if perhaps for those cases (where there's a demand for stable
> releases) some other strategy could be used?  Such as labeling
> releases "stable" after the fact?  Or what Linus seems to do with the
> Linux kernel (even = stable, odd = development; or was it the other
> way around?).

Yes: even are stable (e.g. 1.0, 1.2, 2.0, 2.2). The odd numbers are for
development. Linus is currently working 2.3.x, but declared in the past
couple days that things will be wrapping up to move towards 2.4. Once he
thinks it is ready, he'll start off with 2.4.0pre1, pre2, pre3... At some
point the "pre" suffix will drop and 2.4.0 will be released.
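
The even/odd convention as described here is easy to check
mechanically.  A tiny sketch (the function name is_stable_series is
made up, and it encodes only the rule stated in this message, not any
later kernel numbering policy):

```python
import re


def is_stable_series(release):
    """True if the second version component is even (e.g. 2.2.x, 2.4.x)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized version string: %r" % (release,))
    return int(match.group(2)) % 2 == 0


stable = is_stable_series("2.2.13")       # 2.2 is an even series
development = is_stable_series("2.3.30")  # 2.3 is an odd series
```

Note that the rule looks only at the series number: "2.4.0pre1" still
counts as belonging to a stable series even though that particular
release is a pre-release.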

You might have a bit of a problem using that mechanism since the current
stable release is 1.5 :-). Once 1.6 hits the street, then you could start
doing 1.9 releases (dev) and shift to 2.0 once it is "stable".

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From paul at prescod.net  Thu Dec 16 19:02:55 1999
From: paul at prescod.net (Paul Prescod)
Date: Thu, 16 Dec 1999 10:02:55 -0800
Subject: [Python-Dev] Re: [XML-SIG] Developer's Day
References: <199912132354.SAA10101@amarok.cnri.reston.va.us>
		<3856A77C.3A4D9F00@prescod.net>
		<14423.49044.143333.790752@amarok.cnri.reston.va.us>
		<3857CEB0.C29C5F24@prescod.net> <14423.57778.131798.776845@amarok.cnri.reston.va.us>
Message-ID: <3859294F.138FF398@prescod.net>

"Andrew M. Kuchling" wrote:
> 
>     * Python revisions come out slowly, once every year or two.  XML
>     standards have been revolving faster, and we don't want to wait
>     until 1.7 for SAX2, or DOM Level2, or other new revisions.
>     Keeping the modules out of the core lets them be updated at their
>     own pace.  A counterargument is that the XML specs are slowing
>     down -- add namespace support to SAX, and finalize DOM
>     Level 2, and I don't think any other standards are very important
>     to basic XML programming.

I agree with your counterargument. :) Anyhow, isn't there a logical
fallacy in your original argument? Why can't we offer a DOM 3 module or
extension after Python ships with DOM 2? 

>     * We really want a C-based parser to be commonly available.
>     sgmlop is the only reasonable choice for this, because I'd be
>     against including Expat.  To replay some arguments I made against
>     including the zlib library in 1.6, what if a C extension requires
>     a newer version of the library?  Symbol conflicts if you're lucky,
>     hard-to-debug problems if you're not.

I don't understand this issue. Why would a C extension build on sgmlop
which is designed to make XML information available to *Python*
programmers?

>     * We can drop various marginal bits of the CVS tree; the xmlarch
>     support is probably not of very wide interest, for example.

How about "expat", "mac", "pyexpat", "utils", "windows". There is just
too much stuff there! And I daresay that a lot of it has not been
"quality controlled" to the level that we would expect if it were a part
of the real Python library. In other words, there is no single place to
go to get only XML-processing software that works well and works
together.

> I think I'm on the record as saying that Python's major problems now
> aren't language-related, but are with the development environment.
> Language changes (from minor, like 'for i in 1..9', to major, like
> fixing the type/class dichotomy or adding static types) aren't going
> to bring in piles of new users, useful though they might be to
> experienced Pythoneers, large projects, or some other specific
> application.

(irrelevant aside: I agree 100% that making things easier to install
will actually improve newbies' experience more than (e.g.) static type
checking but I do not agree that it is a better "sales tool". Most
people are sold based on the language and its libraries before they
start trying to install extensions.)

> If installing things is a problem, then we need to
> buckle down and finish the distutils.  So, overall, I'd still vote
> against inclusion in 1.6.

So are you saying that Python 2 might have only five packages and
everything else must be downloaded? No httplib, no pickle, no random or
math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec?

When people download Python and go to the library documentation that
impressive array of BUILT-IN-FEATURES is part of what sells them on
Python. Hell, I can download all of that stuff for Scheme but what makes
Python beautiful is that I don't have to download it for Python. It's
just there. But if an XML person comes to Python after hearing us rant
about how great it is for processing XML and all they find is
xmllib...they will be underwhelmed.

> No, it's *got* to reach 1.0.  The point of the package is that it's
> exactly *one* thing to install that gives basic XML tools; you don't
> need to chase down the SAX modules from Lars' page, PyExpat from
> ftp.cwi.nl, sgmlop from pythonware.com, and so forth.  If the
> Distutils made it as easy as:
> 
> python fetchpackage.py SAX PyExpat DOM sgmlop
>    <find PySAX's home site>
>    <download it>
>    <compile & install>
>    etc...
> 
> then much of the need for a single package goes away, but, as you
> point out, that isn't currently the case.

I'm a little lost here. We need xmllib to continue because distutils
doesn't do what we need yet but we don't need to put the stuff in the
Python library because distutils will work well enough soon.

But there is an important issue that distutils will not solve. One of the
beautiful things about the Python library is that everything is at the
same version level. When you install it you know that everything works
together or else it WILL in the next patch level if you report the
incompatibility. When the xml package gets versioned incompatibly with
the Python library you don't have that safe feeling. 

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
Three things never trust in: That's the vendor's final bill
The promises your boss makes, and the customer's good will 
http://www.geezjan.org/humor/computers/threes.html


From akuchlin at mems-exchange.org  Thu Dec 16 19:50:48 1999
From: akuchlin at mems-exchange.org (Andrew M. Kuchling)
Date: Thu, 16 Dec 1999 13:50:48 -0500 (EST)
Subject: [Python-Dev] Re: [XML-SIG] Developer's Day
In-Reply-To: <3859294F.138FF398@prescod.net>
References: <199912132354.SAA10101@amarok.cnri.reston.va.us>
	<3856A77C.3A4D9F00@prescod.net>
	<14423.49044.143333.790752@amarok.cnri.reston.va.us>
	<3857CEB0.C29C5F24@prescod.net>
	<14423.57778.131798.776845@amarok.cnri.reston.va.us>
	<3859294F.138FF398@prescod.net>
Message-ID: <14425.13448.737831.460241@amarok.cnri.reston.va.us>

(Responding to the python-dev related portion of this...)

Paul Prescod writes:
>I don't understand this issue. Why would a C extension build on sgmlop
>which is designed to make XML information available to *Python*
>programmers?

No, no; I'm arguing against shipping with Expat; sgmlop good!
Consider this scenario:

	* Python includes Expat 1.0
	* Some C library (for DAV or whatever) uses Expat 1.1
	* Someone writes a Python interface to this C library and
	  attempts to compile it statically.
	* Two versions of Expat in the same binary; symbol conflicts
	  and core dumps, oh my!

>So are you saying that Python 2 might have only five packages and
>everything else must be downloaded? No httplib, no pickle, no random or
>math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec?

I'm not arguing for dropping existing packages; I'm against adding
many more of them.  Existing library modules can stay where they are.
But I wouldn't mind a minimalist Python too much, if it came with a
script fetch-basic-packages:

python fetch-packages.py httplib
python fetch-packages.py imaplib
 ...  200 more lines ...

>I'm a little lost here. We need xmllib to continue because distutils
>doesn't do what we need yet but we don't need to put the stuff in the
>Python library because distutils will work well enough soon.

Basically, yes.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
And now let us hasten to the station. I have commanded the rain to fall at
exactly one-fifteen and I would hate to get my shoes wet.
    -- Lord Lavender, in SEBASTIAN O #2


From bwarsaw at cnri.reston.va.us  Thu Dec 16 19:50:49 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Thu, 16 Dec 1999 13:50:49 -0500 (EST)
Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released
References: <199912161527.KAA08308@eric.cnri.reston.va.us>
	<Pine.LNX.4.10.9912161011050.16305-100000@nebula.lyra.org>
Message-ID: <14425.13449.954026.960703@anthem.cnri.reston.va.us>

    >> I wonder if perhaps for those cases (where there's a demand for
    >> stable releases) some other strategy could be used?  Such as
    >> labeling releases "stable" after the fact?  Or what Linus seems
    >> to do with the Linux kernel (even = stable, odd = development;
    >> or was it the other way around?).

I really dislike the odd/even distinction for exactly this reason.

-Barry


From guido at CNRI.Reston.VA.US  Thu Dec 16 20:02:16 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 16 Dec 1999 14:02:16 -0500
Subject: [Python-Dev] Batteries Included?
Message-ID: <199912161902.OAA11345@eric.cnri.reston.va.us>

I like the batteries included approach, but I also feel resistance
against including stuff I cannot maintain.  The XML code base is a
case in point; I don't understand enough about XML.  (I just read that
xmllib.py is "illegal".  Jeez!  What happened?  Did Congress pass a
law against it?)

I think it may be time for separate Python distributions, like Linux
-- I can concentrate on the core, and keep it really small; others can
make all-encompassing distributions.

There are currently some drawbacks to this approach: non-core modules
have less status; and the documentation process is fundamentally
different for core and non-core modules.  There's also the version
dependency stuff, but I think resolving that is the responsibility of
the distribution makers.

I think the status problem will be gone once there is a respected
distribution -- then you derive status from being in that
distribution, rather than from being in the core distribution.  (Well,
you would still derive status from being in the core, but it would be
much harder to obtain, since I can set a much higher standard.)

The documentation problem is the one that's left.  I think the doc-sig
may be on its way as we speak to solve this, though.  Fred?

This isn't rocket science.  Red Hat Python?  I'm all for it! :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip at mojam.com  Thu Dec 16 20:05:05 1999
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 16 Dec 1999 13:05:05 -0600 (CST)
Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released
In-Reply-To: <14425.13449.954026.960703@anthem.cnri.reston.va.us>
References: <199912161527.KAA08308@eric.cnri.reston.va.us>
	<Pine.LNX.4.10.9912161011050.16305-100000@nebula.lyra.org>
	<14425.13449.954026.960703@anthem.cnri.reston.va.us>
Message-ID: <14425.14305.907618.978628@dolphin.mojam.com>

    >>> Or what Linus seems to do with the Linux kernel (even = stable, odd
    >>> = development; or was it the other way around?).

    BAW> I really dislike the odd/even distinction for exactly this reason.

Its one saving grace is that it is a uniform format.  There are no
"optional" tokens like "pre", "alpha", "beta", etc for the most part.

To remember which way it is, I find it useful to execute "uname -r", check
the second digit, then look down at my shirt for a pocket protector.  The
two pieces of information together work for me.  I currently get
"2.2.13-4mdk" from uname.  I don't even have a pocket, let alone a pocket
protector, so even numbers must be stable releases...

;-)

Skip Montanaro | http://www.mojam.com/
skip at mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...


From fdrake at acm.org  Thu Dec 16 20:05:22 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 16 Dec 1999 14:05:22 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121
In-Reply-To: <14425.10951.169751.843764@dolphin.mojam.com>
References: <199912161553.KAA08428@eric.cnri.reston.va.us>
	<Pine.LNX.4.10.9912161003010.16305-100000@nebula.lyra.org>
	<14425.10951.169751.843764@dolphin.mojam.com>
Message-ID: <14425.14322.355507.500813@weyr.cnri.reston.va.us>

Skip Montanaro writes:
 > fairly heavily used.  (I realize this specific example isn't likely to apply
 > to PP/WinCE.)

  Or any version of Windows, I suspect; perhaps Mark Hammond can
elaborate.  Apparently none of the pathconf() constants are defined
on that platform, at least not as #define constants.
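
A rough sketch of how Python code can cope with that from the caller's
side: look the name up before using it, and treat a missing table as
"no constants at all".  The helper name pathconf_or_none is hypothetical;
os.pathconf() and os.pathconf_names are the names exposed by the os
module on platforms that provide them.

```python
import os

# os.pathconf_names only exists where the platform provides pathconf();
# fall back to an empty table elsewhere (for example on Windows).
names = getattr(os, "pathconf_names", {})


def pathconf_or_none(path, name):
    """Return the pathconf value for `name`, or None if it is unavailable."""
    if name not in names:
        return None
    try:
        return os.pathconf(path, name)
    except OSError:
        # The name is known but not supported for this particular path.
        return None


print(sorted(names)[:5])
print(pathconf_or_none(".", "PC_NAME_MAX"))
```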


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From jcw at equi4.com  Thu Dec 16 20:09:42 1999
From: jcw at equi4.com (Jean-Claude Wippler)
Date: Thu, 16 Dec 1999 20:09:42 +0100
Subject: [Python-Dev] Re: [XML-SIG] Developer's Day
References: <199912132354.SAA10101@amarok.cnri.reston.va.us>
			<3856A77C.3A4D9F00@prescod.net>
			<14423.49044.143333.790752@amarok.cnri.reston.va.us>
			<3857CEB0.C29C5F24@prescod.net> <14423.57778.131798.776845@amarok.cnri.reston.va.us> <3859294F.138FF398@prescod.net>
Message-ID: <385938F6.C4164756@equi4.com>

Paul Prescod wrote:
[...]
> (irrelevant aside: [...] Most people are sold based on the language
> and its libraries before they start trying to install extensions.)
> 
> [AMK]
> > If installing things is a problem, then we need to
> > buckle down and finish the distutils.  So, overall, I'd still vote
> > against inclusion in 1.6.
> 
> So are you saying that Python 2 might have only five packages and
> everything else must be downloaded? No httplib, no pickle, no random
> or math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec?
> 
> When people download Python and go to the library documentation that
> impressive array of BUILT-IN-FEATURES is part of what sells them on
> Python. Hell, I can download all of that stuff for Scheme but what
> makes Python beautiful is that I don't have to download it for Python.
> It's just there. But if an XML person comes to Python after hearing us
> rant about how great it is for processing XML and all they find is
> xmllib...they will be underwhelmed.

(Nodding in agreement)

Could this perhaps be solved with a large batteries-included standard
distribution, plus a real easy/effective way to strip Python down and
wrap things up for deployment?  

In other words, aim for two very distinct goals: everything within easy
reach for development + fully signed-sealed-delivered products.

The first goal can evolve to do fancy net-borne distribution, even if
it is a brittle process, because this is for Python developers.  They
want it all, so open the floodgate to give it all to them.

The second becomes a matter of pruning down and wrapping up.  All the
way down to a single installation-less executable, if possible.

I may well be wrong (and I'm not tracking distutils), but might it not
be simpler to focus on 1) power users + 2) production-grade deployment,
instead of trying to streamline a tangled-web-of-module-dependencies
into a distribution system which tries to meet a wide range of needs?

> [...] One of the beautiful things about the Python library is that
> everything is at the same version level. When you install it you know
> that everything works together or else it WILL in the next patch level
> if you report the incompatibility.  [...]

More nods.  So why not allow the Python distribution to become very
large - with every release moving to a better-tuned combination of all
the different parts (occasional mishaps can quickly be fixed)?

Plus some tools to dist(ut)il(l) a turnkey solution from this big soup.

Sort-of-from-violin-to-quartet-all-the-way-to-symphony-orchestra...

-- Jean-Claude


From gstein at lyra.org  Thu Dec 16 21:02:46 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 16 Dec 1999 12:02:46 -0800 (PST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <38590844.769C3025@interet.com>
Message-ID: <Pine.LNX.4.10.9912161147360.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, James C. Ahlstrom wrote:
> Did anyone look at this yet?
> 
>    ftp://ftp.interet.com/pub/pylib.html
> 
>    ftp://ftp.interet.com/pub/zipfile.py

I went to look for it, but I think that was before you put zipfile up.

Looking at it now...  The writepy() as a method is questionable, I think.
I think it should open the file at instantiation time. I don't see a
reason to allow that to be deferred. Especially given that some of the
methods fail if open() hasn't been called. It would be good to have
symbolic names for the 0 and 8 compression constants, and to fail if 8 is
passed and zlib is not available (otherwise, it doesn't fail until
read/write time, and with a NameError). There should probably be a
__del__ that calls close(). Oh, and a "closed" attribute that can be
checked and an error raised if an operation is done after the file has
been closed. I think dir() should return the contents, rather than print
them. read() and write() ought to fail if the mode is incorrect. Oh, some
symbolic constants for things like "PK\005\006" would be nice.
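
A minimal sketch of the lifecycle checks suggested above, not of the
actual zipfile.py under review.  The class name ArchiveSketch and its
methods are invented for illustration, and it does no real zip parsing;
it only shows symbolic constants, an early zlib check, mode checks, a
"closed" attribute, and a __del__ that calls close().

```python
import io

try:
    import zlib
except ImportError:  # zlib is an optional part of a Python build
    zlib = None

# Symbolic names for the compression methods and the end-of-archive marker.
ZIP_STORED = 0
ZIP_DEFLATED = 8
END_OF_ARCHIVE_SIG = b"PK\005\006"


class ArchiveSketch:
    """Hypothetical wrapper showing the checks only; not a zip parser."""

    def __init__(self, fileobj, mode="r", compression=ZIP_STORED):
        self.closed = True  # so __del__ is safe if a check below raises
        if mode not in ("r", "w"):
            raise ValueError("mode must be 'r' or 'w'")
        if compression not in (ZIP_STORED, ZIP_DEFLATED):
            raise ValueError("unknown compression method")
        if compression == ZIP_DEFLATED and zlib is None:
            # Fail at construction time, not at the first read or write.
            raise RuntimeError("deflate requested but zlib is unavailable")
        self.fp = fileobj
        self.mode = mode
        self.compression = compression
        self._names = []
        self.closed = False

    def _check(self, needed_mode=None):
        if self.closed:
            raise ValueError("operation on closed archive")
        if needed_mode is not None and self.mode != needed_mode:
            raise ValueError("archive not opened with mode %r" % needed_mode)

    def write(self, name, data):
        self._check("w")
        self._names.append(name)
        self.fp.write(data)

    def listdir(self):
        """Return the member names instead of printing them."""
        self._check()
        return list(self._names)

    def close(self):
        self.closed = True

    def __del__(self):
        self.close()


archive = ArchiveSketch(io.BytesIO(), "w")
archive.write("spam.py", b"print('spam')\n")
print(archive.listdir())
archive.close()
```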

Do you have a ZipImporter written?

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein at lyra.org  Thu Dec 16 21:12:30 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 16 Dec 1999 12:12:30 -0800 (PST)
Subject: [Python-Dev] Re: [XML-SIG] Developer's Day
In-Reply-To: <14425.13448.737831.460241@amarok.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912161210350.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, Andrew M. Kuchling wrote:
> Paul Prescod writes:
> >I don't understand this issue. Why would a C extension build on sgmlop
> >which is designed to make XML information available to *Python*
> >programmers?
> 
> No, no; I'm arguing against shipping with Expat; sgmlop good!
> Consider this scenario:
> 
> 	* Python includes Expat 1.0
> 	* Some C library (for DAV or whatever) uses Expat 1.1
> 	* Someone writes a Python interface to this C library and
> 	  attempts to compile it statically.
> 	* Two versions of Expat in the same binary; symbol conflicts
> 	  and core dumps, oh my!

We should ship pyexpat, not Expat.  (IMO)

> >So are you saying that Python 2 might have only five packages and
> >everything else must be downloaded? No httplib, no pickle, no random or
> >math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec?
> 
> I'm not arguing for dropping existing packages; I'm against adding
> many more of them.  Existing library modules can stay where they are.
> But I wouldn't mind a minimalist Python too much, if it came with a
> script fetch-basic-packages:
> 
> python fetch-packages.py httplib
> python fetch-packages.py imaplib
>  ...  200 more lines ...

Considering that it would probably use HTTP to fetch the packages, I think
you wouldn't be fetching httplib :-)

But yes: I agree with the basic sentiment.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From petrilli at amber.org  Thu Dec 16 21:55:16 1999
From: petrilli at amber.org (Christopher Petrilli)
Date: Thu, 16 Dec 1999 15:55:16 -0500
Subject: [Python-Dev] Batteries Included?
In-Reply-To: <199912161902.OAA11345@eric.cnri.reston.va.us>; from guido@CNRI.Reston.VA.US on Thu, Dec 16, 1999 at 02:02:16PM -0500
References: <199912161902.OAA11345@eric.cnri.reston.va.us>
Message-ID: <19991216155516.A28037@trump.amber.org>

Guido van Rossum [guido at CNRI.Reston.VA.US] wrote:
> I think it may be time for separate Python distributions, like Linux
> -- I can concentrate on the core, and keep it really small; others can
> make all-encompassing distributions.

My fear is what we face in the Zope world---different distributions break
in totally different ways, and sometimes we have to ask 30 questions to figure
out what might be going wrong :/  The nice thing is that if someone installs
Python from the source, we know what's going to happen.  I don't know if
this is solvable, honestly.

> This isn't rocket science.  Red Hat Python?  I'm all for it! :-)

I think Guido just wants to IPO and retire :-)

Chris
-- 
| Christopher Petrilli
| petrilli at amber.org


From gward at cnri.reston.va.us  Thu Dec 16 22:03:26 1999
From: gward at cnri.reston.va.us (Greg Ward)
Date: Thu, 16 Dec 1999 16:03:26 -0500
Subject: [Python-Dev] distutils-sig/python-dev crosstalk
Message-ID: <19991216160325.H4289@cnri.reston.va.us>

Most recent threads on distutils-sig seem to have migrated to python-dev
pretty quickly.  This means that a) there are python-dev people on
distutils-sig (duh), b) they think what goes on there is important
enough to interest the other core developers (good!), and c) they assume
there are people on python-dev who are not also on distutils-sig.

Is this last assumption true?  If you read python-dev, are interested in
distutils issues, but do *not* read distutils-sig, please drop me a
note.  If no one says anything, I will (politely, tentatively) propose
that we keep the distutils threads on distutils-sig and leave python-dev
for, well, core Python development.

If you think that the two are inextricably linked and I might as well
just cross-post everything on distutils-sig to python-dev, let me know
about that too.  ;-)

        Greg
-- 
Greg Ward - software developer                    gward at cnri.reston.va.us
Corporation for National Research Initiatives    
1895 Preston White Drive                           voice: +1-703-620-8990
Reston, Virginia, USA  20191-5434                    fax: +1-703-620-0913


From gstein at lyra.org  Thu Dec 16 22:18:50 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 16 Dec 1999 13:18:50 -0800 (PST)
Subject: [Python-Dev] distutils-sig/python-dev crosstalk
In-Reply-To: <19991216160325.H4289@cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912161316580.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, Greg Ward wrote:
>...
> If you think that the two are inextricably linked and I might as well
> just cross-post everything on distutils-sig to python-dev, let me know
> about that too.  ;-)

:-)  I think distutils is about the mechanics. And it is a large and
sophisticated problem (which is why it has a SIG :-). You could almost view
it as a spinoff of the python-dev grand problem set.

When we get into the question of "what does Python ship with?", then I
think it belongs in python-dev, as that is a discussion of what
constitutes Python itself.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein at lyra.org  Thu Dec 16 22:21:12 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 16 Dec 1999 13:21:12 -0800 (PST)
Subject: [Python-Dev] distutils-sig/python-dev crosstalk
In-Reply-To: <19991216160325.H4289@cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912161318550.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, Greg Ward wrote:
> Most recent threads on distutils-sig seem to have migrated to python-dev
> pretty quickly.  This means that a) there are python-dev people on
> distutils-sig (duh), b) they think what goes on there is important
> enough to interest the other core developers (good!), and c) they assume
> there are people on python-dev who are not also on distutils-sig.

Oh. One more thing.

Actually, what I am somewhat worried about is whether there was relevant
discussion on python-dev that should have been visible to the distutils
people. Not sure if there was, but that is always a potential problem.
Same with the recent xml-sig / python-dev crosstalk. Specifically, Paul
Prescod is not on python-dev, so he may have missed a response or two.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From mal at lemburg.com  Thu Dec 16 22:23:30 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 16 Dec 1999 22:23:30 +0100
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com>
Message-ID: <38595852.E8054741@lemburg.com>

"James C. Ahlstrom" wrote:
> 
> "James C. Ahlstrom" wrote:
> 
> >    ftp://ftp.interet.com/pub/pylib.html
> 
> I just changed zipfile.py so that regular zip compression
> works.  And if zlib is available,
> its crc32() is used instead of the Python version.
> 
> I should mention that the current code rejects zip files which have
> an archive comment added to the end.  Accepting them would require
> a search, and I am not sure it is worth it.

I don't think it is needed for our purposes, but maybe a
subclass could provide it ?
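
The search in question could look roughly like this.  It is a sketch
under stated assumptions, not the code in zipfile.py: it assumes the
standard end-of-central-directory layout (4-byte "PK\005\006" signature,
22 fixed bytes in total, with a 2-byte little-endian comment length at
offset 20) and a comment of at most 65535 bytes.  The name find_eocd is
made up; the demo uses the zipfile module from today's standard library
only to build a test archive in memory.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\005\006"
EOCD_FIXED = 22        # size of the record without the trailing comment
MAX_COMMENT = 0xFFFF   # the comment length field is 16 bits wide


def find_eocd(fp):
    """Return (offset, comment) of the end record, or None if not found."""
    fp.seek(0, 2)
    size = fp.tell()
    tail_len = min(size, EOCD_FIXED + MAX_COMMENT)
    fp.seek(size - tail_len)
    tail = fp.read(tail_len)

    # Scan backwards; accept a candidate only if its recorded comment
    # length reaches exactly to the end of the file.
    pos = tail.rfind(EOCD_SIG)
    while pos >= 0:
        if pos + EOCD_FIXED <= len(tail):
            (comment_len,) = struct.unpack("<H", tail[pos + 20:pos + 22])
            if pos + EOCD_FIXED + comment_len == len(tail):
                return size - tail_len + pos, tail[pos + EOCD_FIXED:]
        pos = tail.rfind(EOCD_SIG, 0, pos)
    return None


buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("hello.txt", "hi")
    zf.comment = b"trailing archive comment"

print(find_eocd(buf))
```

The length check matters because the signature bytes could also occur
inside the comment itself.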

FYI, I've tested the module against mxStack-0.3.0.zip which 
you can find on my Python Pages. It was created using Info-ZIP's
zip 2.2 on Linux.

Unfortunately, I always get the following traceback when trying
to print the directory:

>>> z.open('../projects/distribution/mxStack-0.3.0.zip','rb')
>>> z.dir()
File Name                             Modified             Size
Stack/mxStack/mxStack.h        1999-04-16 10:50:06         4368
Stack/mxStack/mxstdlib.h       1999-04-13 15:37:52         5433
Traceback (innermost last):
  File "<stdin>", line 1, in ?
  File "/home/lemburg/lib/zipfile.py", line 120, in dir
    bytes = self.read(name)     # Just to check CRC-32
  File "/home/lemburg/lib/zipfile.py", line 133, in read
    bytes = zlib.decompress(bytes, -15)
zlib.error: Error -5 while decompressing data

Some notes on the API:
----------------------
* I would find it more convenient if the filename and mode
would be constructor parameters, e.g.

	zfile = zipfile('myfile.zip','rb')

with compression defaulting to 8 rather than 0 (most zip files
will be deflated since this is the ZIP default).

* Also, I would like a method much like the os.listdir()
which returns a list of filenames rather than print it
to stdout.

* .is_zipfile() should probably be a separate function: it
doesn't use any of the class' features.
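
For comparison, the zipfile module in today's standard library has the
shape asked for in these three notes: the file and mode go to the
constructor, namelist() returns the names, and is_zipfile() is a
module-level function.  The example below is only a sketch of that
interface using an in-memory archive; the member names are placeholders.

```python
import io
import zipfile

# Build a small archive in memory so the example needs no files on disk.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("Stack/mxStack/mxStack.h", "/* header */\n")
    zf.writestr("Stack/README", "demo\n")

# Module-level check; no archive object is needed.
print(zipfile.is_zipfile(buf))

# File and mode are constructor parameters; namelist() returns a list.
with zipfile.ZipFile(buf, "r") as zf:
    names = zf.namelist()
print(names)
```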

More wishes to come ;-)

So far: Great Work !

Aside: I found that you are using undocumented arguments to
zlib.compressobj() ... are these extra arguments left out of
the documentation on purpose or by simple oversight ? I couldn't
find them in the HTML docs and neither in the docstrings.
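
As a sketch of what those extra arguments do, using the zlib module as
it is documented today (which may not match the 1999 signatures
exactly): compressobj() accepts level, method and wbits, and a negative
wbits such as -15 selects a raw deflate stream with no zlib header or
checksum, which is what zip members contain.  The same -15 must then be
given to decompress().

```python
import zlib

data = b"batteries included " * 20

# Positional arguments: level, method, wbits.  wbits=-15 means raw deflate.
compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
raw = compressor.compress(data) + compressor.flush()

# Decompression has to use the same negative wbits value.
restored = zlib.decompress(raw, -15)
print(len(data), len(raw), restored == data)

# With the default wbits, zlib expects a header that a raw stream lacks.
try:
    zlib.decompress(raw)
except zlib.error as exc:
    print("zlib.error:", exc)
```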

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    15 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From gstein at lyra.org  Thu Dec 16 22:32:09 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 16 Dec 1999 13:32:09 -0800 (PST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <38595852.E8054741@lemburg.com>
Message-ID: <Pine.LNX.4.10.9912161330570.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, M.-A. Lemburg wrote:
>...
> Some notes on the API:
> ----------------------
> * I would find it more convenient if the filename and mode
> would be constructor parameters, e.g.
> 
> 	zfile = zipfile('myfile.zip','rb')
> 
> with compression defaulting to 8 rather than 0 (most zip files
> will be deflated since this is the ZIP default).
> 
> * Also, I would like a method much like the os.listdir()
> which returns a list of filenames rather than print it
> to stdout.

The above two items were in my ramble, just not as clear as MAL :-)

> * .is_zipfile() should probably be a separate function: it
> doesn't use any of the class' features.

Ah! Good call. It is even more important to shift it out if the
constructor now opens a file.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From fdrake at acm.org  Thu Dec 16 22:33:36 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 16 Dec 1999 16:33:36 -0500 (EST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <38595852.E8054741@lemburg.com>
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org>
	<3857B97E.3684224F@interet.com>
	<38590844.769C3025@interet.com>
	<38591E65.4885A39D@interet.com>
	<38595852.E8054741@lemburg.com>
Message-ID: <14425.23216.636687.704436@weyr.cnri.reston.va.us>

M.-A. Lemburg writes:
 > Aside: I found that you are using undocumented arguments to
 > zlib.compressobj() ... are these extra arguments left out of
 > the documentation on purpose or by simple oversight ? I couldn't
 > find them in the HTML docs and neither in the docstrings.

  The documentation is way out of date and Jeremy Hylton and Andrew
Kuchling haven't updated it.  I'm not sure which of them changed the
signatures for that module, but I've pestered Jeremy about it a few
times.
  If anyone would like to update the documentation, I'd certainly
appreciate it.  I don't know the details of those interfaces, and this 
is somewhere where the details are pretty critical.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From bwarsaw at cnri.reston.va.us  Fri Dec 17 00:10:11 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Thu, 16 Dec 1999 18:10:11 -0500 (EST)
Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released
References: <199912161527.KAA08308@eric.cnri.reston.va.us>
	<Pine.LNX.4.10.9912161011050.16305-100000@nebula.lyra.org>
	<14425.13449.954026.960703@anthem.cnri.reston.va.us>
	<14425.14305.907618.978628@dolphin.mojam.com>
Message-ID: <14425.29011.429867.485070@anthem.cnri.reston.va.us>

>>>>> "SM" == Skip Montanaro <skip at mojam.com> writes:

    SM> To remember which way it is, I find it useful to execute
    SM> "uname -r", check the second digit, then look down at my shirt
    SM> for a pocket protector.  The two pieces of information
    SM> together work for me.  I currently get "2.2.13-4mdk" from
    SM> uname.  I don't even have a pocket, let alone a pocket
    SM> protector, so even numbers must be stable releases...

What do you do if it's the second Thursday after the full moon, and
the local hockey team has just skated to a 3-3 tie?

-Barry


From mal at lemburg.com  Thu Dec 16 22:53:36 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 16 Dec 1999 22:53:36 +0100
Subject: [Python-Dev] Batteries Included?
References: <199912161902.OAA11345@eric.cnri.reston.va.us>
Message-ID: <38595F60.7C1B34FF@lemburg.com>

Guido van Rossum wrote:
> 
> I like the batteries included approach, but I also feel resistance
> against including stuff I cannot maintain. 
> ...
> This isn't rocket science.  Red Hat Python?  I'm all for it! :-)

I think we should wait for distutils to get up and running
perfectly for everyone before taking such a step.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    15 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From gstein at lyra.org  Fri Dec 17 09:31:38 1999
From: gstein at lyra.org (Greg Stein)
Date: Fri, 17 Dec 1999 00:31:38 -0800 (PST)
Subject: [Python-Dev] Batteries Included?
In-Reply-To: <38595F60.7C1B34FF@lemburg.com>
Message-ID: <Pine.LNX.4.10.9912170027530.16305-100000@nebula.lyra.org>

On Thu, 16 Dec 1999, M.-A. Lemburg wrote:
> Guido van Rossum wrote:
> > I like the batteries included approach, but I also feel resistance
> > against including stuff I cannot maintain. 

This is an interesting comment, and is similar to the Apache sentiment.
Nothing gets added to the standard distribution unless somebody in the
Group is willing to maintain it. It provides a good mechanism for keeping
the module set to a reasonable size and a set that can/will actually be
maintained.

> > ...
> > This isn't rocket science.  Red Hat Python?  I'm all for it! :-)
> 
> I think we should wait for distutils to get up and running
> perfectly for everyone before taking such a step.

You can also operate on the assumption that it will be done by the time
1.6 is ready to be released. In other words: do the work (distutils and
minimizing the release) in parallel, rather than in sequence.

I would also think that a large distro isn't going to be assembled with
distutils. Somebody will sit down, pull all the components together, and
make a big release.

However, I do see the distutils as being needed for the people who grab
the minimal distro. They need it to grab add'l packages.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From fredrik at pythonware.com  Fri Dec 17 10:06:20 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 17 Dec 1999 10:06:20 +0100
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912161147360.16305-100000@nebula.lyra.org>
Message-ID: <022e01bf486d$fcc901d0$f29b12c2@secret.pythonware.com>

James C. Ahlstrom wrote:
> > Did anyone look at this yet?
> > 
> >    ftp://ftp.interet.com/pub/pylib.html
> > 
> >    ftp://ftp.interet.com/pub/zipfile.py
> 
> I went to look for it, but I think that was before you put zipfile up.

just a few comments (from reading the docs):

-- it would be great if "open" could take an open file
object as well as a file name.

(in this case, you also need to document what you
expect from the underlying file object: read, write,
seek, tell should be enough, right?  haven't looked
at the code -- assuming it works, I'm only interested
in the interface)

-- or you could nuke "open" and pass those arguments
to the constructor instead.

-- I assume "open" adds "b" to the given mode argument.

-- "dir" looks a bit strange.  and hey, there's no "listdir"
in there.  I'd prefer a recursive "listdir" method, which
takes an optional "depth" argument (e.g. 0=this dir,
1=this dir and first subdir, None=infinity, i.e. the full
tree).
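
One possible reading of that "depth" argument, sketched over a flat list
of member names.  The function is hypothetical and not part of the
module under discussion; it assumes names use "/" as the separator and
counts how many directories deep each name sits.

```python
def listdir(names, depth=None):
    """Return the member names that sit at most `depth` directories deep.

    depth=0 keeps top-level entries only, depth=1 adds the first level of
    subdirectories, and depth=None returns the full tree.
    """
    if depth is None:
        return list(names)
    return [n for n in names if n.rstrip("/").count("/") <= depth]


members = ["README", "Stack/mxStack.h", "Stack/sub/deep.h"]
print(listdir(members, 0))
print(listdir(members, 1))
print(listdir(members))
```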

that's all for now.

</F>


From fredrik at pythonware.com  Fri Dec 17 13:21:03 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 17 Dec 1999 13:21:03 +0100
Subject: [Python-Dev] posix module
References: <14423.61493.90107.433664@weyr.cnri.reston.va.us>
Message-ID: <036d01bf4889$30b63470$f29b12c2@secret.pythonware.com>

> Ok, I think I'm done with the posix module updates, modulo bugs and
> additional symbols for the *conf*() tables.

gcc  -g -O2 -I./../Include -I.. -DHAVE_CONFIG_H -c ./posixmodule.c
./posixmodule.c:3789: `_SC_AIO_LIST_MAX' undeclared here (not in a function)
./posixmodule.c:3789: initializer element for `posix_constants_sysconf[10].value' is not constant
make[1]: *** [posixmodule.o] Error 1
make[1]: Leaving directory `/data/repository/BleedingEdge/python/dist/src/Modules'

(current CVS stuff, on Red Hat 5.2)

</F>


From jim at interet.com  Fri Dec 17 15:33:31 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 17 Dec 1999 09:33:31 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912161147360.16305-100000@nebula.lyra.org>
Message-ID: <385A49BB.4D064240@interet.com>

Greg Stein wrote:
> 
> On Thu, 16 Dec 1999, James C. Ahlstrom wrote:
> > Did anyone look at this yet?
> >
> >    ftp://ftp.interet.com/pub/pylib.html
> >
> >    ftp://ftp.interet.com/pub/zipfile.py
> 
> Looking at it now...  The writepy() as a method is questionable, I think.
> I think it should open the file at instantiation time. I don't see a
> reason to allow that to be deferred. Especially given that some of the
> methods fail if open() hasn't been called.

I eliminated open and added its args to the constructor.

> It would be good to have
> symbolic names for the 0 and 8 compression constants, and to fail if 8 is
> passed and zlib is not available (otherwise, it doesn't fail until
> read/write time, and with a NameError). There should probably be a
> __del__ that calls close(). Oh, and a "closed" attribute that can be
> checked and an error raised if an operation is done after the file has
> been closed.

All done.

> I think dir() should return the contents, rather than print
> them.

I added listdir() and documented self.TOC.  I kept printdir()
as example code.

> read() and write() ought to fail if the mode is incorrect. Oh, some
> symbolic constants for things like "PK\005\006" would be nice.

All done.

JimA


From guido at CNRI.Reston.VA.US  Fri Dec 17 15:43:23 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 17 Dec 1999 09:43:23 -0500
Subject: [Python-Dev] Batteries Included?
In-Reply-To: Your message of "Thu, 16 Dec 1999 22:53:36 +0100."
             <38595F60.7C1B34FF@lemburg.com> 
References: <199912161902.OAA11345@eric.cnri.reston.va.us>  
            <38595F60.7C1B34FF@lemburg.com> 
Message-ID: <199912171443.JAA12414@eric.cnri.reston.va.us>

> Guido van Rossum wrote:
> > 
> > I like the batteries included approach, but I also feel resistance
> > against including stuff I cannot maintain. 
> > ...
> > This isn't rocket science.  Red Hat Python?  I'm all for it! :-)

MAL:
> I think we should wait for distutils to get up and running
> perfectly for everyone before taking such a step.

Fair enough -- but in the mean time, no more pushing for new modules
in the core distribution (distutils excluded).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gward at cnri.reston.va.us  Fri Dec 17 15:59:09 1999
From: gward at cnri.reston.va.us (Greg Ward)
Date: Fri, 17 Dec 1999 09:59:09 -0500
Subject: [Python-Dev] Batteries Included?
In-Reply-To: <199912171443.JAA12414@eric.cnri.reston.va.us>; from guido@cnri.reston.va.us on Fri, Dec 17, 1999 at 09:43:23AM -0500
References: <199912161902.OAA11345@eric.cnri.reston.va.us> <38595F60.7C1B34FF@lemburg.com> <199912171443.JAA12414@eric.cnri.reston.va.us>
Message-ID: <19991217095908.B8799@cnri.reston.va.us>

On 17 December 1999, Guido van Rossum said:
> Fair enough -- but in the mean time, no more pushing for new modules
> in the core distribution (distutils excluded).

So anyone who wants a new module snuck into the core just has to
convince me to add it to the distutils package, right?  >snicker<

        Greg


From jeremy at cnri.reston.va.us  Fri Dec 17 19:30:37 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Fri, 17 Dec 1999 13:30:37 -0500 (EST)
Subject: [Python-Dev] Batteries Included?
In-Reply-To: <199912171443.JAA12414@eric.cnri.reston.va.us>
References: <199912161902.OAA11345@eric.cnri.reston.va.us>
	<38595F60.7C1B34FF@lemburg.com>
	<199912171443.JAA12414@eric.cnri.reston.va.us>
Message-ID: <14426.33101.757523.853781@goon.cnri.reston.va.us>

>>>>> "GvR" == Guido van Rossum <guido at CNRI.Reston.VA.US> writes:

  >> Guido van Rossum wrote:  I like the batteries included
  >> approach, but I also feel resistance  against including stuff I
  >> cannot maintain.   ...   This isn't rocket science.  Red Hat
  >> Python?  I'm all for it! :-)

  >> MAL wrote:
  >> I think we should wait for distutils to get up and running
  >> perfectly for everyone before taking such a step.

  GvR> Fair enough -- but in the mean time, no more pushing for new
  GvR> modules in the core distribution (distutils excluded).

Perhaps the right long-term solution (post-distutils) is to split
Python into a core architected by Guido and a bazaar-style standard
library maintained in a more apache-style.

Jeremy


From jim at interet.com  Fri Dec 17 16:25:10 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 17 Dec 1999 10:25:10 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com>
Message-ID: <385A55D6.A8A05EB9@interet.com>

"M.-A. Lemburg" wrote:

> Unfortunately, I always get the following traceback when trying
> to print the directory:

OK, I changed the decompress code (10:23 AM), please re-try.

> with compression defaulting to 8 rather than 0 (most zip files
> will be deflated since this is the ZIP default).

The compress mode only applies to writing.  On read, the
method recorded in the file controls.
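
The same behaviour can be seen with the zipfile module in today's
standard library, shown here as an illustration rather than as the code
being discussed.  The compression argument sets the default for writing,
a single member can override it, and on reading each member's recorded
method is used regardless of what the reader passes.

```python
import io
import zipfile

buf = io.BytesIO()
# The archive default is "stored"; one member overrides it with deflate.
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("stored.txt", "a" * 100)
    zf.writestr("deflated.txt", "a" * 100,
                compress_type=zipfile.ZIP_DEFLATED)

# No compression argument is needed to read; the file records the method.
with zipfile.ZipFile(buf, "r") as zf:
    methods = {info.filename: info.compress_type for info in zf.infolist()}
    text = zf.read("deflated.txt")

print(methods)
print(text == b"a" * 100)
```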

JimA


From jim at interet.com  Fri Dec 17 15:49:20 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 17 Dec 1999 09:49:20 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912161147360.16305-100000@nebula.lyra.org> <022e01bf486d$fcc901d0$f29b12c2@secret.pythonware.com>
Message-ID: <385A4D70.A162C584@interet.com>

Fredrik Lundh wrote:
> 
> James C. Ahlstrom wrote:
> > >
> > >    ftp://ftp.interet.com/pub/pylib.html

> -- it would be great if "open" could take an open file
> object as well as a file name.

I put these arguments into the constructor now.

> (in this case, you also need to document what you
> expect from the underlying file object: read, write,
> seek, tell should be enough, right?  haven't looked
> at the code -- assuming it works, I'm only interested
> in the interface)

OK, docs updated.

> -- I assume "open" adds "b" to the given mode argument.

Correct.  The mode can be either "w" or "wb" etc., and it works.

> -- "dir" looks a bit strange.  and hey, there's no "listdir"
> in there.  I'd prefer a recursive "listdir" method, which
> takes an optional "depth" argument (e.g. 0=this dir,
> 1=this dir and first subdir, None=infinity, i.e. the full
> tree).

I added a plain listdir() and changed dir() to printdir().  I also
documented self.TOC which gets you the values too.

JimA


From jim at interet.com  Fri Dec 17 15:39:51 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 17 Dec 1999 09:39:51 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com>
Message-ID: <385A4B37.333B9443@interet.com>

"M.-A. Lemburg" wrote:
> 
> "James C. Ahlstrom" wrote:
> > >    ftp://ftp.interet.com/pub/pylib.html
> >

> Unfortunately, I always get the following traceback when trying
> to print the directory:

Yes, compression isn't there yet.  I am looking into it.
 
> Some notes on the API:
> ----------------------
> * I would find it more convenient if the filename and mode
> would be constructor parameters, e.g.
> 
>         zfile = zipfile('myfile.zip','rb')

OK, done.
 
> with compression defaulting to 8 rather than 0 (most zip files
> will be deflated since this is the ZIP default).

Until compression works, and zlib ships with Python I
would rather default to no compression (method 0).  Otherwise
this is not useful as a Python import archive.
 
> * Also, I would like a method much like the os.listdir()
> which returns a list of filenames rather than print it
> to stdout.

OK, done.
 
> * .is_zipfile() should probably be a separate function: it
> doesn't use any of the class' features.

OK, done.
  
> Aside: I found that you are using undocumented arguments to
> zlib.compressobj() ... are these extra arguments left out of
> the documentation on purpose or by simple oversight ? I couldn't
> find them in the HTML docs and neither in the docstrings.

I am following the CNRI code blindly here.  I don't have
docs either.

JimA


From jack at oratrix.nl  Fri Dec 17 23:54:03 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Fri, 17 Dec 1999 23:54:03 +0100
Subject: [Python-Dev] Batteries Included? 
In-Reply-To: Message by Jeremy Hylton <jeremy@cnri.reston.va.us> ,
	     Fri, 17 Dec 1999 13:30:37 -0500 (EST) , <14426.33101.757523.853781@goon.cnri.reston.va.us> 
Message-ID: <19991217225408.D9AF3EDD20@oratrix.oratrix.nl>

Recently, Jeremy Hylton <jeremy at cnri.reston.va.us> said:
> Perhaps the right long-term solution (post-distutils) is to split
> Python into a core architected by Guido and a bazaar-style standard
> library maintained in a more apache-style.

I can't help feeling uncomfortable with this. I've had quite some work 
to get an Apache with SSL up and running, even though someone gave me
quite precise instructions. With Perl I fared even worse, despite
their distutils-like package, when I wanted to try a PalmPilot package 
for Unix that needed Perl. I finally had to give up after quite some
effort because the addon installers kept finding the older version of
Perl that the system manager had installed instead of my newer version.

I think distutils will be wonderful for us, the Python community, but
something more RedHattish is needed for the general world who just want 
Python plus a certain set of extensions because some application needs 
it, so they can just download a fresh copy of ParrotPython 3.4.4 and
know the application will work, without interfering with another
application that happens to use Inquisition 1a5 and lives elsewhere on 
the disk.

And maybe the answer is a much simpler freezing process, like
MacPython BuildApplication where any Python user can drop a script on
it and end up with a fully self-contained app guaranteed (well.... No
reports to the contrary have been heard so far, at least:-) to contain
everything needed and not interfere with an existing MacPython
installation (or be interfered with by it). Then a popular app will
have prebuilt binaries available for all platforms quickly, made by
the Python community, and the enduser interested in the app but not in 
Python can simply download that.

--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From mal at lemburg.com  Sat Dec 18 14:17:52 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 18 Dec 1999 14:17:52 +0100
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A4B37.333B9443@interet.com>
Message-ID: <385B8980.11CDE9AC@lemburg.com>

"James C. Ahlstrom" wrote:
> 
> "M.-A. Lemburg" wrote:
> >
> > "James C. Ahlstrom" wrote:
> > > >    ftp://ftp.interet.com/pub/pylib.html
> > >
> 
> > Unfortunately, I always get the following traceback when trying
> > to print the directory:
> 
> Yes, compression isn't there yet.  I am looking into it.

Great :-)
 
> > Some notes on the API:
> > ----------------------
> > * I would find it more convenient if the filename and mode
> > would be constructor parameters, e.g.
> >
> >         zfile = zipfile('myfile.zip','rb')
> 
> OK, done.
> 
> > with compression defaulting to 8 rather than 0 (most zip files
> > will be deflated since this is the ZIP default).
> 
> Until compression works, and zlib ships with Python I
> would rather default to no compression (method 0).  Otherwise
> this is not useful as a Python import archive.

Point taken.

Perhaps it would be even better to not have a
default at all: that way people will have to think about the
issue *before* implementing it, rather than debug code
that produces tracebacks.

> > * Also, I would like a method much like the os.listdir()
> > which returns a list of filenames rather than print it
> > to stdout.
> 
> OK, done.
> 
> > * .is_zipfile() should probably be a separate function: it
> > doesn't use any of the class' features.
> 
> OK, done.

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    13 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal at lemburg.com  Sat Dec 18 16:16:44 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 18 Dec 1999 16:16:44 +0100
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com>
Message-ID: <385BA55C.9DFCA88D@lemburg.com>

"James C. Ahlstrom" wrote:
> 
> "M.-A. Lemburg" wrote:
> 
> > Unfortunately, I always get the following traceback when trying
> > to print the directory:
> 
> OK, I changed the decompress code (10:23 AM), please re-try.

Everything is fine now... it's really impressive how easily
you can manipulate ZIP files with it.

One thing I'd suggest is to include some way to delete and
update contents, e.g. the write() method should overwrite
any existing entry in the archive (if it does not already --
I haven't tested it, just read the code and it seems to raise
an exception), plus maybe a .remove() method which deletes
an entry.
 
> > with compression defaulting to 8 rather than 0 (most zip files
> > will be deflated since this is the ZIP default).
> 
> The compress mode only applies to writing.  On read, the
> method recorded in the file controls.

True. How about making the compression argument mandatory
for files opened in 'wb' mode only ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    13 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From da at ski.org  Sat Dec 18 18:35:00 1999
From: da at ski.org (David Ascher)
Date: Sat, 18 Dec 1999 09:35:00 -0800
Subject: [Python-Dev] Year 2000 O'Reilly Python Conference
Message-ID: <003501bf497e$368f6f60$e655cfc0@ski.org>

I just got off the phone with someone at O'Reilly, who is starting to plan
the next O'Reilly Open Source Convention.  I've agreed to be the chair of
the Python conference, just so that there are no delays in getting the
conference organized.  If someone feels that I should not be chair, speak
now and we can figure out who takes the 'job'.

There are short-term and long-term issues to discuss:

Short term:

- We need a program committee -- If you're interested in being on said
committee or know someone who should be, let me know. I'd like to get
representatives from various subconstituencies on there (web types, zope
types, business types, scientist types, linux types, hackers, etc.)

- The call for papers is going on the O'Reilly website soon.  I will try and
get them to pass things by me first, but if we want to emphasize specific
kinds of paper submissions, we need to decide that soon.

- Greg or Barry, is it possible for one of you to setup a mailman mailing
list which will be used by the program committee?  eGroups is easy for me to
setup, but lots of people hated it last year.  I don't want to pollute
python-dev with conference discussions.

Longer term:

- The schedule for the conference is (supposedly) going to be the same as
last year: conference-wide keynotes at the beginning of both days, and
4x90minute segments.

- We have two parallel tracks

- We have 4 half-day tutorial slots

- All of the paper materials have to be 'in' by March 1.  We need to decide
how much time we need to go through the review/revision process ourselves.
In other words, the deadline for submissions is up to us, but we don't have
that much time.

--david ascher


From jeremy at cnri.reston.va.us  Sat Dec 18 23:39:58 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Sat, 18 Dec 1999 17:39:58 -0500 (EST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <385A4B37.333B9443@interet.com>
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org>
	<3857B97E.3684224F@interet.com>
	<38590844.769C3025@interet.com>
	<38591E65.4885A39D@interet.com>
	<38595852.E8054741@lemburg.com>
	<385A4B37.333B9443@interet.com>
Message-ID: <14428.3390.671438.663889@bitdiddle.cnri.reston.va.us>

>>>>> "JCA" == James C Ahlstrom <jim at interet.com> writes:

  >> Aside: I found that you are using undocumented arguments to
  >> zlib.compressobj() ... are these extra arguments left out of the
  >> documentation on purpose or by simple oversight ? I couldn't find
  >> them in the HTML docs and neither in the docstrings.

  JCA> I am following the CNRI code blindly here.  I don't have docs
  JCA> either.

The docs for the zlib module are quite out of date, although I think
the docstrings may be better (not necessarily completely up-to-date
though :-).  The specific parameters to pass to zlib don't seem to be
documented anywhere either; IIRC I dug them out of some example C code
somewhere that used zlib to read Zip files.

Jeremy


From gstein at lyra.org  Sun Dec 19 00:14:02 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 18 Dec 1999 15:14:02 -0800 (PST)
Subject: [Python-Dev] Year 2000 O'Reilly Python Conference
In-Reply-To: <003501bf497e$368f6f60$e655cfc0@ski.org>
Message-ID: <Pine.LNX.4.10.9912181513020.16305-100000@nebula.lyra.org>

On Sat, 18 Dec 1999, David Ascher wrote:
>...
> - Greg or Barry, is it possible for one of you to setup a mailman mailing
> list which will be used by the program committee?  eGroups is easy for me to
> setup, but lots of people hated it last year.  I don't want to pollute
> python-dev with conference discussions.

Done. ora-pc at pythonpros.com.
http://mailman.pythonpros.com/mailman/listinfo/ora-pc

I also removed the old monterey-speakers mailing list :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From da at ski.org  Sun Dec 19 08:24:51 1999
From: da at ski.org (David Ascher)
Date: Sat, 18 Dec 1999 23:24:51 -0800
Subject: [Python-Dev] Year 2000 O'Reilly Python Conference
References: <Pine.LNX.4.10.9912181513020.16305-100000@nebula.lyra.org>
Message-ID: <013301bf49f2$243946f0$df55cfc0@ski.org>

From: Greg Stein <gstein at lyra.org>
> On Sat, 18 Dec 1999, David Ascher wrote:
> >...
> > - Greg or Barry, is it possible for one of you to setup a mailman mailing
> > list which will be used by the program committee?

> Done. ora-pc at pythonpros.com.
> http://mailman.pythonpros.com/mailman/listinfo/ora-pc

Thanks, Greg.

Now, folks, please consider joining the program committee.  We need a few
volunteers - not too many, but somewhere between 5 and 10 would be good.
You don't even have to commit to making it to the conference, if that's a
concern.

-- david


From jim at interet.com  Mon Dec 20 15:18:17 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 20 Dec 1999 09:18:17 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912161147360.16305-100000@nebula.lyra.org>
Message-ID: <385E3AA9.162BE568@interet.com>

Greg Stein wrote:

> Do you have a ZipImporter written?

Yes, it is ftp://ftp.interet.com/pub/importer.py

JimA


From jim at interet.com  Mon Dec 20 15:35:58 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 20 Dec 1999 09:35:58 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com>
Message-ID: <385E3ECE.F8DCDE28@interet.com>

"M.-A. Lemburg" wrote:

> One thing I'd suggest is to include some way to delete and
> update contents, e.g. the write() method should overwrite
> any existing entry in the archive (if it does not already --
> I haven't tested it, just read the code and it seems to raise
> an exception), plus maybe a .remove() method which deletes
> an entry.

Currently, adding a file requires the "a" append mode, while
the "w" mode re-writes the file.  Adding a duplicate file name
produces an error message.  I can change this,
but removing a file would either waste space, or else the file
contents must be copied over the old file and all the offsets
updated.  I don't like this because it is complicated, and I think
it is fast enough to just re-write the archive.  But it
could be added if people want.
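
The "just re-write the archive" approach can be sketched as follows.
This is only an illustration written against the zipfile module as it
later shipped in the standard library, not the code under discussion;
rewrite_without() is a hypothetical helper name, and the archive is
handled as bytes in memory to keep the example short.

```python
import io
import zipfile

def rewrite_without(src_bytes, drop_names):
    """Return new archive bytes containing every member except drop_names."""
    drop = set(drop_names)
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(src_bytes), "r") as src, \
         zipfile.ZipFile(out, "w") as dst:
        for info in src.infolist():
            if info.filename in drop:
                continue  # skipping the member is the whole "delete" step
            # Copy the data across, keeping each member's compression method.
            dst.writestr(info.filename, src.read(info.filename),
                         compress_type=info.compress_type)
    return out.getvalue()
```

No offsets are patched in place: the new archive is built from scratch,
so no space is wasted and the central directory is always consistent.
The cost is one full copy of the remaining members.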

> True. How about making the compression argument mandatory
> for file opened in 'wb' mode only ?

The default of zero provides a little guidance that you should
use zero.  I added a warning message if 8 is used which should
discourage people from using 8.  Or I could disallow 8.
Is that OK?

JimA


From jim at interet.com  Mon Dec 20 16:34:02 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 20 Dec 1999 10:34:02 -0500
Subject: [Python-Dev] Batteries Included?
References: <19991217225408.D9AF3EDD20@oratrix.oratrix.nl>
Message-ID: <385E4C6A.BEC0F728@interet.com>

Jack Jansen wrote:

> And maybe the answer is a much simpler freezing process, like
> MacPython BuildApplication where any Python user can drop a script on
> it and end up with a fully self-contained app guaranteed (well.... No
> reports to the contrary have been heard so far, at least:-) to contain
> everything needed and not interfere with an existing MacPython
> installation (or be interfered with by it). Then a popular app will
> have prebuilt binaries available for all platforms quickly, made by
> the Python community, and the enduser interested in the app but not in
> Python can simply download that.

IMHO the "much simpler freezing process" is archive files.  A simple
script can build them, imputil can import them, and the only
remaining problem is to find them.  Please see:

ftp://ftp.interet.com/pub/bootmodule.html
ftp://ftp.interet.com/pub/pylib.html

JimA


From jack at oratrix.nl  Mon Dec 20 17:50:32 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Mon, 20 Dec 1999 17:50:32 +0100
Subject: [Python-Dev] Batteries Included? 
In-Reply-To: Message by "James C. Ahlstrom" <jim@interet.com> ,
	     Mon, 20 Dec 1999 10:34:02 -0500 , <385E4C6A.BEC0F728@interet.com> 
Message-ID: <19991220165032.B9E8B370CF2@snelboot.oratrix.nl>

> IMHO the "much simpler freezing process" is archive files.  A simple
> script can build them, imputil can import them, and the only
> remaining problem is to find them.  Please see:

Archive files solve the problem for Python modules. But that leaves the 
problem of dynamically loaded modules. And resources for dialogs and such, if 
you use native GUI stuff on Mac or Windows.

And most serious applications that I've seen (GRiNS and Zope, to name two, 
Mailman is the only exception I can think of) depend on non-standard plugin 
modules.
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From mal at lemburg.com  Mon Dec 20 15:44:42 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 20 Dec 1999 15:44:42 +0100
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com>
Message-ID: <385E40DA.37AD704F@lemburg.com>

"James C. Ahlstrom" wrote:
> 
> "M.-A. Lemburg" wrote:
> 
> > One thing I'd suggest is to include some way to delete and
> > update contents, e.g. the write() method should overwrite
> > any existing entry in the archive (if it does not already --
> > I haven't tested it, just read the code and it seems to raise
> > an exception), plus maybe a .remove() method which deletes
> > an entry.
> 
> Currently, adding a file requires the "a" append mode, while
> the "w" mode re-writes the file.  Adding a duplicate file name
> produces an error message.  I can change this,
> but removing a file would either waste space, or else the file
> contents must be copied over the old file and all the offsets
> updated.  I don't like this because it is complicated, and I think
> it is fast enough to just re-write the archive.  But it
> could be added if people want.

I guess it would be ok to waste space. You could provide
a .cleanup() or .rewrite() method that takes care of
reorganizing the file to fill up the gaps.
 
> > True. How about making the compression argument mandatory
> > for file opened in 'wb' mode only ?
> 
> The default of zero provides a little guidance that you should
> use zero.  I added a warning message if 8 is used which should
> discourage people from using 8.  Or I could disallow 8.
> Is that OK?

Well the module seems to work just fine with compression
on, so disallowing it or issuing a warning would reduce its value,
IMHO. How about making compression a boolean value and then
converting any true value to 8 ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    11 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From fdrake at acm.org  Mon Dec 20 19:52:41 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 20 Dec 1999 13:52:41 -0500 (EST)
Subject: [Python-Dev] posix module
In-Reply-To: <036d01bf4889$30b63470$f29b12c2@secret.pythonware.com>
References: <14423.61493.90107.433664@weyr.cnri.reston.va.us>
	<036d01bf4889$30b63470$f29b12c2@secret.pythonware.com>
Message-ID: <14430.31481.402469.896400@weyr.cnri.reston.va.us>

Fredrik Lundh writes:
 > (current CVS stuff, on Red Hat 5.2)

  Ok, Guido figured it out; this is a typo in the header
/usr/include/confname.h; the enum and the #define don't have the same
name.
  Do you know a way to detect the Linux kernel version using
preprocessor macros?  (Seems very fragile.)  Would it be
reasonable to only add that table entry for kernel versions >= 2.2?


  -Fred

--
Fred L. Drake, Jr.	     <fdrake at acm.org>
Corporation for National Research Initiatives


From jim at interet.com  Mon Dec 20 20:25:27 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 20 Dec 1999 14:25:27 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com> <385E40DA.37AD704F@lemburg.com>
Message-ID: <385E82A7.72345807@interet.com>

"M.-A. Lemburg" wrote:

> I guess it would be ok to waste space. You could provide
> a .cleanup() or .rewrite() method that takes care of
> reorganizing the file to fill up the gaps.

OK, adding a duplicate name replaces the old file.

> Well the module seems to work just fine with compression
> on, so disallowing it or issuing a warning would reduce its value,
> IMHO.

Yes compression works, but 90% of Python installations don't have
zlib, so it is an ERROR to create archives with compression when
these archives are distributed to other sites.

> How about making compression a boolean value and then
> converting any true value to 8 ?

It would close the door to future or other compression methods.
Currently the method must be 0 or 8 or a traceback will result.
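
A small sketch of why an integer method is more flexible than a
boolean, assuming the zipfile module as it later shipped in the
standard library (constants ZIP_STORED and ZIP_DEFLATED).  The helper
names pick_method() and method_is_supported() are hypothetical, and the
fallback to method 0 when zlib is missing is one possible policy, not
the module's own behaviour.

```python
import io
import zipfile

try:
    import zlib  # optional: needed only for method 8 (deflate)
except ImportError:
    zlib = None

def pick_method(want_compression):
    """Return a ZIP method number: 8 if wanted and available, else 0."""
    if want_compression and zlib is not None:
        return zipfile.ZIP_DEFLATED  # 8
    return zipfile.ZIP_STORED        # 0

def method_is_supported(method):
    """True if ZipFile accepts this method number for writing."""
    try:
        with zipfile.ZipFile(io.BytesIO(), "w", compression=method):
            pass
    except RuntimeError:  # NotImplementedError is a subclass of RuntimeError
        return False
    return True
```

Because the argument is a number, a new method only needs a new
constant; a true/false flag would have to be replaced.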

JimA


From jim at interet.com  Mon Dec 20 20:33:11 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 20 Dec 1999 14:33:11 -0500
Subject: [Python-Dev] Batteries Included?
References: <19991220165032.B9E8B370CF2@snelboot.oratrix.nl>
Message-ID: <385E8477.F727E0F8@interet.com>

Jack Jansen wrote:

> Archive files solve the problem for Python modules. But that leaves the
> problem of dynamically loaded modules. And resources for dialogs and such, if
> you use native GUI stuff on Mac or Windows.

Point taken.

For dynamically loaded modules, I believe in following the
native system's DLL path, and not adding eccentric Python
logic.  But many disagreed a couple of weeks ago when I raised this.

For resources, I think the archive file can accommodate this,
although it seems highly system dependent.

Anyway, any file at all can live in the archive and the import
mechanism for *.pyc will not be damaged nor unduly slowed down
by its presence.
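
To illustrate a module and an unrelated resource sharing one archive,
here is a hedged sketch.  It relies on the ZIP import support that
later became part of standard Python (an archive path placed on
sys.path), not on the imputil-based importer.py mentioned in this
thread; the archive, module and resource names are invented for the
example.

```python
import importlib
import os
import sys
import tempfile
import zipfile

def build_and_import():
    """Build an archive holding a module plus a resource, then import it."""
    workdir = tempfile.mkdtemp()
    archive = os.path.join(workdir, "pylib_demo.zip")  # hypothetical name
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr("demo_archived_mod.py",
                    "GREETING = 'hello from the archive'\n")
        zf.writestr("resources/dialog.txt", "not a module\n")

    # An archive on sys.path is searched like a directory of modules.
    sys.path.insert(0, archive)
    try:
        importlib.invalidate_caches()
        module = importlib.import_module("demo_archived_mod")
    finally:
        sys.path.remove(archive)

    # The resource is ignored by import and is read back as ordinary data.
    with zipfile.ZipFile(archive) as zf:
        resource = zf.read("resources/dialog.txt")
    return module.GREETING, resource
```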

JimA


From gstein at lyra.org  Mon Dec 20 21:11:50 1999
From: gstein at lyra.org (Greg Stein)
Date: Mon, 20 Dec 1999 12:11:50 -0800 (PST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <385E82A7.72345807@interet.com>
Message-ID: <Pine.LNX.4.10.9912201208290.16305-100000@nebula.lyra.org>

On Mon, 20 Dec 1999, James C. Ahlstrom wrote:
> "M.-A. Lemburg" wrote:
> > I guess it would be ok to waste space. You could provide
> > a .cleanup() or .rewrite() method that takes care of
> > reorganizing the file to fill up the gaps.
> 
> OK, adding a duplicate name replaces the old file.

But it shouldn't print a warning(!). If an application wants to replace a
file, then stuff shouldn't appear on stdout as a result.

> > Well the module seems to work just fine with compression
> > on, so disallowing it or issuing a warning would reduce its value,
> > IMHO.
> 
> Yes compression works, but 90% of Python installations don't have
> zlib, so it is an ERROR to create archives with compression when
> these archives are distributed to other sites.

While it may be a problem to distribute them to other sites, that is not up
to the library. If I want compression, then I should get compression. A
library module should not determine application-level policy.

The warning that __init__ prints shouldn't be there.

Really: there should not be a single "print" in the library (well,
printdir() is fine... that's what it is supposed to do; printing in the
test code would be fine). In normal, or even exceptional(!), operation
there should never be a print.
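
One way to keep a library silent by default, sketched under
assumptions: QuietArchive is a hypothetical stand-in class, not the
zipfile code, and the debug level and output stream are passed in
explicitly so that the application, not the library, decides whether
anything is written.

```python
import sys

class QuietArchive:
    """Toy container that reports replacements only when debugging is on."""

    def __init__(self, debug=0, stream=None):
        self.debug = debug
        # Diagnostics go to stderr unless the caller supplies a stream.
        self.stream = stream if stream is not None else sys.stderr
        self.entries = {}

    def _trace(self, message):
        # Silent in normal operation; output appears only if debug > 0.
        if self.debug > 0:
            self.stream.write(message + "\n")

    def write(self, name, data):
        # Replacing an existing entry is allowed and is not an error.
        if name in self.entries:
            self._trace("replacing existing entry %r" % name)
        self.entries[name] = data
```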

> > How about making compression a boolean value and then
> > converting any true value to 8 ?
> 
> It would close the door to future or other compression methods.
> Currently the method must be 0 or 8 or a traceback will result.

I definitely agree with JimA here. For example, maybe we want bzip
compression in there. Sure, non-portable, but that's my problem :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From jim at interet.com  Mon Dec 20 21:50:46 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 20 Dec 1999 15:50:46 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912201208290.16305-100000@nebula.lyra.org>
Message-ID: <385E96A6.40CCF285@interet.com>

Greg Stein wrote:
> 
> On Mon, 20 Dec 1999, James C. Ahlstrom wrote:
> > "M.-A. Lemburg" wrote:
> But it shouldn't print a warning(!). If an application wants to replace a
> file, then stuff shouldn't appear on stdout as a result.

OK, no warning.
 
> The warning that __init__ prints shouldn't be there.

OK, it is gone.
 
> Really: there should not be a single "print" in the library (well,

No print unless _debug > 0

JimA


From mal at lemburg.com  Mon Dec 20 22:16:39 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 20 Dec 1999 22:16:39 +0100
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com> <385E40DA.37AD704F@lemburg.com> <385E82A7.72345807@interet.com>
Message-ID: <385E9CB7.5DE4848A@lemburg.com>

"James C. Ahlstrom" wrote:
> 
> "M.-A. Lemburg" wrote:
> 
> > I guess it would be ok to waste space. You could provide
> > a .cleanup() or .rewrite() method that takes care of
> > reorganizing the file to fill up the gaps.
> 
> OK, adding a duplicate name replaces the old file.

Cool.
 
> > Well the module seems to work just fine with compression
> > on, so disallowing it or issuing a warning would reduce its value,
> > IMHO.
> 
> Yes compression works, but 90% of Python installations don't have
> zlib, so it is an ERROR to create archives with compression when
> these archives are distributed to other sites.

Sure, for the sake of creating Python code archives, but
your module is much more versatile: e.g. I could automatically
create ZIP archives of log files or sets of other files and
then have Python email them to someone who uses these archives
through standard tools such as WinZip -- the target doesn't always
have to be a Python process :-)
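
A sketch of the log-archiving use case, assuming the zipfile module as
it later shipped in the standard library and a Python built with zlib.
archive_logs() is a hypothetical helper and the ".log" suffix is an
assumption made for the example; mailing the result is left out.

```python
import os
import zipfile

def archive_logs(log_dir, archive_path):
    """Pack every *.log file in log_dir into a deflated ZIP archive."""
    names = sorted(n for n in os.listdir(log_dir) if n.endswith(".log"))
    # Method 8 (deflate) is what general-purpose ZIP tools expect to read.
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in names:
            # arcname stores the bare file name rather than the full path.
            zf.write(os.path.join(log_dir, name), arcname=name)
    return names
```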

> > How about making compression a boolean value and then
> > converting any true value to 8 ?
> 
> It would close the door to future or other compression methods.
> Currently the method must be 0 or 8 or a traceback will result.

Ok.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    11 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From jim at interet.com  Mon Dec 20 22:37:20 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 20 Dec 1999 16:37:20 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com> <385E40DA.37AD704F@lemburg.com> <385E82A7.72345807@interet.com> <385E9CB7.5DE4848A@lemburg.com>
Message-ID: <385EA190.6AF511BD@interet.com>

"M.-A. Lemburg" wrote:
>
> Sure, for the sake of creating Python code archives, but
> your module is much more versatile: e.g. I could automatically
> create ZIP archives of log files or sets of other files and

OK, zipfile.py no longer complains about compression != 0

JimA


From fdrake at acm.org  Tue Dec 21 23:42:26 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue, 21 Dec 1999 17:42:26 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: <199912212238.RAA13660@eric.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
Message-ID: <14432.594.33416.600794@weyr.cnri.reston.va.us>

Guido van Rossum writes:
 > + 
 > + class GetoptError(Exception):
 > +     opt = ''
 > +     msg = ''
 > +     def __init__(self, *args):
 > +         self.args = args
 > +         if len(args) == 1:
 > +             self.msg = args[0]
 > +         elif len(args) == 2:
 > +             self.msg = args[0]
 > +             self.opt = args[1]
 > + 
 > +     def __str__(self):
 > +         return self.msg
 >   
 > ! error = GetoptError # backward compatibility

  This breaks as soon as the standard exceptions are strings; does
this mean -X will be removed in the next release?  (Please????)
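
For reference, a minimal sketch of how calling code can use the class
quoted above.  It is written in current Python syntax, which postdates
this thread, and parse() is a hypothetical wrapper; the msg and opt
attributes and the backward-compatible alias getopt.error come from the
quoted patch.

```python
import getopt

def parse(argv):
    """Return (options, args), or (None, description) on a bad option."""
    try:
        return getopt.getopt(argv, "ab:")
    except getopt.GetoptError as exc:
        # msg is the human-readable text, opt the offending option name.
        return None, "%s (option %r)" % (exc.msg, exc.opt)
```

Code that still catches getopt.error keeps working, since that name is
bound to the same class.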


  -Fred

--
Fred L. Drake, Jr.	  <fdrake at acm.org>
Corporation for National Research Initiatives


From bwarsaw at cnri.reston.va.us  Tue Dec 21 23:44:46 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Tue, 21 Dec 1999 17:44:46 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
	<14432.594.33416.600794@weyr.cnri.reston.va.us>
Message-ID: <14432.734.155183.508785@anthem.cnri.reston.va.us>

>>>>> "Fred" == Fred L Drake, Jr <fdrake at acm.org> writes:

    Fred>   This breaks as soon as the standard exceptions are
    Fred> strings; does this mean -X will be removed in the next
    Fred> release?  (Please????)

Pretty please? :)


From guido at CNRI.Reston.VA.US  Wed Dec 22 00:05:28 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 21 Dec 1999 18:05:28 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Tue, 21 Dec 1999 17:42:26 EST."
             <14432.594.33416.600794@weyr.cnri.reston.va.us> 
References: <199912212238.RAA13660@eric.cnri.reston.va.us>  
            <14432.594.33416.600794@weyr.cnri.reston.va.us> 
Message-ID: <199912212305.SAA13722@eric.cnri.reston.va.us>

> Guido van Rossum writes:
>  > + 
>  > + class GetoptError(Exception):
>  > +     opt = ''
>  > +     msg = ''
>  > +     def __init__(self, *args):
>  > +         self.args = args
>  > +         if len(args) == 1:
>  > +             self.msg = args[0]
>  > +         elif len(args) == 2:
>  > +             self.msg = args[0]
>  > +             self.opt = args[1]
>  > + 
>  > +     def __str__(self):
>  > +         return self.msg
>  >   
>  > ! error = GetoptError # backward compatibility

[Fred Drake]

>   This breaks as soon as the standard exceptions are strings; does
> this mean -X will be removed in the next release?  (Please????)

Not a bad idea.

Anybody got a reason why -X should stay?

(The next step would be to outlaw raise with a string argument; I
think I can't make that for 1.6.  But it would be a good idea to scan
the standard library for string exceptions and convert all of them.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bwarsaw at cnri.reston.va.us  Wed Dec 22 00:21:38 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Tue, 21 Dec 1999 18:21:38 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
	<14432.594.33416.600794@weyr.cnri.reston.va.us>
	<199912212305.SAA13722@eric.cnri.reston.va.us>
Message-ID: <14432.2946.857539.898577@anthem.cnri.reston.va.us>

>>>>> "Guido" == Guido van Rossum <guido at cnri.reston.va.us> writes:

    Guido> Anybody got a reason why -X should stay?

Kill it.

    Guido> (The next step would be to outlaw raise with a string
    Guido> argument; I think I can't make that for 1.6.  But it would
    Guido> be a good idea to scan the standard library for string
    Guido> exceptions and convert all of them.)

Or require that exception classes be derived from exceptions.Exception
:)

-Barry


From guido at CNRI.Reston.VA.US  Wed Dec 22 00:23:29 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 21 Dec 1999 18:23:29 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Tue, 21 Dec 1999 18:21:38 EST."
             <14432.2946.857539.898577@anthem.cnri.reston.va.us> 
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us>  
            <14432.2946.857539.898577@anthem.cnri.reston.va.us> 
Message-ID: <199912212323.SAA13803@eric.cnri.reston.va.us>

[Barry]
>     Guido> Anybody got a reason why -X should stay?
> 
> Kill it.

You already said that.

Anybody else?

>     Guido> (The next step would be to outlaw raise with a string
>     Guido> argument; I think I can't make that for 1.6.  But it would
>     Guido> be a good idea to scan the standard library for string
>     Guido> exceptions and convert all of them.)
> 
> Or require that exception classes be derived from exceptions.Exception
> :)

That's hard to require.  But it could easily be a requirement checked
by one of the hypothetical typecheckers that are being discussed in
the types-sig.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bwarsaw at cnri.reston.va.us  Wed Dec 22 00:27:31 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Tue, 21 Dec 1999 18:27:31 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
	<14432.594.33416.600794@weyr.cnri.reston.va.us>
	<199912212305.SAA13722@eric.cnri.reston.va.us>
	<14432.2946.857539.898577@anthem.cnri.reston.va.us>
	<199912212323.SAA13803@eric.cnri.reston.va.us>
Message-ID: <14432.3299.404561.698836@anthem.cnri.reston.va.us>

>>>>> "Guido" == Guido van Rossum <guido at cnri.reston.va.us> writes:

    BAW> Or require that exception classes be derived from
    BAW> exceptions.Exception :)

    Guido> That's hard to require.  But it could easily be a
    Guido> requirement checked by one of the hypothetical typecheckers
    Guido> that are being discussed in the types-sig.

Hmm, the raise could probably enforce this, but it might not be that
useful.

-Barry


From guido at CNRI.Reston.VA.US  Wed Dec 22 00:40:22 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 21 Dec 1999 18:40:22 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Tue, 21 Dec 1999 18:27:31 EST."
             <14432.3299.404561.698836@anthem.cnri.reston.va.us> 
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us>  
            <14432.3299.404561.698836@anthem.cnri.reston.va.us> 
Message-ID: <199912212340.SAA13851@eric.cnri.reston.va.us>

> >>>>> "Guido" == Guido van Rossum <guido at cnri.reston.va.us> writes:
> 
>     BAW> Or require that exception classes be derived from
>     BAW> exceptions.Exception :)
> 
>     Guido> That's hard to require.  But it could easily be a
>     Guido> requirement checked by one of the hypothetical typecheckers
>     Guido> that are being discussed in the types-sig.
> 
> Hmm, the raise could probably enforce this, but it might not be that
> useful.
> 
> -Barry

The raise could easily enforce this, but it would break lots of
existing code.

I wish I had done it right from the start -- then exceptions would
have been classes from the start and would have required inheritance
from the Exception base class.  Like in Java.  (And in C++?)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bwarsaw at CNRI.Reston.VA.US  Wed Dec 22 00:43:59 1999
From: bwarsaw at CNRI.Reston.VA.US (Barry A. Warsaw)
Date: Tue, 21 Dec 1999 18:43:59 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
	<14432.594.33416.600794@weyr.cnri.reston.va.us>
	<199912212305.SAA13722@eric.cnri.reston.va.us>
	<14432.2946.857539.898577@anthem.cnri.reston.va.us>
	<199912212323.SAA13803@eric.cnri.reston.va.us>
	<14432.3299.404561.698836@anthem.cnri.reston.va.us>
	<199912212340.SAA13851@eric.cnri.reston.va.us>
Message-ID: <14432.4287.543786.308468@anthem.cnri.reston.va.us>

>>>>> "Guido" == Guido van Rossum <guido at CNRI.Reston.VA.US> writes:

    Guido> The raise could easily enforce this, but it would break
    Guido> lots of existing code.

Maybe not (I'm not sure).  All the standard exceptions inherit from
Exception, and of course there'd be nothing to enforce for existing
user-defined string based exceptions.  How pervasive are user-defined
class based exceptions that don't inherit from Exception?  (I don't
know, and I haven't grepped, but I think we've been making that
recommendation from day 1 of class-based standard exceptions, and I
try to follow this recommendation in my own code).

    Guido> I wish I had done it right from the start -- then
    Guido> exceptions would have been classes from the start and would
    Guido> have required inheritance from the Exception base class.
    Guido> Like in Java.  (And in C++?)

All Hail, Python 2.0, our Savior and Redeemer! :)

-Barry


From guido at CNRI.Reston.VA.US  Wed Dec 22 00:49:09 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 21 Dec 1999 18:49:09 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Tue, 21 Dec 1999 18:43:59 EST."
             <14432.4287.543786.308468@anthem.cnri.reston.va.us> 
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> <199912212340.SAA13851@eric.cnri.reston.va.us>  
            <14432.4287.543786.308468@anthem.cnri.reston.va.us> 
Message-ID: <199912212349.SAA13892@eric.cnri.reston.va.us>

> From: "Barry A. Warsaw" <bwarsaw at cnri.reston.va.us>

> >>>>> "Guido" == Guido van Rossum <guido at CNRI.Reston.VA.US> writes:
> 
>     Guido> The raise could easily enforce this, but it would break
>     Guido> lots of existing code.
> 
> Maybe not (I'm not sure).  All the standard exceptions inherit from
> Exception, and of course there'd be nothing to enforce for existing
> user-defined string based exceptions.  How pervasive are user-defined
> class based exceptions that don't inherit from Exception?  (I don't
> know, and I haven't grepped, but I think we've been making that
> recommendation from day 1 of class-based standard exceptions, and I
> try to follow this recommendation in my own code).

Yes, but class-based user exceptions existed many Python versions
before class-based standard exceptions!

Two examples in the standard library: ConfigParser.py and xdrlib.py.

> All Hail, Python 2.0, our Savior and Redeemer! :)

Or, the perfect excuse for procrastination :)

(But yes, 2.0 will enforce this.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gstein at lyra.org  Wed Dec 22 00:53:50 1999
From: gstein at lyra.org (Greg Stein)
Date: Tue, 21 Dec 1999 15:53:50 -0800 (PST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib
 getopt.py,1.7,1.8
In-Reply-To: <199912212305.SAA13722@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912211552380.16305-100000@nebula.lyra.org>

On Tue, 21 Dec 1999, Guido van Rossum wrote:
>...
> [Fred Drake]
> >   This breaks as soon as the standard exceptions are strings; does
> > this mean -X will be removed in the next release?  (Please????)
> 
> Not a bad idea.
> 
> Anybody got a reason why -X should stay?

Kill it.

> (The next step would be to outlaw raise with a string argument; I
> think I can't make that for 1.6.  But it would be a good idea to scan
> the standard library for string exceptions and convert all of them.)

Keep string exceptions. I think there is probably a lot of code that still
uses them. I know I do :-)

We can issue warnings about string exceptions via the type-checking tool.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From bwarsaw at CNRI.Reston.VA.US  Wed Dec 22 00:54:04 1999
From: bwarsaw at CNRI.Reston.VA.US (Barry A. Warsaw)
Date: Tue, 21 Dec 1999 18:54:04 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
	<14432.594.33416.600794@weyr.cnri.reston.va.us>
	<199912212305.SAA13722@eric.cnri.reston.va.us>
	<14432.2946.857539.898577@anthem.cnri.reston.va.us>
	<199912212323.SAA13803@eric.cnri.reston.va.us>
	<14432.3299.404561.698836@anthem.cnri.reston.va.us>
	<199912212340.SAA13851@eric.cnri.reston.va.us>
	<14432.4287.543786.308468@anthem.cnri.reston.va.us>
	<199912212349.SAA13892@eric.cnri.reston.va.us>
Message-ID: <14432.4892.908107.421149@anthem.cnri.reston.va.us>

>>>>> "Guido" == Guido van Rossum <guido at CNRI.Reston.VA.US> writes:

    Guido> Yes, but class-based user exceptions existed many Python
    Guido> versions before class-based standard exceptions!

True, but I suspect that legacy class-based user exceptions are rare.
I might be wrong, but you're absolutely right that these would all be
broken.

    Guido> Two examples in the standard library: ConfigParser.py and
    Guido> xdrlib.py.

Fortunately these are fixed with two 11 character patches :)

I'm not necessarily arguing for or against tightening this.

-Barry


From gmcm at hypernet.com  Wed Dec 22 00:55:07 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Tue, 21 Dec 1999 18:55:07 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: <199912212340.SAA13851@eric.cnri.reston.va.us>
References: Your message of "Tue, 21 Dec 1999 18:27:31 EST."             <14432.3299.404561.698836@anthem.cnri.reston.va.us> 
Message-ID: <1266302877-22249299@hypernet.com>

[Guido]

> I wish I had done it right from the start -- then exceptions
> would have been classes from the start and would have required
> inheritance from the Exception base class.  Like in Java.  (And
> in C++?)

In C++ you can throw anything at all. Strings, ints, that 
Warsaw blockhead...

off-topic-ly y'rs

- Gordon


From tismer at appliedbiometrics.com  Wed Dec 22 01:57:27 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 22 Dec 1999 01:57:27 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib 
 getopt.py,1.7,1.8
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us>  
	            <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us>
Message-ID: <386021F7.4F94C458@appliedbiometrics.com>


Guido van Rossum wrote:
> 
> [Barry]
> >     Guido> Anybody got a reason why -X should stay?
> >
> > Kill it.
> 
> You already said that.
> 
> Anybody else?

I'd say kill -X, but keep allowing string exceptions if
it doesn't cost too much. I think of C++, like Gordon said.

Also I'd take the chance and move the exceptions Python
module back into the core, as a frozen module or whatever.

Reason: At the moment, the CVS version of the Python library
is incompatible with 1.5.2, which makes testing against the
standard dist quite inconvenient. A compiled CVS Python
does not run under PythonWin when I put it into my standard
installation. Or is there an easy way to switch all settings
to a completely different path?

Anyway, I'm most probably off until Y2K.

See ya all then, provided we survive - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From guido at CNRI.Reston.VA.US  Wed Dec 22 02:01:16 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 21 Dec 1999 20:01:16 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Wed, 22 Dec 1999 01:57:27 +0100."
             <386021F7.4F94C458@appliedbiometrics.com> 
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us>  
            <386021F7.4F94C458@appliedbiometrics.com> 
Message-ID: <199912220101.UAA14109@eric.cnri.reston.va.us>

> I'd say kill -X, but keep allowing string exceptions if
> it doesn't cost too much. I think of C++, like Gordon said.

Agreed.

> Also I'd take the chance and move the exceptions Python
> module back into the core, as a frozen module or whatever.
> 
> Reason: At the moment, the CVS version of the Python library
> is incompatible with 1.5.2, which makes testing against the
> standard dist quite inconvenient. A compiled CVS Python
> does not run under PythonWin when I put it into my standard
> installation. Or is there an easy way to switch all settings
> to a completely different path?

Point the PYTHONHOME variable to the top of your install directory.
(On Windows you may have to kill the registry settings -- this is a
bug.)

> Anyway, I'm most probably off until Y2K.

Ditto.

> See ya all then, provided we survive - chris

Best wishes to all,

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim at digicool.com  Wed Dec 22 14:54:41 1999
From: jim at digicool.com (Jim Fulton)
Date: Wed, 22 Dec 1999 08:54:41 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib 
 getopt.py,1.7,1.8
References: <199912212238.RAA13660@eric.cnri.reston.va.us>  
	            <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us>
Message-ID: <3860D821.576B3146@digicool.com>

Guido van Rossum wrote:
> 
> (The next step would be to outlaw raise with a string argument; I
> think I can't make that for 1.6.  But it would be a good idea to scan
> the standard library for string exceptions and convert all of them.)

This would be waaaaay too big a change for Python 1.x. There are a lot
of Python modules outside the standard distribution that use string
exceptions. This would be a huge backward incompatibility.

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission.  Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.


From fdrake at acm.org  Wed Dec 22 15:23:29 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 22 Dec 1999 09:23:29 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: <199912212305.SAA13722@eric.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
	<14432.594.33416.600794@weyr.cnri.reston.va.us>
	<199912212305.SAA13722@eric.cnri.reston.va.us>
Message-ID: <14432.57057.535205.558@weyr.cnri.reston.va.us>

Guido van Rossum writes:
 > (The next step would be to outlaw raise with a string argument; I
 > think I can't make that for 1.6.  But it would be a good idea to scan
 > the standard library for string exceptions and convert all of them.)

  I don't know if requiring class-based exceptions will make the
runtime any simpler, but that seems the only reason to do it.
  The only reason to remove -X, and possibly the string exception
fallback code, is to ensure that we *can* subclass Exception and
friends without having to catch TypeError and do something different.


  -Fred

--
Fred L. Drake, Jr.	  <fdrake at acm.org>
Corporation for National Research Initiatives


From fdrake at acm.org  Wed Dec 22 15:25:33 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 22 Dec 1999 09:25:33 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: <14432.2946.857539.898577@anthem.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
	<14432.594.33416.600794@weyr.cnri.reston.va.us>
	<199912212305.SAA13722@eric.cnri.reston.va.us>
	<14432.2946.857539.898577@anthem.cnri.reston.va.us>
Message-ID: <14432.57181.944364.427093@weyr.cnri.reston.va.us>

Barry A. Warsaw writes:
 > Or require that exception classes be derived from exceptions.Exception
 > :)

  Ok, it's early, and maybe I haven't had enough coffee(!).  But is
this serious?  Does JPython gain some benefit from this, is it your
preference, or are you just yanking on my leg?  ("Pulling my arm" as
my 5-year-old says!)


  -Fred

--
Fred L. Drake, Jr.	  <fdrake at acm.org>
Corporation for National Research Initiatives


From guido at CNRI.Reston.VA.US  Wed Dec 22 15:40:39 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 22 Dec 1999 09:40:39 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Wed, 22 Dec 1999 09:23:29 EST."
             <14432.57057.535205.558@weyr.cnri.reston.va.us> 
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us>  
            <14432.57057.535205.558@weyr.cnri.reston.va.us> 
Message-ID: <199912221440.JAA16198@eric.cnri.reston.va.us>

> From: "Fred L. Drake, Jr." <fdrake at acm.org>
> 
> Guido van Rossum writes:
>  > (The next step would be to outlaw raise with a string argument; I
>  > think I can't make that for 1.6.  But it would be a good idea to scan
>  > the standard library for string exceptions and convert all of them.)
> 
>   I don't know if requiring class-based exceptions will make the
> runtime any simpler, but that seems the only reason to do it.

Do what?  *Require* class exceptions?  You're probably right, and I
think the gain is minimal.

There's another reason to scan the std library though -- not to set a
bad example.  I want to eventually (in 2.0) move to a
class-derived-from-Exception-only scheme.

>   The only reason to remove -X, and possibly the string exception
> fallback code, is to ensure that we *can* subclass Exception and
> friends without having to catch TypeError and do something different.

And that's a very good reason indeed.

Let me repeat my plans for 1.6.

- Remove -X; the standard exceptions are always class-based.

- Change all standard library and other example code to use
class-based exceptions with a standard exception as base class, to set
an example.

- Still allow string exceptions in user code.

- Still allow class exceptions that don't use a standard exception
base class in user code.

--Guido van Rossum (home page: http://www.python.org/~guido/)
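
The second item in the plan above (class-based exceptions with a standard
exception as base class) can be sketched as follows. This is only an
illustration: the names ParseError, MissingSectionError, lookup and
safe_lookup are hypothetical and are not taken from any standard library
module. It shows the benefit Fred mentions, namely that a handler for a base
class also catches its subclasses, which string exceptions cannot offer.

```python
class ParseError(Exception):
    """Hypothetical base class for all errors raised by one module."""


class MissingSectionError(ParseError):
    """Hypothetical subclass that carries extra data as an attribute."""

    def __init__(self, section):
        ParseError.__init__(self, "no section: %r" % (section,))
        self.section = section


def lookup(sections, name):
    # Raise the specific subclass; callers may catch it or its base class.
    if name not in sections:
        raise MissingSectionError(name)
    return sections[name]


def safe_lookup(sections, name, default=None):
    # Catching the base class also catches every subclass of it.
    try:
        return lookup(sections, name)
    except ParseError:
        return default
```

A caller that only cares whether the module failed can catch ParseError,
while one that needs details can catch MissingSectionError and read its
section attribute.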


From marangoz at python.inrialpes.fr  Wed Dec 22 19:09:47 1999
From: marangoz at python.inrialpes.fr (Vladimir Marangozov)
Date: Wed, 22 Dec 1999 19:09:47 +0100 (CET)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: <199912221440.JAA16198@eric.cnri.reston.va.us> from "Guido van Rossum" at Dec 22, 1999 09:40:39 AM
Message-ID: <199912221809.TAA25322@python.inrialpes.fr>

Guido van Rossum wrote:
> 
> [Fred Drake]
> >   I don't know if requiring class-based exceptions will make the
> > runtime any simpler, but that seems the only reason to do it.
> 
> Do what?  *Require* class exceptions?  You're probably right, and I
> think the gain is minimal.

Yes. Besides, I still think that string-based exceptions are just
convenient for quick & dirty, throw-away test scripts.

> 
> Let me repeat my plans for 1.6.
> 
> - Remove -X; the standard exceptions are always class-based.
> 
> - Change all standard library and other example code to use
> class-based exceptions with a standard exception as base class, to set
> an example.
> 
> - Still allow string exceptions in user code.
> 
> - Still allow class exceptions that don't use a standard exception
> base class in user code.

Sounds okay.

---

PS: I'm particularly happy today :-) because I've finally published
 the new version of our Web site http://www.inrialpes.fr. Two things
 I'd like to mention:
 (1) it wouldn't have been possible without quick Python scripts ;)
 (2) I'll find the time to reinvoke some of the topics discussed here
     instead of being mute as a fish.

That said, Merry Christmas and a Happy New Year to all of you!

-- 
       Vladimir MARANGOZOV          | Vladimir.Marangozov at inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252


From guido at CNRI.Reston.VA.US  Wed Dec 22 19:23:45 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 22 Dec 1999 13:23:45 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Wed, 22 Dec 1999 19:09:47 +0100."
             <199912221809.TAA25322@python.inrialpes.fr> 
References: <199912221809.TAA25322@python.inrialpes.fr> 
Message-ID: <199912221823.NAA16517@eric.cnri.reston.va.us>

Vladimir.Marangozov at inrialpes.fr:

> Yes. Besides, I still think that string-based exceptions are just
> convenient for quick & dirty, throw-away test scripts.

They have a hard-to-understand quirk though: the id() of the string is
used to check rather than its value, so that except "foo" doesn't
necessarily catch raise "foo"; but due to various optimizations, this
usually works, and people get bent out of shape when it doesn't.
Since you have to give your exception a name, how hard is it to say

class MyError(Exception): pass

rather than

MyError = "MyError"

?

--Guido van Rossum (home page: http://www.python.org/~guido/)
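
The quirk described above can be modelled in a few lines. This is a sketch
under stated assumptions, not the interpreter's actual matching code: string
exceptions no longer exist in current Python, so the two helper functions
below (hypothetical names) only imitate the two possible comparison rules.
Whether two equal strings happen to be the same object is an implementation
detail, which is exactly why the old behaviour surprised people.

```python
def matches_by_identity(raised, caught):
    # The rule described above: compare object identity, like id(a) == id(b).
    return raised is caught


def matches_by_value(raised, caught):
    # The rule people usually expected: compare the string contents.
    return raised == caught


# Two equal strings built at runtime; they are typically distinct objects,
# but that depends on the implementation and its optimizations.
first = "".join(["f", "oo"])
second = "".join(["f", "oo"])

print("equal by value:   ", matches_by_value(first, second))
print("equal by identity:", matches_by_identity(first, second))
```

Binding the string to a single module-level name (MyError = "MyError") and
using that name in both raise and except avoids the problem, because both
sides then refer to the same object.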


From gstein at lyra.org  Wed Dec 22 19:33:19 1999
From: gstein at lyra.org (Greg Stein)
Date: Wed, 22 Dec 1999 10:33:19 -0800 (PST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib
 getopt.py,1.7,1.8
In-Reply-To: <199912221823.NAA16517@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.10.9912221031390.16305-100000@nebula.lyra.org>

On Wed, 22 Dec 1999, Guido van Rossum wrote:
> Vladimir.Marangozov at inrialpes.fr:
> > Yes. Besides, I still think that string-based exceptions are just
> > convenient for quick & dirty, throw-away test scripts.
> 
> They have a hard-to-understand quirk though: the id() of the string is
> used to check rather than its value, so that except "foo" doesn't
> necessarily catch raise "foo"; but due to various optimizations, this
> usually works, and people get bent out of shape when it doesn't.
> Since you have to give your exception a name, how hard is it to say
> 
> class MyError(Exception): pass
> 
> rather than
> 
> MyError = "MyError"
> 
> ?

It is very hard. My fingers do the typing for me, and they fill in
strings. I'm trying to teach them otherwise, but they insist.

You're also assuming that MyError gets defined. Sometimes, my little
fingers like typing:

  try:
    foo
  except:
    raise "foo broke for some reason"


Quick and dirty, indeed! :-)

Happy Holidays,
-g

-- 
Greg Stein, http://www.lyra.org/


From fdrake at acm.org  Wed Dec 22 20:59:55 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 22 Dec 1999 14:59:55 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: <199912212340.SAA13851@eric.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
	<14432.594.33416.600794@weyr.cnri.reston.va.us>
	<199912212305.SAA13722@eric.cnri.reston.va.us>
	<14432.2946.857539.898577@anthem.cnri.reston.va.us>
	<199912212323.SAA13803@eric.cnri.reston.va.us>
	<14432.3299.404561.698836@anthem.cnri.reston.va.us>
	<199912212340.SAA13851@eric.cnri.reston.va.us>
Message-ID: <14433.11707.607533.698901@weyr.cnri.reston.va.us>

Guido van Rossum writes:
 > I wish I had done it right from the start -- then exceptions would
 > have been classes from the start and would have required inheritance
 > from the Exception base class.  Like in Java.  (And in C++?)

  I've seen this said or hinted at in a couple of places (the specific 
requirement that exceptions derive from Exception), but I've seen
nothing that indicates any reason or derived value for this.  Could
someone please clarify?


  -Fred

--
Fred L. Drake, Jr.	  <fdrake at acm.org>
Corporation for National Research Initiatives


From guido at CNRI.Reston.VA.US  Wed Dec 22 21:05:52 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 22 Dec 1999 15:05:52 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Wed, 22 Dec 1999 14:59:55 EST."
             <14433.11707.607533.698901@weyr.cnri.reston.va.us> 
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> <199912212340.SAA13851@eric.cnri.reston.va.us>  
            <14433.11707.607533.698901@weyr.cnri.reston.va.us> 
Message-ID: <199912222005.PAA17291@eric.cnri.reston.va.us>

> From: "Fred L. Drake, Jr." <fdrake at acm.org>

> Guido van Rossum writes:
>  > I wish I had done it right from the start -- then exceptions would
>  > have been classes from the start and would have required inheritance
>  > from the Exception base class.  Like in Java.  (And in C++?)
> 
>   I've seen this said or hinted at in a couple of places (the specific 
> requirement that exceptions derive from Exception), but I've seen
> nothing that indicates any reason or derived value for this.  Could
> someone please clarify?

It's simply an extra bit of checking that your program is reasonable
-- if you accidentally raise a non-exception class, there's probably
something wrong with your program, and it gives the reader a hint
about the intended use of the class.

Other languages (e.g. Modula-3) have a specific exception type that
can be used only for that one purpose.  However it's useful to allow
methods and subclassing of exceptions, so they might as well be
classes.  So, all exceptions are classes.  But not all classes are
exceptions.

--Guido van Rossum (home page: http://www.python.org/~guido/)
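
For comparison, current Python 3 releases do perform this check in the raise
statement: raising something that does not derive from BaseException fails
with a TypeError. A minimal sketch of that behaviour follows; the class and
function names are hypothetical and chosen only for illustration.

```python
class PlainClass:
    """An ordinary class that was never meant to be raised."""


class AppError(Exception):
    """A class whose base class signals that it is an exception."""


def can_be_raised(cls):
    # Try to raise an instance and see what actually propagates.  For a
    # non-exception class the interpreter substitutes a TypeError, so the
    # caught object is not an instance of cls.
    try:
        raise cls()
    except BaseException as exc:
        return isinstance(exc, cls)


print(can_be_raised(AppError))
print(can_be_raised(PlainClass))
```

The base class therefore doubles as documentation: a reader who sees
AppError(Exception) knows the intended use of the class without reading
further.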


From gstein at lyra.org  Wed Dec 22 21:11:43 1999
From: gstein at lyra.org (Greg Stein)
Date: Wed, 22 Dec 1999 12:11:43 -0800 (PST)
Subject: [Python-Dev] Please test new dynamic load behavior
Message-ID: <Pine.LNX.4.10.9912221200290.16305-100000@nebula.lyra.org>

Hi all,

I reorganized Python's dynamic load/import code over the past few days.
Guido provided some feedback, I did some more mods, and now it is checked
into CVS. The new loading behavior has been tested on Linux, IRIX, and
Solaris (and probably Windows by now).

For people with CVS access, I'd like to ask that you grab an updated copy
and shake out the new code. There have been updates to the "configure"
process, so you'll need to run configure again. Make sure that you alter
your Modules/Setup to build some shared modules, and then try it out.

Here are some of the platforms that I believe need specific testing:

- NetBSD, FreeBSD, OpenBSD, ...
- AIX
- HP/UX
- BeOS
- NeXT
- Mac
- OS/2
- Win16

I believe it should work for most people, but we may be looking for the
wrong "init<module>" symbol on some platforms. We might even be selecting
the wrong import mechanism (or missing it altogether!) on some platforms.

If you get a chance to test this, then please drop me a note with your
platform and whether it succeeded or failed (and how it failed).

Thanx!
-g

p.s. you can tell if dynamic loading is missing by watching for
DYNLOADFILE in the configure process and seeing if it used dynload_stub.
alternatively, you can import the "imp" module and see if "load_dynamic"
is missing.

-- 
Greg Stein, http://www.lyra.org/


From gvwilson at nevex.com  Thu Dec 23 04:43:40 1999
From: gvwilson at nevex.com (gvwilson at nevex.com)
Date: Wed, 22 Dec 1999 22:43:40 -0500 (EST)
Subject: [Python-Dev] re: Open Source design competition / Python / software tools
Message-ID: <Pine.LNX.4.10.9912222242070.4839-200000@akbar.nevex.com>

Hi, folks.  I hope you don't mind another mail out of the blue, but I got
notice on Saturday that the Department of Energy is giving me $860K over
two years to support development of easier-to-use software engineering
tools.  All of the work will be Open Source, and will be done in Python,
with a strong emphasis on design, testing, and documentation.  The
project's long-term objective is to encourage scientists and engineers to
treat programs in the same way as they do other experiments, i.e. to
calibrate, test, peer review, and so on.

To kick-start things, we're going to be holding a two-round design
competition.  Anyone (individual or team, professional or student) can
submit a short entry for the first round; the judges will pick four
candidates to go forward in each of four categories, and those
individuals or teams will be asked to submit full entries. The four
categories are:

* an issue tracking system to replace Gnats and Bugzilla;

* a build system to replace make;

* a platform inspection and configuration system to replace autoconf;
  and

* a testing framework to replace XUnit, Expect, and DejaGnu.

Would you be interested in participating in any way---judging, entering a
design, critiquing things from the point of view of end users, or
anything else? I realize that you're probably up past your eyeballs with
work, and that the money on offer is nothing special, but I think this
could be a lot of fun, and could help to shift the emphasis of the Open
Source community from hacking to design (both by drawing attention to, and
rewarding, design, and by creating a corpus of examples and commentary for
programmers to refer to).  It could also make life a lot easier for
computational scientists and engineers...

Please let me know if you'd like to be involved, or if you'd like more
information than is contained in the FAQ (attached).  Timescales are a
bit tight---I'd like to be able to make an announcement on January
14---but I'll be reading email at this address several times a day
during the holiday.

I look forward to hearing from you,

Greg Wilson

p.s. please note that the attached FAQ is a first draft; I'd be grateful
if you could show it to anyone you think might be interested, but I'd
also be grateful if you wouldn't broadcast it until it's gone through 
one more editing pass.
-------------- next part --------------
<HTML>
<HEAD>
<TITLE>Software Carpentry FAQ</TITLE>
</HEAD>
<BODY>

<H1 ALIGN="CENTER">Software Carpentry FAQ</H1>


<H2>General information</H2>

<OL>

<LI><EM>What is the Software Carpentry project? </EM>
<BR>
The aim of the Software Carpentry project is to make it easier for
programmers in general, and scientific programmers in particular, to
adopt better software development practices. The project will achieve
this by creating tools that are easier to learn and use, and by
documenting those tools and the practices they embody.
</LI>

<LI><EM>Where does the name come from?</EM>
<BR>
The name is a play on "software engineering", and is meant to indicate
that this project is initially concerned with medium-sized teams (up
to a dozen or two programmers) and medium-term timescales (a year or
two).
</LI>

<LI><EM>How did the project get started?</EM>
<BR>
The project has its origins in a <A
HREF="http://www.acl.lanl.gov/sc/resources/cse/index.html">series of
articles</A> that Greg Wilson organized for the Fall 1996 and Winter
1996 issues of <CITE>IEEE Computational Science and
Engineering</CITE>. These articles outlined what their authors thought
computer scientists should teach to physical scientists and
engineers. Most authors recommended numerical methods or the standard
Unix toolset, but Steve McConnell argued that better programming
practices would have the greatest impact on productivity.

<BR> As a result of that observation, Greg Wilson, Brent Gorda, and
Steve McConnell put together a 3-day course on software engineering
for scientists and engineers, which they taught several times at the
Los Alamos National Laboratory. Feedback on the course was very
positive, but many participants felt that the tools being
taught---Perl, Make, CVS, and so on---were unnecessarily difficult to
install, learn, and use. They were also frustrated by the scarcity of
examples of design documents, testing plans, and all of the other
things the course was trying to teach them.
</LI>

<LI><EM>Why Open Source?</EM>
<BR>
There are three reasons why the Software Carpentry project is
following the Open Source model:
</LI>

	<OL>

	<LI><EM>Leveraging existing knowledge. </EM>
	<BR>
	A closed project can only take advantage of a few minds. As
	Linux and other projects have shown, a well-run Open Source
	project can harness the experience and insight of thousands of
	people.
	</LI>

	<LI><EM>Lowering barriers to adoption. </EM>
	<BR>
	Freely-available tools are more likely to be picked up than
	their commercial equivalents. This is particularly true when
	the tool in question does something novel (at least from the
	point of view of the person adopting it), and in academia (where
	budgets are limited).
	</LI>

	<LI><EM>Encouraging peer review.</EM>
	<BR>
	Dan Gezelter's <a
	href="http://www.openscience.org/talk/bnl/index.html">talk</a>
	at the first Open Source/Open Science conference discussed how
	the scientific tradition of peer review fits with the
	philosophy of the Open Source movement. By designing and
	building these tools in the open, the Software Carpentry
	project will both encourage peer review of the tools
	themselves, and demonstrate how this ought to be done for
	scientific and commercial software.
	</LI>

	</OL>

<LI><EM>Where does the funding come from? </EM>
<BR>
The funding comes from the U.S. Department of Energy, through the
Advanced Computing Laboratory at Los Alamos National Laboratory. The
project is being administered by Code Sourcery. US$480,000 has been
provided for 2000, and US$380,000 for 2001.
</LI>

<LI><EM>Why would the Department of Energy fund something like this?</EM>
<BR>
The funding has been provided partly because the DoE would like
scientists and engineers to be more productive, and partly because it
would like to find out whether the Open Source model and community can
meet the special needs of high-performance computational science. The
last few years have seen most manufacturers of special-purpose
supercomputers disappear or be bought out, and the rise of clusters
based on commercial off-the-shelf (COTS) hardware, Linux, MPI, the GNU
compiler toolset, and so on. There is a growing feeling that these
machines could bring scalable supercomputing into the mainstream, but
this will only happen if good tools and practices are accessible
enough.
</LI>

<LI><EM>I'm not a scientist or engineer---what's in it for me? </EM>
<BR>
The things that make many existing Open Source software development
tools difficult to learn and use---obscure syntax, arbitrary or
hard-to-follow behavior, and poor documentation---affect professional
programmers and computer science students just as much as they do
computational scientists and engineers. If the Open Source movement
can build tools that are simple enough to be learned by people who
have problems of their own to solve, and yet powerful enough to
support distributed development of hundreds of thousands of lines of
complex numerical and visualization code, then those tools will
probably also help people who want to build Internet chat rooms and
order-tracking systems.
<BR>
This project should also be interesting to the general programming
community because it is going to place more emphasis on design and
early feedback than most Open Source projects have to date. Instead of
growing someone's pet project, Software Carpentry is going to
organize---and pay for---a design competition. If this works, it could
be an interesting model for other Open Source projects to adopt.
</LI>

<LI><EM>I think [tool] is good enough already---why are you re-inventing the wheel? </EM>
<BR>
The short answer to this is Alan Cooper's:


	<BLOCKQUOTE>
	The phrase "computer literate user" really means the person
	has been hurt so many times that the scar tissue is thick
	enough so he no longer feels the pain.
	<BR>
	-- Alan Cooper,
	<CITE>The Inmates are Running the Asylum</CITE>
	</BLOCKQUOTE>

The longer answer is that the "accidental complexity" of the standard
Unix command-line toolset is a major barrier to its adoption by people
who are not full-time programmers, or for whom programming is just
something that has to be done in order to do something else. Many
professional programmers---particularly those who enjoy programming
enough to be involved in the Open Source movement---have been using
these tools for so long that they simply don't remember how hard it is
to configure Gnats, or pass variable bindings between recursive calls
to Make.
<BR>
And let's face it: if Make or Autoconf were built from scratch today,
they would be written as extensible, embeddable modules in a
high-level scripting language. This would not only make them easier to
use, it would also make them easier to learn, since they would employ
one syntax for all purposes. Microsoft Visual Basic has shown just how
useful it can be to have a single general-purpose "glue" language
capable of binding disparate tools together; the aim of the first half
of this project is to bring those benefits to the Open Source
community.
</LI>

</OL>

<H2>Development</H2>

<OL>

<LI><EM>What projects are currently under way? </EM>
<BR>Software Carpentry will start by producing:
</LI>

	<OL>

	<LI>a platform inspection tool similar to Autoconf;</LI>

	<LI>a build management tool similar to Make;</LI>

	<LI>an issue tracking system similar to Gnats or Bugzilla; and</LI>

	<LI>a unit and regression testing harness with the
	functionality of XUnit, Expect, and DejaGnu.</LI>

	</OL>

<LI><EM>Why were those tools chosen? </EM>
<BR>
These four tools were chosen as initial targets for several
reasons. First, the working practices they support are essential to
medium-scale software engineering. Second, the tools they are intended
to replace are generally recognized as being outdated or flawed. This
creates demand, and increases the odds that rational reimplementations
will be adopted. Third, enough people have enough experience with the
tools that are to be replaced to participate in the design competition
described later.
</LI>

<LI><EM>Why isn't [tool] on this list?</EM>
<BR>
There are several other tools that could have been on this list, and
will be added if the first round of work goes well. A cross-platform
version control system that corrects the many deficiencies in CVS, for
example, is an obvious candidate, but is probably too large to be
tackled initially, and any work done by Software Carpentry could well
be superseded by BitKeeper. Similarly, the world needs a good Open
Source project management tool with the functionality of Microsoft
Project, but probably needs the four tools listed above more urgently.
</LI>

<LI><EM>What languages and tools will be used? </EM>
<BR>
All development work will be done in Python.
</LI>

<LI><EM>Why Python? </EM>
<BR>
This is actually three questions:

	<OL>

	<LI><EM>Why mandate a language? </EM>
	<BR>
	Building everything in a single language will encourage
	projects to share code, which will both keep the total volume
	of code manageable and raise the quality of the
	implementations (since the shared code will be exercised, and
	tested, in many different ways). Using a single language will
	also improve the comprehensibility, and hence the
	maintainability and extensibility, of the tools. The varying
	syntax of Make, Autoconf, and other tools is a large practical
	barrier to their adoption by people who have better (or at
	least more pressing) things to do than learn yet another
	syntax. Microsoft's Visual Basic has shown how powerful it
	is to use a single, flexible language everywhere.
	</LI>

	<LI><EM>Why use a scripting language? </EM>
	<BR>
	A lot of anecdotal evidence shows that "relaxed" high-level
	languages (like Python, Perl, and Visual Basic) are more
	productive vehicles for process management, text processing,
	and similar tasks than their "strict" equivalents (like C++
	and Java).
	</LI>

	<LI><EM>Why use Python? </EM>
	<BR>
	The four candidates considered were Visual Basic, Perl, Tcl,
	and Python.

		<OL>

		<LI><EM>Visual Basic </EM>
		<BR>
		Visual Basic is proprietary, and there is no
		indication that a credible Open Source implementation
		will appear any time soon.
		</LI>

		<LI><EM>Perl</EM>
		<BR>
		Perl was a strong contender, primarily because of the
		many libraries that have been developed for it, and
		because of the number of books that document
		it. However, our experience teaching at Los Alamos was
		that Perl's syntax is hard to learn, its behavior
		often arbitrary, and its size intimidating. While
		full-time professional programmers with several other
		languages under their belts might (and often do) say
		that it all makes sense once you know it, we want to
		make the learning curve as gentle as possible.
		</LI>


		<LI><EM>Tcl</EM>
		<BR>
		Tcl is easier to learn and read than Perl, but is not
		as well documented, and doesn't come with as many
		libraries. Had Python not existed, Tcl would probably
		have been chosen for this project.
		</LI>

		<LI><EM>Python</EM>
		<BR>
		Python provides the same functionality as Perl or Tcl,
		but has proved to be easier to learn, read, and
		remember. (For example, words like "except" and
		"unless" appear much less often in Python reference
		material than they do in Perl reference material.)
		Python is not yet as extensively documented as Perl,
		but the number of books is growing, as is the number
		of modules and libraries. Finally, the Python
		community is still small enough for a project like
		this one to attract the attention of a significant
		proportion of it.
		</LI>

		</OL>
	</LI>
	</OL>

</LI>

<LI><EM>How will development be organized and coordinated? </EM>
<BR>
Everything the project produces---designs, critiques of those designs,
test suites, and examples, as well as actual source code---will be
available through the project's Web site at
software-carpentry.codesourcery.com. Each project will have a
coordinator, whose job it will be to moderate discussion, synchronize
releases, track work items, and report on progress. The coordinator
will also be responsible for collating and editing feedback from
judges during the design competition.
</LI>

</OL>

<H2>Design competition</H2>

<OL>

<LI><EM>Why a design competition?</EM>
<BR>
Most Open Source packages have their roots in someone's pet hobby
project, which others have picked up, extended, and modified. This
kind of organic growth has a lot of good features, but a
well-documented design is not one of them. As a result, programmers
often have to rely on folklore and reverse engineering if they want to
add to, or fix, these tools. In addition, there is a dearth of
examples of good design for new programmers to learn from. <BR> The
Software Carpentry project hopes to address both problems by running a
two-stage design competition. The best entries in both rounds will be
published, along with commentary from the competition's
judges. This material will serve both to inform and guide further
development, and to show novices what experienced programmers think
about before they start coding.
</LI>

<LI><EM>Who can enter? </EM>
<BR>
Everyone: individuals and teams, students and professionals, from
anywhere in the world.
</LI>

<LI><EM>What are the rules? </EM>
<BR>The full rules are available at:
<CENTER>
software-carpentry.codesourcery.com/design-competition/rules.html
</CENTER>
Basically, initial submissions must be written in English, and can be
up to 10 pages long. Examples count against this limit, but diagrams
and a Unix-style man page do not. Any person or team may submit only
one entry in any given category, but can submit in as many of the four
categories as desired.
<BR>
The best four entries in each category will be awarded US$2500, and
asked to submit full designs. Participants will be strongly encouraged
to pool their efforts for the second round. The best second-round
submission will be awarded an additional US$7500, while the others
will receive another US$2500 each. The real reward will be seeing the
design implemented, and being in a good position to bid on the
implementation work.
</LI>

<LI><EM>What should first-round submissions contain? </EM>
<BR>
An example of what a submission should contain, and how it should be
formatted, is available at:
<CENTER>
software-carpentry.codesourcery.com/design-competition/example.html
</CENTER>
First-round entries should focus primarily on what the tool will do,
and how it will be used: command-line options, input and output file
formats, sketches of Web and GUI interfaces (where appropriate), and
so on. Second-round submissions will then be expected to describe how
it's all going to be implemented.
</LI>

<LI><EM>Who will the judges be? </EM>
<BR>
<B>Need to firm up the list of judges ASAP.</B>
</LI>

<LI><EM>When are the deadlines? </EM>
<BR>
The deadline for first-round submissions is March 31, 2000. The four
best proposals in each category will be announced on April 30,
2000. Full submissions are due on June 1, 2000, and winners will be
announced on June 30, 2000.
</LI>

<LI><EM>Won't prizes discourage co-operation? </EM>
<BR>
We don't know. On the one hand, people might want to hoard their
best ideas; on the other hand, the best designs in both rounds are
going to be published, along with the judges' commentary, and we
will be encouraging participants to pool their efforts. Most of the
money that will be paid out will go to fund implementation, testing,
and documentation; we hope that people will collaborate in the early
stages, and treat the prizes as recognition for their effort, rather
than treating US$10,000 as their retirement fund.
</LI>

</OL>

<H2>Documentation</H2>

<OL>

<LI><EM>What documentation will be produced?</EM>
<BR>
The Software Carpentry project will produce several different kinds of
documentation:

	<OL>

	<LI><EM>Design documentation. </EM>
	<BR>
	As stated above, the best designs in each category will be
	published, along with the judges' commentary. This material
	ought to play the role that music criticism has played in the
	development of music, by giving newcomers (and experienced
	programmers) better insight into how good designers think.
	</LI>

	<LI><EM>User guides. </EM>
	<BR>
	The project will pay for the development of man pages, user
	guides, online help, and all the other documentation needed to
	turn a program into a product.
	</LI>

	<LI><EM>Test suites. </EM>
	<BR>
	The project will also pay for the development of
	industrial-strength test suites for all four tools. These
	suites will be published, both to serve as a starting point
	for other projects and to demonstrate good practice.
	</LI>

	<LI><EM>Case studies. </EM>
	<BR>
	It is often easier to show someone how to do something than to
	explain it to them. The Software Carpentry project will pay
	for case studies that describe how these tools, and (more
	importantly) the working practices they support, have been
	deployed in practice. Checklists, templates for forms, and
	other supporting material can also be submitted.
	</LI>

	</OL>

</LI>

<LI><EM>What format(s) will be used? </EM>
<BR>
The primary format for all documentation will be HTML. The project
will migrate to XML when and as feasible.
</LI>

<LI><EM>What restrictions are there on using the documentation?</EM>
<BR>
Only those that also apply to the software, under the terms of its
Open Source license. You can copy and distribute the documentation in
any form, but only if its author(s) and origin are clearly shown, and
if you include a description of how readers can access the
originals. In particular, the documentation can be reproduced in
books, but only if the authors, origin, and location of the originals
are printed clearly on each page.
</LI>

</OL>

</BODY>
</HTML>

From jack at oratrix.nl  Thu Dec 23 11:24:26 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Thu, 23 Dec 1999 11:24:26 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib 
 getopt.py,1.7,1.8
In-Reply-To: Message by Guido van Rossum <guido@CNRI.Reston.VA.US> ,
	     Wed, 22 Dec 1999 13:23:45 -0500 , <199912221823.NAA16517@eric.cnri.reston.va.us> 
Message-ID: <19991223102426.CCB75370CF2@snelboot.oratrix.nl>

> Vladimir.Marangozov at inrialpes.fr:
> 
> > Yes. Besides, I still think that string-based exceptions are just
> > convenient for quick & dirty, throw-away test scripts.
> 
> They have a hard-to-understand quirk though: the id() of the string is
> used to check rather than its value, so that except "foo" doesn't
> necessarily catch raise "foo"; but due to various optimizations, this
> usually works, and people get bent out of shape when it doesn't.

I sort-of use this feature when I'm debugging: if I want to know what happens 
in an exception that is usually caught somewhere higher up in the call stack I 
simply put quotes around the exception name and the exception will happen 
uncaught. The same trick works for except: clauses.
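
The quirk discussed above comes down to identity versus value.  The
following is only a sketch, assuming a modern Python 3 interpreter
(where string exceptions no longer exist, so plain strings are compared
instead).  sys.intern is used here as a stand-in for the kind of
object sharing the interpreter often performs on its own, which is why
an identity-based match "usually works":

```python
import sys

# Build the two strings at run time so that they are not shared
# compile-time constants.
a = "".join(["f", "o", "o"])
b = "".join(["f", "o", "o"])

same_value = (a == b)              # compares the characters
same_object = (id(a) == id(b))     # compares identity, like the old check

# After interning, equal strings are guaranteed to be one shared object.
shared = sys.intern(a) is sys.intern(b)
```

Whether same_object is true depends on the interpreter; only
same_value and shared are guaranteed.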
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From harri.pasanen at trema.com  Thu Dec 23 12:44:04 1999
From: harri.pasanen at trema.com (Harri Pasanen)
Date: Thu, 23 Dec 1999 13:44:04 +0200
Subject: [Python-Dev] Re: [PSA MEMBERS] Please test new dynamic load behavior
References: <Pine.LNX.4.10.9912221200290.16305-100000@nebula.lyra.org>
Message-ID: <38620B04.7CC64485@trema.com>


Greg Stein wrote:
> 
> Hi all,
> 
> I reorganized Python's dynamic load/import code over the past few days.
> Guido provided some feedback, I did some more mods, and now it is checked
> into CVS. The new loading behavior has been tested on Linux, IRIX, and
> Solaris (and probably Windows by now).
> 

...


What was the motivation behind this modification?

Just curious,

-Harri


From marangoz at python.inrialpes.fr  Thu Dec 23 13:12:40 1999
From: marangoz at python.inrialpes.fr (Vladimir Marangozov)
Date: Thu, 23 Dec 1999 13:12:40 +0100 (CET)
Subject: [Python-Dev] Please test new dynamic load behavior
In-Reply-To: <Pine.LNX.4.10.9912221200290.16305-100000@nebula.lyra.org> from "Greg Stein" at Dec 22, 1999 12:11:43 PM
Message-ID: <199912231212.NAA26572@python.inrialpes.fr>

Greg Stein wrote:
> 
> Hi all,
> 
> I reorganized Python's dynamic load/import code over the past few days.
> Guido provided some feedback, I did some more mods, and now it is checked
> into CVS. The new loading behavior has been tested on Linux, IRIX, and
> Solaris (and probably Windows by now).
> 

Great work Greg!

> Here are some of the platforms that I believe need specific testing:
> 
> - NetBSD, FreeBSD, OpenBSD, ...
> - AIX
> - HP/UX
> - BeOS
> - NeXT
> - Mac
> - OS/2
> - Win16

AFAICT, the AIX version works perfectly okay.

-- 
       Vladimir MARANGOZOV          | Vladimir.Marangozov at inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252


From jim at digicool.com  Thu Dec 23 15:41:23 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 09:41:23 -0500
Subject: [Python-Dev] str(1L) -> '1' ?
Message-ID: <38623493.E6BA6D6F@digicool.com>

In November there was an interesting discussion on comp.lang.python 
about the meaning of __str__ and __repr__.  One tidbit that came out
of this discussion was that __str__ for longs should drop the trailing 
'L'. Was there a decision on this? I'd really like this to happen.

We do a lot of work with RDBMS systems and long integers seem to
come up a lot with these systems (as do other fixed-decimal numbers,
but that's another topic ;).  For example, our latest Sybase and
Oracle support in Zope returns long integers for RDBMS types
like NUMBER(10,0).  The trailing 'L' in the string representation
is causing us some headaches.  This seems also to be an issue when
using the current standard ODBC interface with Oracle, as indicated
in a DB-SIG post today.

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission.  Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.


From guido at CNRI.Reston.VA.US  Thu Dec 23 15:46:58 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 23 Dec 1999 09:46:58 -0500
Subject: [Python-Dev] str(1L) -> '1' ?
In-Reply-To: Your message of "Thu, 23 Dec 1999 09:41:23 EST."
             <38623493.E6BA6D6F@digicool.com> 
References: <38623493.E6BA6D6F@digicool.com> 
Message-ID: <199912231446.JAA22086@eric.cnri.reston.va.us>

[Jim F]
> In November there was an interesting discussion on comp.lang.python 
> about the meaning of __str__ and __repr__.  One tidbit that came out
> of this discussion was that __str__ for longs should drop the trailing 
> 'L'. Was there a decision on this? I'd really like this to happen.

Yes, I'd like it to happen.  I'd also like repr() of a float to return
the full precision (using the "%.17g" sprintf format).
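
As a rough illustration of why "%.17g" is the interesting format: 17
significant digits are enough to reproduce any IEEE double exactly,
while shorter formats can lose information.  The 12-digit format below
is chosen only for contrast; this is a sketch, not a statement of what
any particular release prints.

```python
x = 0.1
y = 1.0 / 3.0

short_x = "%.12g" % x    # 12 significant digits
full_x = "%.17g" % x     # 17 significant digits

# Converting the formatted text back shows which format round-trips.
short_y_round_trips = (float("%.12g" % y) == y)
full_y_round_trips = (float("%.17g" % y) == y)
```

Here short_x is '0.1' and full_x is '0.10000000000000001'; the short
form of y does not convert back to the same float, the 17-digit form
does.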

I haven't done it for lack of time -- feel free to send a patch (don't
forget the disclaimer from http://www.python.org/1.5/bugrelease.html).

We haven't decided yet what to do with the greater topic of that
discussion (or was it a different one?) -- whether the values printed
by typing a bare expression in interactive mode should use str(),
repr(), or str-special-casing-the-snot-out-of-strings().

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim at digicool.com  Thu Dec 23 15:51:14 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 09:51:14 -0500
Subject: [Python-Dev] Fixed-decimal types
Message-ID: <386236E2.F97109D3@digicool.com>

While on the subject of RDBMS systems, a common need is to be able to
work with fixed-decimal data.  I think a standard Python fixed-decimal
type would help to make Python database interfaces a lot more robust.
I even wonder if the Python long type might be hijacked for this purpose
by adding a "scale" that indicates the number of digits to the right
of the decimal point.  For example, an expression like:

  1000000000.2500L

would create a fixed decimal number with a scale of 4.

People have built Python classes for fixed-decimal
types, but when working with RDBMS data, one often deals with
lots of data and efficiency matters.  I also suspect that adding
scale to longs wouldn't be that hard and would be a fairly natural
extension.

In any case, a "standard" (being in the standard library would
be sufficient) fixed-decimal type would probably lead to better
database interfaces that (at least more) properly handled 
fixed-decimal data.
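
The "long plus scale" idea can be sketched in a few lines of Python.
Everything here is hypothetical: the class name Fixed, its parse()
helper, and the rule that addition keeps the larger scale are
assumptions made for illustration, and a string stands in for the
proposed literal syntax.

```python
class Fixed:
    """Hypothetical fixed-decimal value: an integer plus a decimal scale."""

    def __init__(self, digits, scale):
        self.digits = digits   # all digits as one integer, e.g. 10000000002500
        self.scale = scale     # number of digits right of the decimal point

    @classmethod
    def parse(cls, text):
        # "1000000000.2500" -> digits=10000000002500, scale=4
        whole, _, frac = text.partition(".")
        return cls(int(whole + frac), len(frac))

    def __add__(self, other):
        # Align both operands to the larger scale, then add the integers.
        scale = max(self.scale, other.scale)
        a = self.digits * 10 ** (scale - self.scale)
        b = other.digits * 10 ** (scale - other.scale)
        return Fixed(a + b, scale)

    def __str__(self):
        sign = "-" if self.digits < 0 else ""
        text = str(abs(self.digits)).rjust(self.scale + 1, "0")
        if self.scale == 0:
            return sign + text
        return sign + text[:-self.scale] + "." + text[-self.scale:]


amount = Fixed.parse("1000000000.2500")
total = Fixed.parse("1.5") + Fixed.parse("0.25")
```

All arithmetic stays in integers, so no binary rounding error can
creep in; only the position of the decimal point is tracked
separately.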

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    



From guido at CNRI.Reston.VA.US  Thu Dec 23 15:56:33 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 23 Dec 1999 09:56:33 -0500
Subject: [Python-Dev] Fixed-decimal types
In-Reply-To: Your message of "Thu, 23 Dec 1999 09:51:14 EST."
             <386236E2.F97109D3@digicool.com> 
References: <386236E2.F97109D3@digicool.com> 
Message-ID: <199912231456.JAA22134@eric.cnri.reston.va.us>

What would be the scale of the product of two fixed-decimal numbers?
E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to 4.00L?  There are
arguments for either.  Same question for division (harder, I think).
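
Both answers can be seen side by side with the decimal module found
in the standard library of later Python versions.  This is a sketch of
the two candidate behaviors, not of the dd.ddL notation proposed in
this thread:

```python
from decimal import Decimal

a = Decimal("2.00")
b = Decimal("2.00")

# Multiplication adds the scales of the operands: 2 + 2 = 4 places.
product = a * b

# Cutting the result back to two places gives the other candidate answer.
trimmed = product.quantize(Decimal("0.01"))

# Division has no natural scale; the result is limited by the context
# precision rather than by the scales of the operands.
quotient = Decimal("2.00") / Decimal("3.00")
```

str(product) is '4.0000' and str(trimmed) is '4.00'.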

I like the idea of using the dd.ddL notation for this.

I have no time to implement it but would not be unwilling to accept
patches.  They would have to be accompanied with a wet signature, see
http://www.python.org/1.5/wetsign.html.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim at digicool.com  Thu Dec 23 16:00:25 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 10:00:25 -0500
Subject: [Python-Dev] re: Open Source design competition / Python / software 
 tools
References: <Pine.LNX.4.10.9912222242070.4839-200000@akbar.nevex.com>
Message-ID: <38623909.CDF41014@digicool.com>

gvwilson at nevex.com wrote:
> 
> Hi, folks.  I hope you don't mind another mail out of the blue, but I got
> notice on Saturday that the Department of Energy is giving me $860K over
> two years to support development of easier-to-use software engineering
> tools.  All of the work will be Open Source, and will be done in Python,
> with a strong emphasis on design, testing, and documentation.  The
> project's long-term objective is to encourage scientists and engineers to
> treat programs in the same way as they do other experiments, i.e. to
> calibrate, test, peer review, and so on.
> 
> To kick-start things, we're going to be holding a two-round design
> competition.  Anyone (individual or team, professional or student) can
> submit a short entry for the first round; the judges will pick four
> candidates to go forward in each of four categories, and those
> individuals or teams will be asked to submit full entries. The four
> categories are:
> 
> * an issue tracking system to replace Gnats and Bugzilla;
> 
> * a build system to replace make;
> 
> * a platform inspection and configuration system to replace autoconf;
>   and
> 
> * a testing framework to replace XUnit, Expect, and DejaGnu.
> 
> Would you be interested in participating in any way

Are these categories fixed? I see a very strong need for an 
open-source UML modeling tool. UML is extremely powerful, but current
UML tools largely suck and are very expensive.  We are contemplating
launching an open-source development effort to build UML modeling tools
using Zope or the Zope object database as a repository. A contest
like this could help to kick-start this effort, but tools to automate
requirements and design seem to be missing. This is odd, considering that
up-front activities like requirements and design have the largest impact
on software-engineering project success.

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    



From captainrobbo at yahoo.com  Thu Dec 23 16:13:22 1999
From: captainrobbo at yahoo.com (=?iso-8859-1?q?Andy=20Robinson?=)
Date: Thu, 23 Dec 1999 07:13:22 -0800 (PST)
Subject: [Python-Dev] Fixed-decimal types
Message-ID: <19991223151322.5698.qmail@web604.mail.yahoo.com>

--- Guido van Rossum <guido at CNRI.Reston.VA.US> wrote:
> What would be scale of the product of two
> fixed-decimal numbers?
> E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to
> 4.00L?  There are
> arguments for either.  Same question for division
> (harder, I think).
Most commonly one is trying to avoid rounding errors
when dealing with money - a few cents rounding error
tends to result in a few billable hours with the
accountants at the end of the year!

SQL dialects and type-safe languages would make you
specify the precision of the variable to be assigned,
so the issue does not arise for other languages.  

For the work I do, simply taking the precision of the
most precise input (4.00L) would do the trick, but your
answer (4.0000L) is purer.  We should provide a
rounding function, and in practice anyone using such a
function would round (or floor, or ceiling) to get to
the desired precision immediately.

I'm not sure on division either but I'm sure there are
precedents to look at.

On the subject of adding new types to the standard
library, what are the plans on dates and times?  Would
a cut-down mxDateTime ever be considered?  It is fully
Open Source (unlike mxODBC) and was designed for the
DBAPI.

Regards,

Andy

=====
Andy Robinson
Robinson Analytics Ltd.
------------------
My opinions are the official policy of Robinson Analytics Ltd.
They just vary from day to day.



From guido at CNRI.Reston.VA.US  Thu Dec 23 16:23:43 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 23 Dec 1999 10:23:43 -0500
Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types)
In-Reply-To: Your message of "Thu, 23 Dec 1999 07:13:22 PST."
             <19991223151322.5698.qmail@web604.mail.yahoo.com> 
References: <19991223151322.5698.qmail@web604.mail.yahoo.com> 
Message-ID: <199912231523.KAA22232@eric.cnri.reston.va.us>

> On the subject of adding new types to the standard
> library, what are the plans on dates and times?  Would
> a cut-down mxDateTime ever be considered?  It is fully
> Open Source (unlike mxODBC) and was designed for the
> DBAPI.

I don't know much about date/time types, or about mxDateTime.
My intuition is that there are too many ways to do it, and that being
compatible with commercial databases may not be the right way to do it
for core Python.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake at acm.org  Thu Dec 23 16:27:59 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 23 Dec 1999 10:27:59 -0500 (EST)
Subject: [Python-Dev] str(1L) -> '1' ?
In-Reply-To: <38623493.E6BA6D6F@digicool.com>
References: <38623493.E6BA6D6F@digicool.com>
Message-ID: <14434.16255.58344.646524@weyr.cnri.reston.va.us>

Jim Fulton writes:
 > In November there was an interesting discussion on comp.lang.python 
 > about the meaning of __str__ and __repr__.  One tidbit that came out
 > of this discussion was that __str__ for longs should drop the trailing 
 > 'L'. Was there a decision on this? I'd really like this to happen.

  I liked that result as well, and thought about it just the other
day.  Luckily, you sent a note this morning and made me think about
it again.  I'll have something checked into CVS shortly.  ;)


  -Fred

--
Fred L. Drake, Jr.	  <fdrake at acm.org>
Corporation for National Research Initiatives


From Mike.Da.Silva at uk.fid-intl.com  Thu Dec 23 17:30:07 1999
From: Mike.Da.Silva at uk.fid-intl.com (Da Silva, Mike)
Date: Thu, 23 Dec 1999 16:30:07 -0000
Subject: [Python-Dev] Fixed Decimal types
Message-ID: <DBF3B37F7BF1D111B2A10000F6B14B1FDDAF86@ukhil704nts.hld.uk.fid-intl.com>

	Andy Robinson wrote:
		For the work I do, simply taking the precision of the
		most precise input (4.00L) would do the trick, but your
		answer (4.0000L) is purer.  We should provide a
		rounding function, and in practice anyone using such a
		function would round (or floor, or ceiling) to get to
		the desired precision immediately.

		I'm not sure on division either but I'm sure there are
		precedents to look at.

	The AS400 provides a useful example of the right way to do scaled
decimals.

	In the RPG programming language, all internal calculations (i.e.
multiplication, division) are performed to the maximum precision of the
intermediate result (in the multiplication example above, the intermediate
result would be 4.0000L).  When the intermediate result is assigned to the
target scaled decimal number, the decimal precision is automatically
extended or truncated to fit the target precision.  One extra wrinkle in all
of this is the option to "half-adjust" the intermediate value on assignment;
that is, to apply automatic 5/4 rounding to the precision of the target.

	So, if the target field is defined as numeric(4,2), the result will
be 4.00L.

	These are probably the kind of semantics that a scaled decimal type
would require in Python also; i.e. allow unlimited precision in intermediate
calculations, with a sensible set of rules for assignment to a variable of
different scale and precision.

	However, unlike RPG, we should probably ensure that attempts to
overflow or underflow the scale result in NaN or Overflow conditions, rather
than assuming the user is right and losing the significant digits.
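
The assignment rules described above might look roughly like this in
Python.  This is a sketch under stated assumptions: assign() and its
parameters are hypothetical names, the decimal module from later
Python versions stands in for the proposed type, and the behavior
follows the description in this message rather than any RPG
reference.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def assign(value, precision, scale, half_adjust=False):
    """Fit value into a hypothetical numeric(precision, scale) target."""
    # Truncate by default; "half-adjust" means 5/4 rounding instead.
    rounding = ROUND_HALF_UP if half_adjust else ROUND_DOWN
    quantum = Decimal(1).scaleb(-scale)           # scale 2 -> 0.01
    result = value.quantize(quantum, rounding=rounding)
    # Refuse to drop significant digits silently.
    limit = Decimal(10) ** (precision - scale)    # first value that no longer fits
    if abs(result) >= limit:
        raise OverflowError(
            "%s does not fit numeric(%d,%d)" % (result, precision, scale))
    return result


intermediate = Decimal("2.00") * Decimal("2.00")   # 4.0000
stored = assign(intermediate, 4, 2)                # 4.00
truncated = assign(Decimal("2.345"), 4, 2)
adjusted = assign(Decimal("2.345"), 4, 2, half_adjust=True)
```

With these inputs, truncated is 2.34 and adjusted is 2.35, while a
value such as 123.45 raises OverflowError for numeric(4,2) instead of
losing its leading digit.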

	Regards,
	Mike da Silva


From jim at digicool.com  Thu Dec 23 17:37:10 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 11:37:10 -0500
Subject: [Python-Dev] Fixed-decimal types
References: <386236E2.F97109D3@digicool.com> <199912231456.JAA22134@eric.cnri.reston.va.us>
Message-ID: <38624FB6.ED903F@digicool.com>

Guido van Rossum wrote:
> 
> What would be scale of the product of two fixed-decimal numbers?
> E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to 4.00L?  There are
> arguments for either.  Same question for division (harder, I think).

I'd be inclined to start by doing some research to see if some standard
(SQL?) defines this somewhere.  It would be nice if someone has already 
done the requirements work for us. :)

> I like the idea of using the dd.ddL notation for this.
> 
> I have no time to implement

Me neither.

> it but would not be unwilling to accept patches. 

Cool.  If no one else volunteers, then I'll try to find a way
to get this done (not necessarily by me). I think it is pretty
important.

> They would have to be accompanied with a wet signature, see
> http://www.python.org/1.5/wetsign.html.

Yup.

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission.  Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.


From captainrobbo at yahoo.com  Thu Dec 23 17:38:50 1999
From: captainrobbo at yahoo.com (=?iso-8859-1?q?Andy=20Robinson?=)
Date: Thu, 23 Dec 1999 08:38:50 -0800 (PST)
Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types)
Message-ID: <19991223163850.15619.qmail@web604.mail.yahoo.com>

Sorry, should have replied to the list...

--- Andy Robinson <captainrobbo at yahoo.com> wrote:
> Date: Thu, 23 Dec 1999 08:37:18 -0800 (PST)
> From: Andy Robinson <captainrobbo at yahoo.com>
> Reply-to: andy at robanal.demon.co.uk
> Subject: Re: [Python-Dev] Date and timetypes (was:
> Fixed-decimal types)
> To: Guido van Rossum <guido at CNRI.Reston.VA.US>
> 
> --- Guido van Rossum <guido at CNRI.Reston.VA.US> wrote:
> > I don't know much about date/time types, or about mxDateTime.
> > My intuition is that there are too many ways to do it, and that
> > being compatible with commercial databases may not be the right
> > way to do it for core Python.
> 
> OK.  Let me rephrase it.  Say we form a consensus on 'the right
> way'.  Are you amenable to some solution which goes back before
> 1970 and after 2038 going into the standard library?
> 
> And does your answer change if it involves some compiled code as
> well?
> 
> I mention mxDateTime because it was agreed by a Python SIG, is
> mature and stable, and I find it very useful.  And the core type
> is pretty small - much of the helper stuff in the package now
> could be kept separate from the main Python distribution.
> 
> - Andy
> 
> 
> =====
> Andy Robinson
> Robinson Analytics Ltd.
> ------------------
> My opinions are the official policy of Robinson
> Analytics Ltd.
> They just vary from day to day.
> 
>
_________________________________________________________
> Do You Yahoo!?
> Get your free @yahoo.com address at
> http://mail.yahoo.com
> 


=====
Andy Robinson
Robinson Analytics Ltd.
------------------
My opinions are the official policy of Robinson Analytics Ltd.
They just vary from day to day.



From guido at CNRI.Reston.VA.US  Thu Dec 23 17:42:33 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 23 Dec 1999 11:42:33 -0500
Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types)
In-Reply-To: Your message of "Thu, 23 Dec 1999 08:38:50 PST."
             <19991223163850.15619.qmail@web604.mail.yahoo.com> 
References: <19991223163850.15619.qmail@web604.mail.yahoo.com> 
Message-ID: <199912231642.LAA22598@eric.cnri.reston.va.us>

> > OK.  Let me rephrase it.  Say we form a consensus on 'the right
> > way'.  Are you amenable to some solution which goes back before
> > 1970 and after 2038 going into the standard library?

No problem.

> > And does your answer change if it involves some
> > compiled code as well?

I'd rather not.
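
For context, the 1970 and 2038 limits come from counting seconds since
the Unix epoch in a signed 32-bit integer.  A small sketch using the
datetime module of present-day Python (which did not exist when this
thread was written, so it is an illustration only, not the solution
being discussed):

```python
from datetime import datetime, timedelta, timezone

epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
limit = epoch + timedelta(seconds=2**31 - 1)   # largest signed 32-bit count
print(limit.isoformat())                       # 2038-01-19T03:14:07+00:00

# A calendar-based type treats dates outside that window as ordinary values.
span = datetime(2100, 1, 1) - datetime(1900, 1, 1)
print(span.days)                               # 73049
```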

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip at mojam.com  Thu Dec 23 18:05:52 1999
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 23 Dec 1999 11:05:52 -0600 (CST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: <199912212305.SAA13722@eric.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
	<14432.594.33416.600794@weyr.cnri.reston.va.us>
	<199912212305.SAA13722@eric.cnri.reston.va.us>
Message-ID: <14434.22128.639699.738932@dolphin.mojam.com>

    Guido> (The next step would be to outlaw raise with a string argument; I
    Guido> think I can't make that for 1.6.  But it would be a good idea to
    Guido> scan the standard library for string exceptions and convert all
    Guido> of them.)

Agreed.  I know Zope uses (at least, my Zope-using code uses) stuff like 

    raise 'Redirect', url

to map names onto HTTP response codes.  Makes it easier on people to
remember names instead of numeric codes.  I suspect it will take the Zopers
awhile to convert to using class-based exceptions if they haven't already.
(For all I know I may be using a deprecated feature.)
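
A class-based equivalent might look roughly like the sketch below.  It
is written in present-day Python syntax, and the HTTPError, Redirect,
NotFound and handle names are hypothetical; this is not Zope's actual
API, only an illustration of keeping memorable names while still
carrying the numeric code.

```python
class HTTPError(Exception):
    """Hypothetical base class; `status` holds the HTTP response code."""
    status = 500


class Redirect(HTTPError):
    status = 302

    def __init__(self, url):
        super().__init__(url)
        self.url = url


class NotFound(HTTPError):
    status = 404


def handle(view):
    """Run `view` and translate a named exception into (status, detail)."""
    try:
        return 200, view()
    except HTTPError as exc:
        return exc.status, str(exc)


def moved():
    raise Redirect("http://www.python.org/")


print(handle(moved))    # (302, 'http://www.python.org/')
```

Callers still write a name (raise Redirect(url)) rather than a number,
and one except clause on the base class catches the whole family.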

Skip


From gvwilson at nevex.com  Thu Dec 23 18:24:05 1999
From: gvwilson at nevex.com (gvwilson at nevex.com)
Date: Thu, 23 Dec 1999 12:24:05 -0500 (EST)
Subject: [Python-Dev] re: Open Source design competition / Python /
 software  tools
In-Reply-To: <38623909.CDF41014@digicool.com>
Message-ID: <Pine.LNX.4.10.9912231219380.12516-100000@akbar.nevex.com>

Hi, everyone.  I'm sending my reply to Jim's message to the whole
python-dev list; I'll send follow-ups to individuals if people would
prefer.

> > * an issue tracking system to replace Gnats and Bugzilla;
> > 
> > * a build system to replace make;
> > 
> > * a platform inspection and configuration system to replace autoconf;
> >   and
> > 
> > * a testing framework to replace XUnit, Expect, and DejaGnu.

> Jim Fulton asked:
> Are these categories fixed?

For the first round, yes --- I have to prove that this model can solve
small problems before I'll be given the funding to tackle larger ones, and
I think that a UML modeling tool is definitely "large" :-).  I also have
to demonstrate uptake, and I think more people will adopt a sane
replacement for Autoconf in the next 18 months than would adopt a UML
modeler.  However, decent Open Source CASE tools are very (very) high on
my personal list --- if this works, I'd like to tackle them (along with
providing support for DDD, and a few other things like that).

Greg


From gstein at lyra.org  Thu Dec 23 19:26:44 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 23 Dec 1999 10:26:44 -0800 (PST)
Subject: [Python-Dev] Re: Please test new dynamic load behavior
In-Reply-To: <38620B04.7CC64485@trema.com>
Message-ID: <Pine.LNX.4.10.9912231022280.16305-100000@nebula.lyra.org>

On Thu, 23 Dec 1999, Harri Pasanen wrote:
> Greg Stein wrote:
> > Hi all,
> > 
> > I reorganized Python's dynamic load/import code over the past few days.
> > Guido provided some feedback, I did some more mods, and now it is checked
> > into CVS. The new loading behavior has been tested on Linux, IRIX, and
> > Solaris (and probably Windows by now).
> 
> ...
> 
> What was the motivation behind this modification?

Harri -

With the new code structure, it is much easier to maintain Python's
loading code.

Each platform has its own file (e.g. dynload_aix.c) rather than being all
jammed together into importdl.c. This isn't a huge win by itself, but does
increase readability/maintainability. The big improvement, however, is
when you are adding support for new platforms or loading mechanisms. A new
dynload_*.c can be written and one line added to configure.in, and you're
done. No need to make importdl.c even uglier.  (actually, importdl.c no
longer contains *any* platform specific code; it has all been moved to the
dynload_*.c files)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From jim at digicool.com  Thu Dec 23 20:39:37 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 14:39:37 -0500
Subject: [Python-Dev] Fixed-decimal types
References: <386236E2.F97109D3@digicool.com> <199912231456.JAA22134@eric.cnri.reston.va.us> <38624FB6.ED903F@digicool.com>
Message-ID: <38627A79.BF379672@digicool.com>

Jim Fulton wrote:
> 
> Guido van Rossum wrote:
> >
> > What would be scale of the product of two fixed-decimal numbers?
> > E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to 4.00L?  There are
> > arguments for either.  Same question for division (harder, I think).
> 
> I'd be inclined to start by doing some research to see if some standard
> (SQL?) defines this somewhere.  It would be nice if someone has already
> done the requirements work for us. :)

Here is what the book "SQL-99 Complete, Really" says that the SQL
standard says:

  - for addition and subtraction of two "exact" (fixed-decimal)
    numbers, the result has the maximum of the scales.

  - for multiplication of two "exact" (fixed-decimal)
    numbers, the result has the sum of the scales.

  - punts on division

  - for addition, subtraction, multiplication or division
    between "exact" (fixed point) and "approximate" (floating point)
    yields an approximate result.  This means that fixed-decimal
    coerces to float.
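
The scale rules listed above can be sketched with a toy representation.
The Fixed tuple and the add, mul, to_float and show helpers are
hypothetical names for illustration only; a value is stored as an
unscaled integer plus a scale, and division is left out, as in the
summary above.

```python
from collections import namedtuple

# Hypothetical representation: value == unscaled / 10**scale
Fixed = namedtuple("Fixed", "unscaled scale")


def rescale(x, scale):
    """Return x with a larger (or equal) scale; the value is unchanged."""
    return Fixed(x.unscaled * 10 ** (scale - x.scale), scale)


def add(a, b):
    """Addition: the result has the maximum of the two scales."""
    scale = max(a.scale, b.scale)
    return Fixed(rescale(a, scale).unscaled + rescale(b, scale).unscaled, scale)


def mul(a, b):
    """Multiplication: the result has the sum of the two scales."""
    return Fixed(a.unscaled * b.unscaled, a.scale + b.scale)


def to_float(x):
    """Mixed exact/approximate arithmetic coerces the exact operand to float."""
    return x.unscaled / 10 ** x.scale


def show(x):
    """Render with exactly x.scale decimal places (non-negative values only)."""
    digits = str(x.unscaled).rjust(x.scale + 1, "0")
    if x.scale == 0:
        return digits
    return digits[:-x.scale] + "." + digits[-x.scale:]


print(show(add(Fixed(31, 1), Fixed(201, 2))))    # 5.11
print(show(mul(Fixed(200, 2), Fixed(200, 2))))   # 4.0000
print(to_float(Fixed(200, 2)) * 1.5)             # 3.0
```

Subtraction would follow the same pattern as add, with the difference
of the rescaled integers.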

I'm curious to see who else chips in with examples from other systems.

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    



From jim at digicool.com  Thu Dec 23 20:43:41 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 14:43:41 -0500
Subject: [Python-Dev] Fixed Decimal types
References: <DBF3B37F7BF1D111B2A10000F6B14B1FDDAF86@ukhil704nts.hld.uk.fid-intl.com>
Message-ID: <38627B6D.447A9553@digicool.com>

"Da Silva, Mike" wrote:
> 
>         Andy Robinson wrote:
>                 For the work I do, simply taking the precision of the
>                 most precise input (4.00L)would do the trick, but your
>                 answer (4.0000L) is purer.  We should provide a
>                 rounding function, and in practice anyone using such a
>                 function would round (or floor, or ceiling) to get to
>                 the desired precision immediately.
> 
>                 I'm not sure on division either but I'm sure there are
>                 precedents to look at.
> 
>         The AS400 provides a useful example of the right way to do scaled
> decimals.
> 
>         In the RPG programming language, all internal calculations (i.e.
> multiplication, division) are performed to the maximum precision of the
> intermediate result (in the multiplication example below), the intermediate
> result would be 4.0000L.  When the intermediate result is assigned to the
> target scaled decimal number, the decimal precision is automatically
> extended or truncated to fit the target precision.  One extra wrinkle in all
> of this is the option to "half-adjust" the intermediate value on assignment;
> that is to apply automatic 5/4 rounding to the precision of the target.

Yee ha! This is great input. Anyone have any other examples of what
any other systems do? Anyone got a PL/I manual handy. ;)

>         So, if the target field is defined as numeric(4,2), the result will
> be 4.00L.

Since Python doesn't have typed variables, this is not an issue
internally, but would be an issue when binding to external databases.

>         These are probably the kind of semantics that a scaled decimal type
> would require in Python also; i.e. allow unlimited precision in intermediate
> calculations, with a sensible set of rules for assignment to a variable of
> different scale and precision.
> 
>         However, unlike RPG, we should probably ensure that attempts to
> overflow or underflow the scale result in NaN or Overflow conditions, rather
> than assuming the user is right and losing the significant digits.

Since this would be based on infinite-precision numbers, I don't
think that this would be an issue.

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    



From guido at CNRI.Reston.VA.US  Thu Dec 23 20:44:36 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 23 Dec 1999 14:44:36 -0500
Subject: [Python-Dev] Fixed-decimal types
In-Reply-To: Your message of "Thu, 23 Dec 1999 14:39:37 EST."
             <38627A79.BF379672@digicool.com> 
References: <386236E2.F97109D3@digicool.com> <199912231456.JAA22134@eric.cnri.reston.va.us> <38624FB6.ED903F@digicool.com>  
            <38627A79.BF379672@digicool.com> 
Message-ID: <199912231944.OAA23337@eric.cnri.reston.va.us>

Jim Fulton wrote:

>   - for addition and subtraction of two "exact" (fixed-decimal)
>     numbers, the result has the maximum of the scales.

One could argue that this is incorrect: if "3.1" means that I know the
value to one decimal of precision, and "2.01" means that I know that
value to two decimals of precision, stating the result of their sum as
"5.11" suggests that I know the result to two decimals of precision,
which is of course false: because I only knew one decimal of precision
for one of the operands, I only know (at most!) one decimal of
precision for the result.
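
To make the two readings concrete, here is a small sketch using the
decimal module of present-day Python.  The add_significant helper is a
hypothetical name, and it illustrates the "least precise operand wins"
interpretation only; it is not a proposal for the type's behavior.

```python
from decimal import Decimal


def add_significant(a, b):
    """Add, then keep only as many decimals as the less precise operand."""
    # For finite values, -exponent is the number of decimal places.
    places = min(-a.as_tuple().exponent, -b.as_tuple().exponent)
    return (a + b).quantize(Decimal(1).scaleb(-places))


print(Decimal("3.1") + Decimal("2.01"))                  # 5.11 (max of scales)
print(add_significant(Decimal("3.1"), Decimal("2.01")))  # 5.1  (min of scales)
```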

Not arguing for this interpretation, just indicating that doing fixed
precision arithmetic right is hard.  I'm waiting for Tim Peters'
contribution, but he's on vacation so it may be a while.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm at hypernet.com  Thu Dec 23 21:48:56 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Thu, 23 Dec 1999 15:48:56 -0500
Subject: [Python-Dev] Fixed Decimal types
In-Reply-To: <38627B6D.447A9553@digicool.com>
Message-ID: <1266141247-31971518@hypernet.com>

Jim Fulton wrote:
> "Da Silva, Mike" wrote:

[AS400 RPG rules...]

> Yee ha! This is great input. Anyone have any other examples of
> what any other systems do? Anyone got a PL/I manual handy. ;)


From jim at digicool.com  Thu Dec 23 23:18:37 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 17:18:37 -0500
Subject: [Python-Dev] re: Open Source design competition / Python /software  
 tools
References: <Pine.LNX.4.10.9912231219380.12516-100000@akbar.nevex.com>
Message-ID: <38629FBD.3B8F47D4@digicool.com>

gvwilson at nevex.com wrote:
> 
> Hi, everyone.  I'm sending my reply to Jim's message to the whole
> python-dev list; I'll send follow-ups to individuals if people would
> prefer.
> 
> > > * an issue tracking system to replace Gnats and Bugzilla;
> > >
> > > * a build system to replace make;
> > >
> > > * a platform inspection and configuration system to replace autoconf;
> > >   and
> > >
> > > * a testing framework to replace XUnit, Expect, and DejaGnu.
> 
> > Jim Fulton asked:
> > Are these categories fixed?
> 
> For the first round, yes 

OK.

>--- I have to prove that this model can solve
> small problems before I'll be given the funding to tackle larger ones, and
> I think that a UML modeling tool is definitely "large" :-). 

Well, since you gave rational ..... :)

<speech>
Isn't the Open Source community especially good at large problems?
Note that I'm thinking more in terms of an open source UML community
of tools, based around an existing repository rather than on a single 
monolithic tool.  I envision a community of diagramming and other small
tools orbiting Zope or ZODB. The hardest part of a UML tool is the
repository, and I think we've mostly got that.

I think that what the Open Source community desperately needs 
are tools for managing and sharing the most important artifacts
in the development process.
</speech>

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    



From gstein at lyra.org  Fri Dec 24 01:09:29 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 23 Dec 1999 16:09:29 -0800 (PST)
Subject: [Python-Dev] re: Open Source design competition / Python /software
   tools
In-Reply-To: <38629FBD.3B8F47D4@digicool.com>
Message-ID: <Pine.LNX.4.10.9912231605030.412-100000@nebula.lyra.org>

On Thu, 23 Dec 1999, Jim Fulton wrote:
> gvwilson at nevex.com wrote:
>...
> >--- I have to prove that this model can solve
> > small problems before I'll be given the funding to tackle larger ones, and
> > I think that a UML modeling tool is definitely "large" :-). 
> 
> Well, since you gave rational ..... :)
> 
> <speech>
> Isn't the Open Source community especially good at large problems?

Very true, I agree, but part of Greg's problem is "proving" that to the
DoE. Somebody has said those four problems are sufficient to do so,
(probably) because they are constrained enough to allow completion
within a specified timeframe.

> Note that I'm thinking more in terms of an open source UML community
> of tools, based around an existing repository rather than on a single 
> monolithic tool.  I envision a community of diagramming and other small
> tools orbiting Zope or ZODB. The hardest part of a UML tool is the
> repository, and I think we've mostly got that.

Greg's proposal is quite specific. "A community" isn't, so it might not
help to create a proof to the DoE (otherwise, they could look at the Zope
community, or other communities!).

Jim: there isn't anything stopping or impeding the creation of an Open
Source community for UML modeling. This DoE competition won't affect
that...

Happy Holidays,
-g

-- 
Greg Stein, http://www.lyra.org/


From jim at digicool.com  Fri Dec 24 01:27:53 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 19:27:53 -0500
Subject: [Python-Dev] re: Open Source design competition / Python 
 /softwaretools
References: <Pine.LNX.4.10.9912231605030.412-100000@nebula.lyra.org>
Message-ID: <3862BE09.9AF62090@digicool.com>

Greg Stein wrote:
> 
(snip)
> Jim: there isn't anything stopping or impeding the creation of an Open
> Source community for UML modeling.

Of course not.

> This DoE competition won't affect that...

Perhaps it could help it.
 
> Happy Holidays,

You too.

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    



From ping at lfw.org  Fri Dec 24 09:55:28 1999
From: ping at lfw.org (Ka-Ping Yee)
Date: Fri, 24 Dec 1999 00:55:28 -0800 (PST)
Subject: [Python-Dev] re: Open Source design competition / Python /
 software tools
In-Reply-To: <Pine.LNX.4.10.9912222242070.4839-200000@akbar.nevex.com>
Message-ID: <Pine.LNX.4.10.9912240049360.655-100000@skuld.lfw.org>

On Wed, 22 Dec 1999 gvwilson at nevex.com wrote:
> To kick-start things, we're going to be holding a two-round design
> competition.  Anyone (individual or team, professional or student) can
> submit a short entry for the first round; the judges will pick four
> candidates to go forward in each of four categories, and those
> individuals or teams will be asked to submit full entries. The four
> categories are:
> 
> * an issue tracking system to replace Gnats and Bugzilla;

Hi there.

At ILM we've been using a system that i hacked up quickly in Python
called "Roundup".  It has a number of interesting properties that
have made it really useful to us, and arguably better than any of
the existing open-source bug-tracking things out there that i know
of.  It is not just a Web app; it lives between the Web and e-mail,
because we do so much of our communication that way.

For example, each request item gets its own virtual mailing list,
updated on the fly without the need for explicit subscription (if
you cc: somebody while discussing the bug, they get subscribed).
Empirically i've discovered that unsubscription is actually
unnecessary (!) because conversation will stop on a topic when it
gets resolved or when it ceases to be interesting.  These are
fine-grained discussion lists on a per-topic level.
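
The subscription rule described above could be sketched like this.  The
Issue class and its receive method are hypothetical names for
illustration; this is not Roundup's actual code, and the addresses are
made up.

```python
class Issue:
    """Hypothetical issue record with an implicit, per-issue mailing list."""

    def __init__(self, title):
        self.title = title
        self.subscribers = set()
        self.messages = []

    def receive(self, sender, body, cc=()):
        """Record a message and return the addresses it should be sent to."""
        # Anyone who writes or is cc'ed joins the list; nobody ever leaves.
        self.subscribers.add(sender)
        self.subscribers.update(cc)
        self.messages.append((sender, body))
        return sorted(self.subscribers - {sender})


issue = Issue("crash on startup")
first = issue.receive("ping@example.org", "It dies in main().",
                      cc=["amy@example.org"])
second = issue.receive("amy@example.org", "Fixed in CVS.")
print(first)     # ['amy@example.org']
print(second)    # ['ping@example.org']
```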

This is just to let you know i'm interested.  I'm currently asking
for permission to open-source Roundup; if it can't be done, or
doesn't happen quickly enough, i'll just have to take a weekend and
rewrite the thing.  There were a few things i wanted to fix anyway.


-- ?!ng

"You should either succeed gloriously or fail miserably.  Just getting
by is the worst thing you can do."
    -- Larry Smith


From marangoz at python.inrialpes.fr  Fri Dec 24 13:07:05 1999
From: marangoz at python.inrialpes.fr (Vladimir Marangozov)
Date: Fri, 24 Dec 1999 13:07:05 +0100 (CET)
Subject: [Python-Dev] Exceptions
In-Reply-To: <199912221823.NAA16517@eric.cnri.reston.va.us> from "Guido van Rossum" at Dec 22, 1999 01:23:45 PM
Message-ID: <199912241207.NAA18783@python.inrialpes.fr>

Guido van Rossum wrote:
> 
> Vladimir.Marangozov at inrialpes.fr:
> 
> > Yes. Besides, I still think that string-based exceptions are just
> > convenient for quick & dirty, throw-away test scripts.
> 
> They have a hard-to-understand quirk though: the id() of the string is
> used to check rather than its value, so that except "foo" doesn't
> necessarily catch raise "foo"; but due to various optimization, this
> usually works, and people get bent out of shape when it doesn't.

Which brings up two important questions:

1. In the long run, which one is better -- compare and check exceptions by
   reference (by name) or by value?

   (currently, this is done by reference on predefined object types:
    strings, classes or instances)

   I'd say, exceptions have to be compared (caught) by value, i.e. use
   "e1 == e2" instead of "e1 is e2".

2. Should we limit the exception "types"?

   I'd say, no. My Pythonic view of things says that we raise "objects",
   be they classes, instances, strings or, why not, ints.

   However, if one wants to put some order in the "unordered set" of exceptions
   s/he uses, then classes are the way to do it, because classes were given some
   nice properties, like inheritance, that make it possible to group and logically
   organize the objects we throw and catch as exceptions (+ other bonus properties
   coming from classes).

   Note that conceptually, when we say "strings and ints", we have in mind
   "string instances and int instances", whose "classes" are written in C.
   Once there are String and Int classes of some sort as first class objects,
   we'll fall back to the terminology: Exceptions can be classes or instances.

If point 1 and (optionally) point 2 are implemented, the hard-to-understand quirk
wouldn't be an issue and string-based exceptions would have a legitimate reason
to stay and live.
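
The difference between the two comparisons in question 1, and the
grouping that classes provide in question 2, can be illustrated with a
short sketch in present-day Python.  AppError and NotFound are
hypothetical names, and the identity result depends on the
implementation, so it is printed without a stated outcome.

```python
a = "foo"
b = "".join(["f", "oo"])     # equal value, built at run time

print(a == b)                # True: comparison by value
print(a is b)                # comparison by identity; implementation dependent


class AppError(Exception):
    """Hypothetical base class used to group related errors."""


class NotFound(AppError):
    code = 404


try:
    raise NotFound("no such page")
except AppError as exc:      # matched through the class hierarchy
    caught = (type(exc).__name__, exc.code)

print(caught)                # ('NotFound', 404)
```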

> Since you have to give your exception a name, how hard is it to say
> 
> class MyError(Exception): pass
> 
> rather than
> 
> MyError = "MyError"
> 
> ?

You know what I think about "names"...  I may have defined my exception conventions
and be interested in catching an exception named 404, implying that "a 404 bobo"
occurred deeply in my code ("deeply in my code" meaning for example: database 4,
service 0, customer group 4, or just a standard HTTP "Code 404 - Not Found".)

I am pushing this to the extreme to catapult your thoughts into the next millennium :)
and to emphasize the importance of discussing and answering objectively questions
1) and 2) above.

-- 
       Vladimir MARANGOZOV          | Vladimir.Marangozov at inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252


From mal at lemburg.com  Fri Dec 24 12:03:37 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 24 Dec 1999 12:03:37 +0100
Subject: [Python-Dev] str(1L) -> '1' ?
References: <38623493.E6BA6D6F@digicool.com> <199912231446.JAA22086@eric.cnri.reston.va.us>
Message-ID: <38635309.2AEFF18D@lemburg.com>

Guido van Rossum wrote:
> 
> [Jim F]
> > In November there was an interesting discussion on comp.lang.python
> > about the meaning of __str__ and __repr__.  One tidbit that came out
> > of this discussion was that __str__ for longs should drop the trailing
> > 'L'. Was there a decision on this? I'd really like this to happen.
> 
> Yes, I'd like it to happen.  I'd also like repr() of a float to return
> the full precision (using the "%.17g" sprintf format).
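
For reference, the effect of the "%.17g" format mentioned in the quoted
text can be seen with a short sketch in present-day Python; the 12-digit
format is included only for contrast and is an arbitrary choice here.

```python
x = 0.1
full = "%.17g" % x
short = "%.12g" % x

print(full)                   # 0.10000000000000001
print(short)                  # 0.1
print(float(full) == x)       # True: 17 significant digits round-trip
```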

While we're at it: how about adding a PyLong_AsString() API
to the C interface ? I currently use PyObject_Str() in mxODBC
and then slice off the 'L' -- not very elegant. A PyLong_AsString()
API would much better suit the task.

Merry Christmas,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                     7 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal at lemburg.com  Fri Dec 24 12:11:29 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 24 Dec 1999 12:11:29 +0100
Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types)
References: <19991223163850.15619.qmail@web604.mail.yahoo.com> <199912231642.LAA22598@eric.cnri.reston.va.us>
Message-ID: <386354E1.DA560F42@lemburg.com>

Guido van Rossum wrote:
> 
> > > OK.  Let me rephrase it.  Say we form a consensus on 'the right
> > > way'.  Are you amenable to some solution which goes back before
> > > 1970 and after 2038 going into the standard library?
> 
> No problem.
> 
> > > And does your answer change if it involves some
> > > compiled code as well?
> 
> I'd rather not.

As far as mxDateTime goes, I'd rather not see it in the core
distribution. Including the mx stuff in a separate PythonPowerTools
distribution would be cool though. For a start in this direction
see e.g.:

     http://startship.skyport.net/~lemburg/PPowerTools-0.2.zip

Note that I'll wrap all my mx extensions into a new mx package
which will come in several flavours next year. There will no
longer be separate packages due to the various naming
collisions and to enable intra-mx-package dependencies.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                     7 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From captainrobbo at yahoo.com  Fri Dec 24 13:22:29 1999
From: captainrobbo at yahoo.com (=?iso-8859-1?q?Andy=20Robinson?=)
Date: Fri, 24 Dec 1999 04:22:29 -0800 (PST)
Subject: [Python-Dev] Fixed Decimal types
Message-ID: <19991224122229.23506.qmail@web606.mail.yahoo.com>

> >> However, unlike RPG, we should probably ensure 
> >> that attempts to overflow or underflow the scale 
> >> result in NaN or Overflow conditions, rather
> >> than assuming the user is right and losing 
> >> the significant digits.
>  
> > Since this would be based on infinite-precision numbers, I don't
> > think that this would be an issue.


Three very general observations before I disappear for
Christmas:

(1) I think there is great mileage in combining the
fixed-decimal concept with Martin Fowler's Quantity
pattern, so that a variable could be defined as not
just two decimal places but also (say) "GBP" or "USD",
and it would be an error to add the two.  Same applies
for adding metres, kilograms and other quantities. 
There has also been discussion that the 'type' of a
quantity should determine what math should apply.

(2) If Python is going to be used increasingly in
eCommerce, it should be good at dealing with money -
maybe not in the core language, but we should aim for
one standard package.  

(3) We have a python-finance list
(python-finance at egroups.com), recently generalized to
cover business systems, which is a good place to
discuss this if anyone wants to.  There are people
there who have time, would love to prototype something
(indeed some work started in this area 3 months back),
and would use it at work too.  This would be an ideal
first target for that group - or indeed for a
finance-sig.  I'll pursue this in the New Year.
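
Observation (1) above could be sketched along these lines, as a pure
Python class layered on a decimal type.  The Money class is a
hypothetical illustration of the Quantity idea, built on the decimal
module of present-day Python, and is not an existing package.

```python
from decimal import Decimal


class Money:
    """Hypothetical Quantity-style value: a decimal amount plus a unit."""

    def __init__(self, amount, currency):
        self.amount = Decimal(amount)
        self.currency = currency

    def __add__(self, other):
        if not isinstance(other, Money):
            return NotImplemented
        if other.currency != self.currency:
            raise ValueError("cannot add %s to %s"
                             % (other.currency, self.currency))
        return Money(self.amount + other.amount, self.currency)

    def __repr__(self):
        return "Money('%s', '%s')" % (self.amount, self.currency)


total = Money("2.50", "GBP") + Money("1.25", "GBP")
print(total)                                  # Money('3.75', 'GBP')
```

Adding Money("1.00", "GBP") to Money("1.00", "USD") raises ValueError
in this sketch, which is the "error to add the two" behavior.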

Merry Christmas,

Andy

=====
Andy Robinson
Robinson Analytics Ltd.
------------------
My opinions are the official policy of Robinson Analytics Ltd.
They just vary from day to day.



From jack at oratrix.nl  Fri Dec 24 13:34:28 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Fri, 24 Dec 1999 13:34:28 +0100
Subject: [Python-Dev] Fixed Decimal types 
In-Reply-To: Message by =?iso-8859-1?q?Andy=20Robinson?= 
 <captainrobbo@yahoo.com> ,
	     Fri, 24 Dec 1999 04:22:29 -0800 (PST) , <19991224122229.23506.qmail@web606.mail.yahoo.com> 
Message-ID: <19991224123428.5BA9F370CF2@snelboot.oratrix.nl>

> (1) I think there is great mileage in combining the
> fixed-decimal concept with Martin Fowler's Quantity
> pattern, so that a variable could be defined as not
> just two decimal places but also (say) "GBP" or "USD",
> and it would be an error to add the two.  Same applies
> for adding metres, kilograms and other quantities. 
> There has also been discussion that the 'type' of a
> quantity should determine what math should apply.

Isn't this something that is ideally suited for implementation in a Python 
module, based on a core implementation of fixed decimal numbers?
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From gstein at lyra.org  Fri Dec 24 21:05:22 1999
From: gstein at lyra.org (Greg Stein)
Date: Fri, 24 Dec 1999 12:05:22 -0800 (PST)
Subject: [Python-Dev] Fixed Decimal types 
In-Reply-To: <19991224123428.5BA9F370CF2@snelboot.oratrix.nl>
Message-ID: <Pine.LNX.4.10.9912241204290.412-100000@nebula.lyra.org>

On Fri, 24 Dec 1999, Jack Jansen wrote:
> > (1) I think there is great mileage in combining the
> > fixed-decimal concept with Martin Fowler's Quantity
> > pattern, so that a variable could be defined as not
> > just two decimal places but also (say) "GBP" or "USD",
> > and it would be an error to add the two.  Same applies
> > for adding metres, kilograms and other quantities. 
> > There has also been discussion that the 'type' of a
> > quantity should determine what math should apply.
> 
> Isn't this something that is ideally suited for implementation in a Python 
> module, based on a core implementation of fixed decimal numbers?

I'd agree with Jack here.

The "simple" change of a scale for the Long values is nice. Starting to
lump in features like this begins to get a little messier...

Happy Holidays,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein at lyra.org  Fri Dec 24 21:13:50 1999
From: gstein at lyra.org (Greg Stein)
Date: Fri, 24 Dec 1999 12:13:50 -0800 (PST)
Subject: [Python-Dev] str(1L) -> '1' ?
In-Reply-To: <38635309.2AEFF18D@lemburg.com>
Message-ID: <Pine.LNX.4.10.9912241211460.412-100000@nebula.lyra.org>

On Fri, 24 Dec 1999, M.-A. Lemburg wrote:
> Guido van Rossum wrote:
> > [Jim F]
> > > In November there was an interesting discussion on comp.lang.python
> > > about the meaning of __str__ and __repr__.  One tidbit that came out
> > > of this discussion was that __str__ for longs should drop the trailing
> > > 'L'. Was there a decision on this? I'd really like this to happen.
> > 
> > Yes, I'd like it to happen.  I'd also like repr() of a float to return
> > the full precision (using the "%.17g" sprintf format).
> 
> While we're at it: how about adding a PyLong_AsString() API
> to the C interface ? I currently use PyObject_Str() in mxODBC
> and then slice off the 'L' -- not very elegant. A PyLong_AsString()
> API would much better suit the task.

Fred just checked in a change yesterday. PyObject_Str() on a Long no
longer includes the 'L'.

You're going to need to update your code :-)
[ I've got some here and there to fix, too, with the idiom:
     if type(v) is type(1L): return str(v)[:-1]
  ]

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From mal at lemburg.com  Sun Dec 26 23:29:28 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sun, 26 Dec 1999 23:29:28 +0100
Subject: [Python-Dev] str(1L) -> '1' ?
References: <Pine.LNX.4.10.9912241211460.412-100000@nebula.lyra.org>
Message-ID: <386696C8.6EBBF428@lemburg.com>

Greg Stein wrote:
> 
> On Fri, 24 Dec 1999, M.-A. Lemburg wrote:
> > While we're at it: how about adding a PyLong_AsString() API
> > to the C interface ? I currently use PyObject_Str() in mxODBC
> > and then slice off the 'L' -- not very elegant. A PyLong_AsString()
> > API would much better suit the task.
> 
> Fred just checked in a change yesterday. PyObject_Str() on a Long no
> longer includes the 'L'.

Ah, ok... scanning the patches: they don't provide an externed
C interface... I would like to have such a beast if possible
(basically, the new long_format() as PyLong_AsString()).

> You're going to need to update your code :-)
> [ I've got some here and there to fix, too, with the idiom:
>      if type(v) is type(1L): return str(v)[:-1]
>   ]

Your above example will effectively divide the long value by 10
which will probably break things in very subtle ways... hmm, this
change ought to be made *very* visible to people upgrading to
1.6, IMHO.

I'll fix mxODBC to only truncate the string value iff
the 'L' is present.
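
A sketch of the guarded form of that idiom: strip the suffix only when it is actually there, so the same code gives the right answer before and after the change. The helper name is made up for illustration, and it works on the string form so that it does not depend on which interpreter produced the string.

```python
def strip_long_suffix(text):
    """Drop one trailing 'L' (or 'l') if present; otherwise return text unchanged."""
    if text[-1:] in ("L", "l"):
        return text[:-1]
    return text
```

The unguarded str(v)[:-1] would turn '120' into '12', which is the divide-by-ten failure described above.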

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                     5 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From andy at robanal.demon.co.uk  Mon Dec 27 11:43:17 1999
From: andy at robanal.demon.co.uk (Andy Robinson)
Date: Mon, 27 Dec 1999 10:43:17 GMT
Subject: [Python-Dev] Fixed Decimal types 
In-Reply-To: <Pine.LNX.4.10.9912241204290.412-100000@nebula.lyra.org>
References: <Pine.LNX.4.10.9912241204290.412-100000@nebula.lyra.org>
Message-ID: <38674259.5377973@post.demon.co.uk>

On Fri, 24 Dec 1999 12:05:22 -0800 (PST), you wrote:

>On Fri, 24 Dec 1999, Jack Jansen wrote:
>> > (1) I think there is great mileage in combining the
>> > fixed-decimal concept with Martin Fowler's Quantity
>> > pattern, so that a variable could be defined as not
>> > just two decimal places but also (say) "GBP" or "USD",
>> > and it would be an error to add the two.  Same applies
>> > for adding metres, kilograms and other quantities. 
>> > There has also been discussion that the 'type' of a
>> > quantity should determine what math should apply.
>> 
>> Isn't this something that is ideally suited for implementation in a Python 
>> module, based on a core implementation of fixed decimal numbers?
>
>I'd agree with Jack here.
>
Me too - I thought I said that in point 2, but in retrospect I didn't
say it clearly enough :-)


- Andy


From gstein at lyra.org  Mon Dec 27 12:31:29 1999
From: gstein at lyra.org (Greg Stein)
Date: Mon, 27 Dec 1999 03:31:29 -0800 (PST)
Subject: [Python-Dev] str(1L) -> '1' ?
In-Reply-To: <386696C8.6EBBF428@lemburg.com>
Message-ID: <Pine.LNX.4.10.9912270330180.412-100000@nebula.lyra.org>

On Sun, 26 Dec 1999, M.-A. Lemburg wrote:
> Greg Stein wrote:
> > On Fri, 24 Dec 1999, M.-A. Lemburg wrote:
> > > While we're at it: how about adding a PyLong_AsString() API
> > > to the C interface ? I currently use PyObject_Str() in mxODBC
> > > and then slice off the 'L' -- not very elegant. A PyLong_AsString()
> > > API would much better suit the task.
> > 
> > Fred just checked in a change yesterday. PyObject_Str() on a Long no
> > longer includes the 'L'.
> 
> Ah, ok... scanning the patches: they don't provide an externed
> C interface... I would like to have such a beast if possible
> (basically, the new long_format() as PyLong_AsString()).

What's wrong with PyObject_Str()? I don't see a need for Yet Another Entry
Point.

> > You're going to need to update your code :-)
> > [ I've got some here and there to fix, too, with the idiom:
> >      if type(v) is type(1L): return str(v)[:-1]
> >   ]
> 
> Your above example will effectively divide the long value by 10
> which will probably break things in very subtle ways... hmm, this

Yah :-(  Not a lot of fun, but I think for the best.

> change ought to be made *very* visible to people upgrading to
> 1.6, IMHO.

Yes.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From mal at lemburg.com  Mon Dec 27 13:51:36 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 27 Dec 1999 13:51:36 +0100
Subject: [Python-Dev] str(1L) -> '1' ?
References: <Pine.LNX.4.10.9912270330180.412-100000@nebula.lyra.org>
Message-ID: <386760D8.E897FADF@lemburg.com>

Greg Stein wrote:
> 
> On Sun, 26 Dec 1999, M.-A. Lemburg wrote:
> > Greg Stein wrote:
> > > On Fri, 24 Dec 1999, M.-A. Lemburg wrote:
> > > > While we're at it: how about adding a PyLong_AsString() API
> > > > to the C interface ? I currently use PyObject_Str() in mxODBC
> > > > and then slice off the 'L' -- not very elegant. A PyLong_AsString()
> > > > API would much better suit the task.
> > >
> > > Fred just checked in a change yesterday. PyObject_Str() on a Long no
> > > longer includes the 'L'.
> >
> > Ah, ok... scanning the patches: they don't provide an externed
> > C interface... I would like to have such a beast if possible
> > (basically, the new long_format() as PyLong_AsString()).
> 
> What's wrong with PyObject_Str()? I don't see a need for Yet Another Entry
> Point.

What's wrong with a rich C API :-) ?

The long_format function would be very useful for programs
interacting with other software at C level. Making it
external would give the programmer the ability to pass
long string representations in any base to other programs,
which is very useful for e.g. database interaction or
crypto software.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                     4 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From bkc at murkworks.com  Mon Dec 27 23:04:25 1999
From: bkc at murkworks.com (Brad Clements)
Date: Mon, 27 Dec 1999 17:04:25 -0500
Subject: [Python-Dev] Re: [PSA MEMBERS] Re: Please test new dynamic load behavior
In-Reply-To: <Pine.LNX.4.10.9912231022280.16305-100000@nebula.lyra.org>
References: <38620B04.7CC64485@trema.com>
Message-ID: <199912272204.RAA26173@anvil.murkworks.com>

On 23 Dec 99, at 10:26, Greg Stein wrote:

> > > I reorganized Python's dynamic load/import code over the past few days.
> > > Guido provided some feedback, I did some more mods, and now it is checked
> > > into CVS. The new loading behavior has been tested on Linux, IRIX, and
> > > Solaris (and probably Windows by now).


FYI, I downloaded the import stuff from CVS and used it in my port of 
Python to NetWare. Good timing, as I was just tackling dynamic 
loading on NetWare when I saw your message.

The new scheme is much better, and works for me.

Though I do need to add some special "un-import" code similar to what 
BEOS does. 


Brad Clements,                bkc at murkworks.com   (315)268-1000
http://www.murkworks.com                          (315)268-9812 Fax
netmeeting: ils://ils.murkworks.com               AOL-IM: BKClements


From skip at mojam.com  Tue Dec 28 22:41:33 1999
From: skip at mojam.com (Skip Montanaro)
Date: Tue, 28 Dec 1999 15:41:33 -0600
Subject: [Python-Dev] Better text processing support in py2k?
Message-ID: <199912282141.PAA31426@dolphin.mojam.com>

It just occurred to me as I was replying to a request on the main list, that
Python's text handling capabilities could be a bit better than they are.
This will probably not come as a revelation to many of you, but I finally
put it together with the standard argument against beefing things up:

    One fix would be to add regular expressions to the language core and
    have special syntax for them, as Perl has done. However, I don't like
    this solution because Python is a general-purpose language, and regular
    expressions are used for the single application domain of text
    processing. For other application domains, regular expressions may be of
    no interest, and you might want to remove them to save memory and code
    size.

and the observation that Python does support some builtin objects and syntax
that are fairly specific to some much more restricted application domains
than text processing.

I stole the above quote from Andrew Kuchling's Python Warts page, which I
also happened to read earlier today.

What AMK says makes perfect sense until you examine some of the other things
that are in the language, like the Ellipsis object and complex numbers.  If
I recall correctly both were added as a result of the NumPy package
development.

I have nothing against ellipses or complex numbers.  They are fine first
class objects that should remain in the language. But I have never used
either one in my day-to-day work.  On the other hand, I read files and
manipulate them with regular expressions all the time.  I rather suspect
that more people use Python for some sort of text processing than any other
single application domain.  Python should be good at it.

While I don't want to turn Python into Perl, I would like to see it do a
better job of what most people probably use the language for.  Here is a
very short list of things I think need attention:

    1. When using something like the simple file i/o idiom

       for line in f.readlines():
           dofunstuff(line)

       the programmer should not have to care how big the file is.  It
       should just work in a reasonably efficient manner without gobbling up
       all of memory.  I realize this may require some change to the syntax
       of the common idiom.

    2. The re module needs to be sped up, if not to catch up with Perl, then
       to catch up with the deprecated regex module.  Depending how far
       people want to go with things, adding some language syntax to support
       regular expressions might be in order.  I don't see that as
       compelling as adding complex numbers however.  Another possibility,
       now that Barry Warsaw has opened the floodgates, is to add regular
       expression methods to strings.

    3. I've not yet used it, but I am told the pattern matching in
       Marc-Andre Lemburg's mxTextTools
       (http://starship.python.net/crew/lemburg/) is both powerful and
       efficient (though it certainly appears complex).  Perhaps it deserves
       consideration for incorporation into the core Python distribution.

I'm sure other people will come up with other suggestions.

Skip Montanaro | http://www.mojam.com/
skip at mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...


From akuchlin at mems-exchange.org  Tue Dec 28 23:00:11 1999
From: akuchlin at mems-exchange.org (Andrew M. Kuchling)
Date: Tue, 28 Dec 1999 17:00:11 -0500 (EST)
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <199912282141.PAA31426@dolphin.mojam.com>
References: <199912282141.PAA31426@dolphin.mojam.com>
Message-ID: <14441.13035.802146.730160@amarok.cnri.reston.va.us>

Skip Montanaro writes:
>What AMK says makes perfect sense until you examine some of the other things
>that are in the language, like the Ellipsis object and complex numbers.  If
>I recall correctly both were added as a result of the NumPy package
>development.

True, but note that you can compile Python with WITHOUT_COMPLEX
defined to remove complex numbers.

>    1. When using something like the simple file i/o idiom
>       for line in f.readlines():
>	   dofunstuff(line)
>       the programmer should not have to care how big the file is.

What about 'for line in fileinput.input()', which already exists?
(Hmmm... if you have an already open file object, I don't think you
can pass it to fileinput.input(); maybe that should be fixed.)

On a vaguely related note, since there are many things like parser
generators and XML stuff and mxTextTools, I've been speculating about
a text processing topic guide.  If you know of Python packages related
to text processing, please send me a private e-mail with a link.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
Constraints often boost creativity.
    -- Jim Hugunin, 11 Feb 1999


From skip at mojam.com  Tue Dec 28 23:26:53 1999
From: skip at mojam.com (Skip Montanaro)
Date: Tue, 28 Dec 1999 16:26:53 -0600 (CST)
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <14441.13035.802146.730160@amarok.cnri.reston.va.us>
References: <199912282141.PAA31426@dolphin.mojam.com>
	<14441.13035.802146.730160@amarok.cnri.reston.va.us>
Message-ID: <14441.14637.682862.999776@dolphin.mojam.com>

    Andrew> True, but note that you can compile Python with WITHOUT_COMPLEX
    Andrew> defined to remove complex numbers.

That's true, but that wasn't my point.  I'm not arguing for or against space
efficiency, just that the rather timeworn argument about not doing
anything special to support text processing because Python is a general
purpose language is a red herring.

    >> 1. When using something like the simple file i/o idiom
    >> for line in f.readlines():
    >>   dofunstuff(line)
    >> the programmer should not have to care how big the file is.

    Andrew> What about 'for line in fileinput.input()', which already
    Andrew> exists?  (Hmmm... if you have an already open file object, I
    Andrew> don't think you can pass it to fileinput.input(); maybe that
    Andrew> should be fixed.)

Well, a couple reasons jump to mind:

   1. fileinput.FileInput isn't particularly efficient.  At its heart, its
      __getitem__ method makes a simple readline() call instead of buffering
      some amount of readlines(sizehint) bytes.  This can be fixed, but I'm
      not sure what would happen to its semantics.

   2. As you pointed out, it's not all that general.

My point, not at all well stated, is that the programmer shouldn't have to
worry (much?) about the conditions under which he does file i/o.   Right
now, if I know the file is small(ish), I can do

    for line in f.readlines():
        dofunstuff(line)

but I have to know that the file won't be big, because readlines() will
behave badly (perhaps even generate a MemoryError exception) if the file is
large.  In that case, I have to fall back to the safer (and slower)

    line = f.readline()
    while line:
        dofunstuff(line)
        line = f.readline()

or the more efficient, but more cumbersome

    lines = f.readlines(sizehint)
    while lines:
        for line in lines:
            dofunstuff(line)
        lines = f.readlines(sizehint)

That's three separate idioms the programmer has to be aware of when writing
code to read a text file based upon the perceived need for speed, memory
usage and desired clarity:

    fast/memory-intensive/clear
    slow/memory-conserving/not-as-clear
    fast/memory-conserving/fairly-muddy

Any particular reason that the readline method can't return an iterator that
supports __getitem__ and buffers input?  (Again, remember this is for py2k,
so the potential breakage such a change might cause is a consideration, but
not a showstopper.)
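
A rough sketch of the kind of object being asked for, assuming only that the file has a readlines(sizehint) method. The class name BufferedLines and its attributes are hypothetical. It relies on the for loop asking for items 0, 1, 2, ... in order through __getitem__, and it does not attempt random access.

```python
import io


class BufferedLines:
    """Hypothetical wrapper: hand out lines in order, refilling a buffer in chunks."""

    def __init__(self, f, sizehint=8192):
        self.f = f
        self.sizehint = sizehint
        self.buf = []    # the chunk of lines currently held
        self.base = 0    # overall index of buf[0]

    def __getitem__(self, i):
        if i < self.base:
            raise ValueError("lines can only be read in order")
        while i >= self.base + len(self.buf):
            # Buffer exhausted: step past it and fetch the next chunk.
            self.base = self.base + len(self.buf)
            self.buf = self.f.readlines(self.sizehint)
            if not self.buf:
                raise IndexError(i)    # end of file ends the for loop
        return self.buf[i - self.base]


# A tiny sizehint forces several refills even on a short input.
seen = []
for line in BufferedLines(io.StringIO("one\ntwo\nthree\n"), sizehint=2):
    seen.append(line)
```

The loop body stays as clear as the readlines() idiom, while memory use is bounded by roughly sizehint bytes per chunk.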

    Andrew> On a vaguely related note, since there are many things like
    Andrew> parser generators and XML stuff and mxTextTools, I've been
    Andrew> speculating about a text processing topic guide.  If you know of
    Andrew> Python packages related to text processing, please send me a
    Andrew> private e-mail with a link.

This sounds like a good idea to me.

Skip Montanaro | http://www.mojam.com/
skip at mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...


From captainrobbo at yahoo.com  Wed Dec 29 09:34:43 1999
From: captainrobbo at yahoo.com (=?iso-8859-1?q?Andy=20Robinson?=)
Date: Wed, 29 Dec 1999 00:34:43 -0800 (PST)
Subject: [Python-Dev] Better text processing support in py2k?
Message-ID: <19991229083443.27817.qmail@web6005.mail.yahoo.com>

--- Skip Montanaro <skip at mojam.com> wrote:
>     fast/memory-intensive/clear
>     slow/memory-conserving/not-as-clear
>     fast/memory-conserving/fairly-muddy
> 
> Any particular reason that the readline method can't
> return an iterator that
> supports __getitem__ and buffers input?  (Again,
> remember this is for py2k,
> so the potential breakage such a change might cause
> is a consideration, but
> not a showstopper.)

Why not generalize fileinput to do buffering instead?

More generally, Java has the notion of 'stackable
streams' - e.g. construct a 'BufferedFile' around a
'File', maybe construct a 'Line-oriented file' around
that etc.  Each one takes a file-like object as an
argument to the constructor.  Things you might want to
do:
- buffering
- international encoding conversions
- line delimiters other than CR/LF/CRLF
- read/write Python objects (i.e. use pickle/marshal)
- easy interfaces to parsers

This took me a couple of hours to get used to (and at
the time I thought 'Yuk!' when I first saw four
nested constructors), but gives you very precise
control and a lot of versatility when handling files. 
It's an idiom Python does not use much but maybe it
should.

I'd argue that maybe some enhancements to fileinput.py
- adding some streams to provide building blocks for
these operations - would get us the power you want and
a lot more versatility besides.
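
To make the stacking idea concrete, here is a small sketch written for a present-day Python. Both class names are hypothetical. Each layer takes any object with a read(n) method in its constructor, so the layers can be nested in whatever order makes sense. The decoding layer uses the standard library's codecs.getincrementaldecoder so that a multi-byte character split across two reads is still decoded correctly.

```python
import codecs
import io


class DecodingReader:
    """Hypothetical layer: turn a byte stream's read(n) into decoded text."""

    def __init__(self, stream, encoding):
        self.stream = stream
        self.decoder = codecs.getincrementaldecoder(encoding)()

    def read(self, n=-1):
        while 1:
            data = self.stream.read(n)
            text = self.decoder.decode(data, final=not data)
            # Keep reading if a chunk held only part of a character.
            if text or not data:
                return text


class DelimitedReader:
    """Hypothetical layer: split text from read(n) into delimiter-terminated records."""

    def __init__(self, stream, delimiter, blocksize=4096):
        self.stream = stream
        self.delimiter = delimiter
        self.blocksize = blocksize
        self.pending = ""

    def readline(self):
        while self.delimiter not in self.pending:
            chunk = self.stream.read(self.blocksize)
            if not chunk:
                # End of input: hand back whatever is left (possibly "").
                record, self.pending = self.pending, ""
                return record
            self.pending = self.pending + chunk
        record, self.pending = self.pending.split(self.delimiter, 1)
        return record + self.delimiter


# Nested constructors: bytes -> text -> ';'-terminated records.
raw = io.BytesIO("a;\u00e9;b".encode("utf-8"))
records = DelimitedReader(DecodingReader(raw, "utf-8"), ";", blocksize=2)
first = records.readline()
```

Buffering, pickling or a parser front end would be further layers with the same constructor shape.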


=====
Andy Robinson
Robinson Analytics Ltd.
------------------
My opinions are the official policy of Robinson Analytics Ltd.
They just vary from day to day.



From mal at lemburg.com  Wed Dec 29 17:55:21 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 29 Dec 1999 17:55:21 +0100
Subject: [Python-Dev] Better text processing support in py2k?
References: <19991229083443.27817.qmail@web6005.mail.yahoo.com>
Message-ID: <386A3CF9.8AF0EA60@lemburg.com>

Andy Robinson wrote:
> 
> --- Skip Montanaro <skip at mojam.com> wrote:
> >     fast/memory-intensive/clear
> >     slow/memory-conserving/not-as-clear
> >     fast/memory-conserving/fairly-muddy
> >
> > Any particular reason that the readline method can't
> > return an iterator that
> > supports __getitem__ and buffers input?  (Again,
> > remember this is for py2k,
> > so the potential breakage such a change might cause
> > is a consideration, but
> > not a showstopper.)
> 
> Why not generalize fileinput to do buffering instead?
> 
> More generally, Java has the notion of 'stackable
> streams' - e.g. construct a 'BufferedFile' around a
> 'File', maybe construct a 'Line-oriented file' around
> that etc.  Each one takes a file-like object as an
> argument to the constructor.  Things you might want to
> do:
> - buffering
> - international encoding conversions
> - line delimiters other than CR/LF/CRLF
> - read/write Python objects (i.e. use pickle/marshal)
> - easy interfaces to parsers

If all goes well we'll have something like this
in Python 1.6 at least for the encoding/decoding
part of file reading and writing. You basically take
a file object and then wrap some StreamCodecs around
it to get the functionality you need. Very simple
and very intuitive.

> This took me a couple of hours to get used to (and at
> the time I thought 'Yuk!' when I first saw four
> nested constructors), but gives you very precise
> control and a lot of versatility when handling files.
> It's an idiom Python does not use much but maybe it
> should.
> 
> I'd argue that maybe some enhancements to fileinput.py
> - adding some streams to provide building blocks for
> these operations - would get us the power you want and
> a lot more versatility besides.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                            Get ready to party !
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From bckfnn at pipmail.dknet.dk  Wed Dec 29 19:51:52 1999
From: bckfnn at pipmail.dknet.dk (Finn Bock)
Date: Wed, 29 Dec 1999 18:51:52 GMT
Subject: [Python-Dev] zipfile.py
In-Reply-To: <3857B97E.3684224F@interet.com>
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com>
Message-ID: <386a582d.6762574@pipmail.dknet.dk>

James C. Ahlstrom wrote:

>  ftp://ftp.interet.com/pub/pylib.html

I feel that it smells a bit too much like a tool and too little like a general
programming API.

- It can only add disk files. The ability to write data to a zip entry through 
  a file-like object or from a string would make it more like an API, IMHO
- Some kind of access to the TOC entry fields (date, size, compressed
  size etc) also seems like a nice feature.
- The data for an entry must be available in memory. Could be a problem 
  for huge files, but most likely not in practical use.

I admit that I am fond of the api from java.util.zip.ZipFile and
java.util.zip.ZipOutputStream.

Regards,
Finn Bock


From tim_one at email.msn.com  Thu Dec 30 07:08:58 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 30 Dec 1999 01:08:58 -0500
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <199912282141.PAA31426@dolphin.mojam.com>
Message-ID: <000001bf528c$5cbdb9a0$a02d153f@tim>

[Skip Montanaro, wants nicer text facilities]
> ...
> I rather suspect that more people use Python for some sort of
> text processing than any other single application domain.

Hmm.  You're probably right, but I'm an exception.

> Python should be good at it.

And I guess I'm an exception mostly *because* Perl is better at easy text
crunching and Icon is better at hard text-crunching -- that is, I use the
right tool for the job <wink>.

> While I don't want to turn Python into Perl, I would like to see
> it do a better job of what most people probably use the language
> for.  Here is a very short list of things I think need attention:
>
>     1. [*A* clear way to do memory- and time-efficient textfile
>         input]

I agree, but unsure how to fix it.  The best way to write this now is

    # f is some open file object.
    while 1:
        lines = f.readlines(BUFSIZE)
        if not lines:
            break
        for line in lines:
            process(line)

and it's not something anyone figures out on their own -- or enjoys typing
or explaining afterwards.

Perl gets its line-at-a-time speed by peeking and poking C FILE structs
directly in compiler- and platform-specific ways -- ways that vendors
*should* have done in their own fgets implementations, but almost never do.
I have no idea whether it works well with Perl's nascent notions of
threading, but in the absence of that "the system" doesn't know Perl is
cheating (i.e., as far as libc+friends are concerned, Perl *is* reading one
line at a time -- even mixing in C-level ungetc calls works (well, sometimes
<0.1 wink -- they don't always peek and poke enough fields>)).

The Python QIO extension module is much easier to port but less compatible
(it doesn't use stdio, so QIO-opened files don't play well with others) and
slower (although that's likely repairable -- he's got two passes over the
buffer where one hairier pass should suffice).

>     2. The re module needs to be sped up, if not to catch up with
>        Perl, then to catch up with the deprecated regex module.

The irony here is that the re engine is very often unboundedly faster than
the regex engine -- provided you're chewing over large strings.  Some tests
/F ran showed that the length-independent *overhead* of invoking re is about
10x higher than for regex.  Presumably the bulk of that is due to re.py,
i.e. that you get to the re engine via going thru Python layers on your way
in and out, while regex was pure C.

In any case, /F is working on a new engine (for Unicode), and I believe he
has this all well in hand.

> Depending how far people want to go with things, adding some
> language syntax to support regular expressions might be in order.
> ...
>     3. I've not yet used it, but I am told the pattern matching in
>        Marc-Andre Lemburg's mxTextTools
>       (http://starship.python.net/crew/lemburg/)
>        is both powerful and efficient (though it certainly appears
>        complex).  Perhaps it deserves consideration for
>        incorporation into the core Python distribution.

It's not complex, it's complicated -- and *that's* what makes it un-Pythonic
<wink>.  Tony Ibbs has written a friendly wrapper around mxTextTools that
suppresses much of the non-essential complication.  OTOH, if you go into
this with a regexp mindset, it will run much slower than a real regexp
package, because the bulk of the latter is devoted to doing optimization;
mxTextTools is WYSIWYG (it screams if you code to its strengths, but crawls
if you e.g. try to implement naive backtracking).

You should go to the REBOL site and look at the description of REBOL's PARSE
verb in the FAQ ... mumble, mumble ... at

    http://www.rebol.com/faq.html#11550948

Here's an example pulled from that page (this is a REBOL code fragment):

    digit: charset "0123456789"
    expr: [term ["+" | "-"] expr | term]
    term: [factor ["*" | "/"] term | factor]
    factor: [primary "**" factor | primary]
    primary: [value | "(" expr ")"]
    value: [digit value | digit]

    parse "1 + 2 ** 9" expr

There hasn't been a pattern scheme this clean, convenient or powerful since
SNOBOL4.  It exploits REBOL's Forth-like (lack of!) syntax, and
Smalltalk-like penchant for passing around thunks (anonymous closures --
"[...]" in REBOL builds a lexically-scoped entity called "a block", which
can be treated as code (executed) or data (manipulated like a Python list)
at will).

Now the example doesn't show this, but you can freely mix computations into
the middle of the patterns; only *some* of the words in the blocks have
special meaning to PARSE.  The fragment above is already way beyond what can
be accomplished with regexps, but that's just the start of it.  Perl too is
slamming in more & more ways to get user code to interact with its regexp
engine.

So REBOL has a *very* nice approach to this; I believe it's unreasonably
clumsy to mimic in Python primarily because of forward references (note e.g.
that the block attached to "expr" above refers to "term" before the latter
has been bound -- but the stuff inside [...] is just a closure so that
doesn't matter -- it only matters that term gets bound before expr is
*executed*).  I hit a similar snag years ago when trying to mimic SNOBOL4's
approach in Python.
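
For illustration, one way around the forward-reference snag is to make every rule a function, since a name inside a function body is only looked up when the function runs. The sketch below is a plain recognizer for the grammar above, without whitespace handling and without any computation mixed into the rules. The function names mirror the REBOL rule names and matches() is a hypothetical helper. What it gives up is exactly what makes the REBOL version attractive: the rules are no longer data that can be inspected or combined.

```python
def value(s, i):
    # value: [digit value | digit]  -- one or more digits
    j = i
    while j < len(s) and s[j] in "0123456789":
        j = j + 1
    if j > i:
        return j
    return None


def primary(s, i):
    # primary: [value | "(" expr ")"]
    j = value(s, i)
    if j is not None:
        return j
    if s[i:i+1] == "(":
        j = expr(s, i + 1)    # expr is defined further down; fine at call time
        if j is not None and s[j:j+1] == ")":
            return j + 1
    return None


def factor(s, i):
    # factor: [primary "**" factor | primary]
    j = primary(s, i)
    if j is not None and s[j:j+2] == "**":
        k = factor(s, j + 2)
        if k is not None:
            return k
    return j


def term(s, i):
    # term: [factor ["*" | "/"] term | factor]
    j = factor(s, i)
    if j is not None and s[j:j+1] in ("*", "/"):
        k = term(s, j + 1)
        if k is not None:
            return k
    return j


def expr(s, i):
    # expr: [term ["+" | "-"] expr | term]
    j = term(s, i)
    if j is not None and s[j:j+1] in ("+", "-"):
        k = expr(s, j + 1)
        if k is not None:
            return k
    return j


def matches(s):
    """True if the whole of s is an expr. Each rule returns an end index or None."""
    return expr(s, 0) == len(s)
```

Each rule returns the index just past what it matched, or None on failure, so the alternatives in the grammar become ordinary if statements.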

Perl's endless abuse of regexps is making that language more absurd by the
month.

The other major approach to mixing patterns with computation is due to Icon,
another language where a regexp mindset is fatal.  On a whim, I whipped up
the attached, which illustrates a bit of the Icon approach in Pythonic terms
(but without language support for generators, the *heart* of it can't really
be captured).  Here's an example of how this could be used to implement (the
simplest form of) string.split:

def mysplit(str):
    s = Searcher(str)
    white = CharSet(" \t\n")
    result = []
    s.many(white)            # consume initial whitespace
    while s.notmany(white):  # consume non-whitespace
        result.append(s.get_match())
        s.many(white)
    return result

>>> mysplit("   \t Hey,   that's\tpretty\n\n neat!  ")
['Hey,', "that's", 'pretty', 'neat!']
>>>

The primary thing to note is that there's no seam between analyzing the
string and doing computation on the partial results -- "the program is the
pattern".  This is what Icon does to perfection, Perl is moving toward, and
REBOL is arriving at from a different direction.  It's The Future <0.9
wink>.

Without generators it's difficult to work backtracking into the Searcher
class, but, as above, in my experience the backtracking feature of regexps
is rarely *needed*!  For example, at various points "split" wants to suck up
all the whitespace characters, and that's *it* -- the backtracking
possibility in the regexp \s+ is often a bug just waiting for unexpected
*context* to trigger it.  A hairy regexp is pure hell; but what simpler
regexps can do don't require all that funky regexp machinery.

BTW, the mxTextTools engine could be used to get blazing implementations of
the primary Searcher methods (it excels at simple analysis).  OTOH, making
lots of calls to analyze short strings is slow.  The only clean solutions to
that are Perl's and Icon's (build everything into one language so the
compiler can optimize stuff away), and REBOL's (make no distinction between
code and data, so that code can be analyzed & optimized at runtime -- and
build the entire implementation around making closures and calls
supernaturally fast).

the-less-you-use-regexps-the-less-you-miss-'em<wink>-ly y'rs  - tim

class CharSet:
    def __init__(self, seq):
        self.seq = seq
        d = {}
        for ch in seq:
            d[ch] = 1
        self.haskey = d.has_key

    def __call__(self, ch):
        return self.haskey(ch)

    def __add__(self, other):
        if isinstance(other, CharSet):
            other = other.seq
        return CharSet(self.seq + other)

def _normalize_index(i, n):
    assert n >= 0
    if i >= 0:
        return min(i, n)
    elif n == 0:
        return 0
    # want smallest q s.t. i + q*n >= 0
    # <->  q*n >= -i
    # <->  q >= -i/n
    # so q = ceiling(-i/n) = -floor(i/n)
    return i - (i/n)*n

class Searcher:
    def __init__(self, str, lo=0, hi=None):
        """Create object to search in str[lo:hi].

        lo defaults to 0.
        hi defaults to len(str).
        len(str) is repeatedly added to negative lo or hi until
        reaching a number >= 0.
        If lo > hi, a uselessly empty slice will be searched.
        The search cursor is initialized to lo.
        """

        self.s = str
        self.lo = _normalize_index(lo, len(str))
        if hi is None:
            self.hi = len(str)
        else:
            self.hi = _normalize_index(hi, len(str))
        if self.lo > self.hi:
            self.hi = self.lo
        self.i = self.lo
        self.lastmatch = None, None

    def any(self, charset, consume=1):
        """Try to match single character in charset.

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i = self.i
        if i < self.hi and charset(self.s[i]):
            if consume:
                self.__consume(i+1)
            return 1
        return 0

    def notany(self, charset, consume=1):
        """Try to match single character not in charset.

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i = self.i
        if i < self.hi and not charset(self.s[i]):
            if consume:
                self.__consume(i+1)
            return 1
        return 0

    def many(self, charset, consume=1):
        """Try to match one or more characters in charset.

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i, n, s = self.i, self.hi, self.s
        j = i
        while j < n and charset(s[j]):
            j = j+1
        if i < j:
            if consume:
                self.__consume(j)
            return 1
        return 0

    def notmany(self, charset, consume=1):
        """Try to match one or more characters not in charset.

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i, n, s = self.i, self.hi, self.s
        j = i
        while j < n and not charset(s[j]):
            j = j+1
        if i < j:
            if consume:
                self.__consume(j)
            return 1
        return 0

    def match(self, str, consume=1):
        """Try to match string "str".

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i = self.i
        j = i + len(str)
        # don't let a match run past the high slice bound
        if j <= self.hi and self.s[i:j] == str:
            if consume:
                self.__consume(j)
            return 1
        return 0

    def get_str(self):
        """Return subject string."""
        return self.s

    def get_lo(self):
        """Return low slice bound."""
        return self.lo

    def get_hi(self):
        """Return high slice bound."""
        return self.hi

    def get_pos(self):
        """Return current value of search cursor."""
        return self.i

    def get_match_indices(self):
        """Return slice indices of last "consumed" match."""
        return self.lastmatch

    def get_match(self):
        """Return last "consumed" matching substring."""
        i, j = self.lastmatch
        if i is None:
            raise ValueError("no match to return!")
        return self.s[i:j]

    def set_pos(self, pos, consume=1):
        """Set search cursor to new value.  No return value.

        If optional arg "consume" is true, the last match is set to
        the slice between pos and the current cursor position.
        """

        p = _normalize_index(pos, len(self.s))
        if not self.lo <= p <= self.hi:
            raise ValueError("pos out of bounds: " + `pos`)
        if consume:
            self.__consume(p)
        else:
            self.i = p

    def move_pos(self, incr, consume=1):
        """Move the cursor by incr characters.  No return value.

        If the new value is outside the slice bounds, it's clipped.
        If optional arg "consume" is true, the last match is set to
        the slice between the old and new cursor positions.
        """

        newi = self.i + incr
        if newi < self.lo:
            newi = self.lo
        elif newi > self.hi:
            newi = self.hi
        if consume:
            self.__consume(newi)
        else:
            self.i = newi

    def __consume(self, newi):
        i, j = self.i, newi
        if i > j:
            i, j = j, i
        self.lastmatch = i, j
        self.i = newi
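
A quick check of the index arithmetic in _normalize_index above, written
as a Python 3 sketch rather than in the 1.5-era dialect of the posted code.
The names normalize_index and add_until_nonnegative are made up for this
illustration, and // stands in for the flooring / on ints that the original
relies on.  It compares the closed form with the literal "keep adding
len(str)" reading of the docstring.

```python
def normalize_index(i, n):
    """Python 3 rendering of _normalize_index: clip i >= 0 to n, wrap i < 0."""
    assert n >= 0
    if i >= 0:
        return min(i, n)
    if n == 0:
        return 0
    # Floor division (//) plays the role that / on ints played in the post.
    return i - (i // n) * n


def add_until_nonnegative(i, n):
    """Literal reading of the docstring: add n until the result is >= 0."""
    while i < 0:
        i += n
    return i


# The closed form and the loop agree for every negative index tried here.
for i in range(-12, 0):
    assert normalize_index(i, 5) == add_until_nonnegative(i, 5)

print(normalize_index(-7, 5), normalize_index(9, 5), normalize_index(-3, 0))
```

For a negative index and n > 0 the closed form is the same thing as i % n,
which is why no loop is needed.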


From tim_one at email.msn.com  Thu Dec 30 07:09:14 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 30 Dec 1999 01:09:14 -0500
Subject: [Python-Dev] Fixed-decimal types
In-Reply-To: <199912231944.OAA23337@eric.cnri.reston.va.us>
Message-ID: <000201bf528c$657c3080$a02d153f@tim>

[Guido]
> ...
> Not arguing for this interpretation, just indicating that doing
> fixed precision arithmetic right is hard.

It's not so much hard as it is arbitrary.  The floating-point world is
standardized now, but the fixed-point world remains a mish-mash of
incompatible legacy schemes carried across generations of products for no
reason other than product-specific compatibility.  So despite that
fixed-point has a specialty audience, whatever rules Python chooses will
leave it incompatible with much of that audience's (mixed!) expectations.

If fixed-point is needed, and my FixedPoint.py isn't good enough (all other
fixed point pkgs I've seen for Python were braindead), then it should be
implemented such that developers can control both rounding and precision
propagation.  I'll attach suitable kernels; they haven't been tested but any
bugs discovered will be trivial to fix (there are no difficulties here, but
typos are likely); the kernels supply the bulk of what's required, whether
implemented in Python or C; various packages can wrap them to supply
whatever policies they like; see FixedPoint.py for exact string<->FixedPoint
and exact float->FixedPoint conversions; and that's the end of my
involvement in fixed-point <wink>.

Python should certainly *not* add a "scale factor" to its current long
implementation; fixed-point should be a distinct type, as scale-factor
fiddling is clumsy and pervasive (long arithmetic is challenging enough to
get correct and quick without this obfuscating distraction; and by leaving
scale factors out of it, it's much easier to plug in alternative bigint
implementations (like GMP)).

One other point:  some people are going to want BCD (binary-coded decimal),
which suffers the same mish-mash of legacy policies, but with a different
data representation.  The point is that many commercial applications spend
much more time doing I/O conversions than arithmetic, and BCD accepts slow
arithmetic (in the absence of special HW support) in return for fast scaling
& I/O conversion.

Forgetting the database-heads for a moment, decimal *floating*-point is what
calculators do, so that's what "real people" are most comfortable with.  The
IEEE-854 std (IEEE-754's younger and friendlier brother) specifies that
completely.  Add a means to boost "global" precision (a la REXX), and it's a
powerful tool even for experts (benefits approximating those of unbounded
rational arithmetic but with bounded & user-controllable expense).
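
As an illustration of "boosting global precision", here is a small sketch
using the decimal module that joined Python's standard library some years
after this message.  That module is an assumption of the example, not
something the post refers to, and the helper name one_seventh is
hypothetical.  The same division is carried out at two different working
precisions, with the cost bounded by the precision chosen.

```python
from decimal import Decimal, localcontext


def one_seventh(digits):
    """Compute 1/7 in decimal floating point with `digits` significant digits."""
    with localcontext() as ctx:
        ctx.prec = digits          # user-controllable working precision
        return Decimal(1) / Decimal(7)


print(one_seventh(10))
print(one_seventh(30))
```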

can-never-have-too-many-numeric-types-but-always-have-
    too-few-literal-notations-ly y'rs  - tim


# Kernels for fixed-point decimal arithmetic.

# _add, _sub, _mul, _div all have arglist
#     n1, p1, n2, p2, p, round=DEFAULT_ROUND
# n1 and n2 are longs; p1, p2 and p ints >= 0.
# The inputs are exactly n1/10**p1 and n2/10**p2.
#
# The return value is the integer n such that n/10**p is the best
# approximation to the infinite-precision result.  In other words, p1
# and p2 are the input precisions and p is the desired output
# precision, where precision is the # of digits *after* the decimal
# point.
#
# What "best approximation" means is determined by the round function.
# In many cases rounding isn't required, but when it is
#     round(top, bot)
# is returned.  top and bot are longs, with bot > 0 guaranteed.  The
# infinite-precision result is top/bot.  round must return an integer
# (long) approximation to top/bot, using whichever rounding discipline
# you want.  By default, IEEE round-to-nearest/even is used; see the
# _roundXXX functions for examples of suitable rounding functions.
#
# Note:  The only code here that knows we're working in decimal is
# function _tento; simply change the "10L" in that to do fixed-point
# arithmetic in some other base.
#
# Example:
#
# >>> r7 = _div(1L, 0, 7L, 0, 20)  # 1/7
# >>> r7
# 14285714285714285714L
# >>> r5 = _div(1L, 0, 5L, 0, 20)  # 1/5
# >>> r5
# 20000000000000000000L
# >>> sum = _add(r7, 20, r5, 20, 20)  # 1/7 + 1/5 = 12/35
# >>> sum
# 34285714285714285714L
# >>> _mul(sum, 20, 35L, 0, 20)
# 1199999999999999999990L
# >>> _mul(sum, 20, 35L, 0, 18)
# 12000000000000000000L
# >>> _mul(sum, 20, 35L, 0, 0)
# 12L
# >>>

###################################################################
# Sample rounding functions.
###################################################################

# Round to minus infinity.

def _roundminf(top, bot):
    assert bot > 0
    return top / bot

# Round to plus infinity.

def _roundpinf(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    if r:
        q = q + 1
    return q

# IEEE nearest/even rounding (closest integer; in case of tie closest
# even integer).

def _roundne(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    c = cmp(r << 1, bot)
    # c < 0 <-> r < bot/2, etc
    if c > 0 or (c == 0 and (q & 1) == 1):
        q = q + 1
    return q

# "Add a half and chop" rounding (remainder < 1/2 toward 0; remainder
# >= half away from 0).

def _roundhalf(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    c = cmp(r << 1, bot)
    # c < 0 <-> r < bot/2, etc
    if c > 0 or (c == 0 and q >= 0):
        q = q + 1
    return q

# Round toward 0 (throw away remainder).

def _roundchop(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    if r and q < 0:
        q = q + 1
    return q

###################################################################
# Kernels for + - * /.
###################################################################

DEFAULT_ROUND = _roundne

def _add(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    # (n1/10**p1 + n2/10**p2) * 10**p ==
    # (n1*10**(max-p1) + n2*10**(max-p2))/10**max * 10**p
    max = p1    # until proven otherwise
    if p1 < p2:
        n1 = n1 * _tento(p2 - p1)
        max = p2
    elif p2 < p1:
        n2 = n2 * _tento(p1 - p2)
    n3 = n1 + n2
    p3 = p - max
    if p3 > 0:
        n3 = n3 * _tento(p3)
    elif p3 < 0:
        n3 = round(n3, _tento(-p3))
    return n3

def _sub(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    return _add(n1, p1, -n2, p2, p, round)

def _mul(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    # (n1/10**p1 * n2/10**p2) * 10**p ==
    # (n1*n2)/10**(p1+p2) * 10**p
    n3 = n1 * n2
    p3 = p - p1 - p2
    if p3 > 0:
        n3 = n3 * _tento(p3)
    elif p3 < 0:
        n3 = round(n3, _tento(-p3))
    return n3

def _div(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    if n2 == 0:
        raise ZeroDivisionError("scaled integer")
    # ((n1/10**p1) / (n2/10**p2)) * 10**p ==
    # (n1/n2) * 10**(p2-p1+p)
    p3 = p2 - p1 + p
    if p3 > 0:
        n1 = n1 * _tento(p3)
    elif p3 < 0:
        n2 = n2 * _tento(-p3)
    if n2 < 0:
        n1 = -n1
        n2 = -n2
    return round(n1, n2)

def _tento(i, _cache={}):
    assert i >= 0
    try:
        return _cache[i]
    except KeyError:
        answer = _cache[i] = 10L ** i
        return answer
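
For readers who want to try the worked example from the header comment
today, the following is a condensed Python 3 sketch of the same scaled-
integer idea.  It is not the posted code: the names round_half_even,
rescale, fx_add, fx_mul and fx_div are made up here, plain ints replace
longs, and the power-of-ten cache is dropped for brevity.

```python
def round_half_even(top, bot):
    """Nearest integer to top/bot, ties going to the even neighbour (bot > 0)."""
    q, r = divmod(top, bot)        # exact value is q + r/bot, 0 <= r < bot
    twice = 2 * r
    if twice > bot or (twice == bot and q % 2 == 1):
        q += 1
    return q


def rescale(n, shift, rnd):
    """Multiply n by 10**shift, calling rnd only when digits must be dropped."""
    if shift >= 0:
        return n * 10 ** shift
    return rnd(n, 10 ** -shift)


def fx_add(n1, p1, n2, p2, p, rnd=round_half_even):
    top = max(p1, p2)              # align both inputs to the finer scale
    total = n1 * 10 ** (top - p1) + n2 * 10 ** (top - p2)
    return rescale(total, p - top, rnd)


def fx_mul(n1, p1, n2, p2, p, rnd=round_half_even):
    return rescale(n1 * n2, p - p1 - p2, rnd)


def fx_div(n1, p1, n2, p2, p, rnd=round_half_even):
    if n2 == 0:
        raise ZeroDivisionError("scaled integer")
    shift = p2 - p1 + p
    if shift > 0:
        n1 *= 10 ** shift
    elif shift < 0:
        n2 *= 10 ** -shift
    if n2 < 0:                     # the rounding function expects bot > 0
        n1, n2 = -n1, -n2
    return rnd(n1, n2)


r7 = fx_div(1, 0, 7, 0, 20)                # 1/7 to 20 places
r5 = fx_div(1, 0, 5, 0, 20)                # 1/5 to 20 places
total = fx_add(r7, 20, r5, 20, 20)         # about 12/35
print(r7, r5, total)
print(fx_mul(total, 20, 35, 0, 18), fx_mul(total, 20, 35, 0, 0))
```

Passing a different two-argument function as rnd swaps the rounding
discipline without touching the arithmetic, which is the point of keeping
the kernels separate from the policy.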


From fredrik at pythonware.com  Thu Dec 30 12:05:45 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 30 Dec 1999 12:05:45 +0100
Subject: [Python-Dev] Better text processing support in py2k?
References: <000001bf528c$5cbdb9a0$a02d153f@tim>
Message-ID: <012d01bf52b5$d6a3cb00$f29b12c2@secret.pythonware.com>

Tim Peters is back from his vacation:
> > While I don't want to turn Python into Perl, I would like to see
> > it do a better job of what most people probably use the language
> > for.  Here is a very short list of things I think need attention:
> >
> >     1. [*A* clear way to do memory- and time-efficient textfile
> >         input]
> 
> I agree, but unsure how to fix it.  The best way to write this now is
> 
>     # f is some open file object.
>     while 1:
>         lines = f.readlines(BUFSIZE)
>         if not lines:
>             break
>         for line in lines:
>             process(line)
> 
> and it's not something anyone figures out on their own -- or enjoys typing
> or explaining afterwards.
> 
> Perl gets its line-at-a-time speed by peeking and poking C FILE structs
> directly in compiler- and platform-specific ways -- ways that vendors
> *should* have done in their own fgets implementations, but almost never do.
> I have no idea whether it works well with Perl's nascent notions of
> threading, but in the absence of that "the system" doesn't know Perl is
> cheating (i.e., as far as libc+friends are concerned, Perl *is* reading one
> line at a time -- even mixing in C-level ungetc calls works (well, sometimes
> <0.1 wink -- they don't always peek and poke enough fields>)).
> 
> The Python QIO extension module is much easier to port but less compatible
> (it doesn't use stdio, so QIO-opened files don't play well with others) and
> slower (although that's likely repairable -- he's got two passes over the
> buffer where one hairier pass should suffice).

we have something called SIO which uses memory mapping
where possible, and just a more aggressive read-ahead for
other cases.  on a windows box, a traditional while/readline
loop runs 3-5 times faster than before.  with SRE instead of
re, a while/readline/match loop runs up to 10 times faster
than before.

note that this is without *any* changes to the Python
source code...
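
For reference, the chunked readlines loop quoted above can be tried with
the following self-contained Python 3 sketch.  It says nothing about SIO or
QIO; the wrapper name process_lines is hypothetical, the tiny BUFSIZE is
chosen only so the demo needs several chunks, and an in-memory StringIO
stands in for a real file.

```python
import io

BUFSIZE = 64  # deliberately small so the loop runs more than once


def process_lines(f, process, bufsize=BUFSIZE):
    """Read f in batches of roughly bufsize characters, line by line."""
    while True:
        lines = f.readlines(bufsize)   # a list of whole lines, or [] at EOF
        if not lines:
            break
        for line in lines:
            process(line)


text = "".join("line %d\n" % i for i in range(100))
seen = []
process_lines(io.StringIO(text), seen.append)
print(len(seen), repr(seen[0]), repr(seen[-1]))
```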

> >     2. The re module needs to be sped up, if not to catch up with
> >        Perl, then to catch up with the deprecated regex module.
> 
> The irony here is that the re engine is very often unboundedly faster than
> the regex engine -- provided you're chewing over large strings.  Some tests
> /F ran showed that the length-independent *overhead* of invoking re is about
> 10x higher than for regex.  Presumably the bulk of that is due to re.py,
> i.e. that you get to the re engine via going thru Python layers on your way
> in and out, while regex was pure C.

I've attached some old benchmarks.  I think the current code
base is a bit faster, but you get the idea.

> In any case, /F is working on a new engine (for Unicode), and I believe he
> has this all well in hand.

with a little luck, the new module will replace both pcre
and regex...

not to mention that it's fairly easy to write your own front-
end to the matching engine -- the expression parser and the
compiler are both written in good old python.

</F>

$ python sre_bench.py
          0     5    50   250  1000  5000 25000
----- ----- ----- ----- ----- ----- ----- -----
search for Python|Perl in Perl ->
sre8  0.007 0.008 0.010 0.010 0.020 0.073 0.349
sre16 0.007 0.007 0.008 0.010 0.020 0.075 0.353
re    0.097 0.097 0.101 0.103 0.118 0.175 0.480
regex 0.007 0.007 0.009 0.020 0.059 0.271 1.320

search for (Python|Perl) in Perl ->
sre8  0.007 0.007 0.007 0.010 0.020 0.074 0.344
sre16 0.007 0.007 0.008 0.010 0.020 0.074 0.347
re    0.110 0.104 0.111 0.115 0.125 0.184 0.559
regex 0.006 0.006 0.009 0.019 0.057 0.285 1.432

search for Python in Python ->
sre8  0.007 0.007 0.007 0.011 0.021 0.072 0.387
sre16 0.007 0.007 0.008 0.010 0.022 0.082 0.365
re    0.107 0.097 0.105 0.102 0.118 0.175 0.511
regex 0.009 0.008 0.010 0.018 0.036 0.139 0.708

search for .*Python in Python ->
sre8  0.008 0.007 0.008 0.011 0.021 0.079 0.379
sre16 0.008 0.008 0.008 0.011 0.022 0.075 0.402
re    0.102 0.108 0.119 0.183 0.400 1.545 7.284
regex 0.013 0.019 0.072 0.318 1.231 8.035 45.366

search for .*Python.* in Python ->
sre8  0.008 0.008 0.008 0.011 0.021 0.080 0.383
sre16 0.008 0.008 0.008 0.011 0.021 0.079 0.395
re    0.103 0.108 0.119 0.184 0.418 1.685 8.378
regex 0.013 0.020 0.073 0.326 1.264 9.961 46.511

search for .*(Python) in Python ->
sre8  0.007 0.008 0.008 0.011 0.021 0.077 0.378
sre16 0.007 0.008 0.008 0.011 0.021 0.077 0.444
re    0.108 0.107 0.134 0.240 0.637 2.765 13.395
regex 0.026 0.112 3.820 87.322 (skipped)

search for .*P.*y.*t.*h.*o.*n.* in Python ->
sre8  0.010 0.010 0.014 0.031 0.093 0.419 2.212
sre16 0.010 0.011 0.014 0.030 0.093 0.419 2.292
re    0.112 0.121 0.195 0.521 1.747 8.298 40.877
regex 0.026 0.048 0.248 1.148 4.550 24.720 ...

(searching for patterns in padded strings; sre8
is the sre engine compiled for 8-bit characters,
sre16 is the same engine compiled for 16-bit
characters)


From mal at lemburg.com  Thu Dec 30 12:52:50 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 30 Dec 1999 12:52:50 +0100
Subject: [Python-Dev] Better text processing support in py2k?
References: <000001bf528c$5cbdb9a0$a02d153f@tim>
Message-ID: <386B4792.A551022A@lemburg.com>

Tim Peters wrote:
> 
> [Skip Montanaro, wants nicer text facilities]
> > While I don't want to turn Python into Perl, I would like to see
> > it do a better job of what most people probably use the language
> > for.  Here is a very short list of things I think need attention:
> >
> >     1. [*A* clear way to do memory- and time-efficient textfile
> >         input]
>
> ...
> 
> The Python QIO extension module is much easier to port but less compatible
> (it doesn't use stdio, so QIO-opened files don't play well with others) and
> slower (although that's likely repairable -- he's got two passes over the
> buffer where one hairier pass should suffice).

What is QIO ?
 
> > Depending how far people want to go with things, adding some
> > language syntax to support regular expressions might be in order.
> > ...
> >     3. I've not yet used it, but I am told the pattern matching in
> >        Marc-Andre Lemburg's mxTextTools
> >       (http://starship.python.net/crew/lemburg/)
> >        is both powerful and efficient (though it certainly appears
> >        complex).  Perhaps it deserves consideration for
> >        incorporation into the core Python distribution.
> 
> It's not complex, it's complicated -- and *that's* what makes it un-Pythonic
> <wink>.  Tony Ibbs has written a friendly wrapper around mxTextTools that
> suppresses much of the non-essential complication.  OTOH, if you go into
> this with a regexp mindset, it will run much slower than a real regexp
> package, because the bulk of the latter is devoted to doing optimization;
> mxTextTools is WYSIWYG (it screams if you code to its strengths, but crawls
> if you e.g. try to implement naive backtracking).

All true. mxTextTools provides the tools, not the magic. But this
is also its strength: you can optimize the hell out of your particular
parsing requirement without having to think about how the RE optimizer
works.

> You should go to the REBOL site and look at the description of REBOL's PARSE
> verb in the FAQ ... mumble, mumble ... at
> 
>     http://www.rebol.com/faq.html#11550948
> 
> Here's an example pulled from that page (this is a REBOL code fragment):
> 
>     digit: charset "0123456789"
>     expr: [term ["+" | "-"] expr | term]
>     term: [factor ["*" | "/"] term | factor]
>     factor: [primary "**" factor | primary]
>     primary: [value | "(" expr ")"]
>     value: [digit value | digit]
> 
>     parse "1 + 2 ** 9" expr
> 
> There hasn't been a pattern scheme this clean, convenient or powerful since
> SNOBOL4.  It exploits REBOL's Forth-like (lack of!) syntax, and
> Smalltalk-like penchant for passing around thunks (anonymous closures --
> "[...]" in REBOL builds a lexically-scoped entity called "a block", which
> can be treated as code (executed) or data (manipulated like a Python list)
> at will).

Looks nice indeed, but how does executable code fit into
that definition ? (mxTextTools allows you to write your own
parsing elements in Python, BTW; it should be possible to
use those mechanisms to achieve a similar integration.)
 
> ...
>
> BTW, the mxTextTools engine could be used to get blazing implementations of
> the primary Searcher methods (it excels at simple analysis).  OTOH, making
> lots of calls to analyze short strings is slow.

That's why mxTextTools converts these search idioms into byte codes
which it executes at C level. Some future version will even "precompile"
the tuple input and then omit the type checks during the search...
that should give another noticeable speedup. Note that recursion
etc. can be done at C level too -- Python function calls are not
needed.

> The only clean solutions to
> that are Perl's and Icon's (build everything into one language so the
> compiler can optimize stuff away), and REBOL's (make no distinction between
> code and data, so that code can be analyzed & optimized at runtime -- and
> build the entire implementation around making closures and calls
> supernaturally fast).

Just for kicks, here is the mysplit() function using mxTextTools:

from mx.TextTools import *

table = (
    # Match all whitespace
    (None,AllInSet,whitespace_set,+1),
    # Match and tag all non-whitespace
    ('text',AllInSet + AppendMatch,nonwhitespace_set,+1),
    # Loop until EOF
    (None,EOF,Here,-2),
    )

def mysplit(text):

    return tag(text,table)[1]

The timings:
 mysplit: 5.84 sec.
 string.split: 3.62 sec.

Note that you can customize the above to split text at any
character set you like, not just whitespace... without
compiling or writing C code. The function mx.TextTools.setsplit()
provides this functionality as pure C function.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                            Get ready to party !
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From jim at interet.com  Thu Dec 30 15:21:36 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 30 Dec 1999 09:21:36 -0500
Subject: [Python-Dev] zipfile.py
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <386a582d.6762574@pipmail.dknet.dk>
Message-ID: <386B6A70.3C9A0042@interet.com>

Finn Bock wrote:
> 
> James C. Ahlstrom wrote:
> 
> >  ftp://ftp.interet.com/pub/pylib.html
> 
> I feel that it smells a bit too much like a tool and too little like a general
> programming api.

It was meant to be an API except for writepy(), which is clearly a tool.
 
> - It can only add disk files. The ability to write data to a zip entry through
>   a file-like object or from a string would make it more like an API, IMHO

I could add a method
     writestr(self, string, year, month, day, hour, minute, second, ...)
There are a lot of fields required which usually come from the file.

> -  Some kind of access to the TOC entry fields (date, size, compressed
>   size etc) also seems like a nice feature.

This access is provided directly by self.TOC, and the fields are
documented.

> - The data for an entry must be available in memory. Could be a problem
>   for huge files, but most likely not in practical use.

I agree, but adding loops will make it slower.  What do others think?
 
> I admit that I am fond of the api from java.util.zip.ZipFile and
> java.util.zip.ZipOutputStream.

I don't know this API.  If writestr() is not sufficient, what
API would you like?

JimA


From bckfnn at pipmail.dknet.dk  Thu Dec 30 20:14:14 1999
From: bckfnn at pipmail.dknet.dk (Finn Bock)
Date: Thu, 30 Dec 1999 19:14:14 GMT
Subject: [Python-Dev] zipfile.py
In-Reply-To: <386B6A70.3C9A0042@interet.com>
References: <Pine.LNX.4.10.9912141121110.16305-100000@nebula.lyra.org> <3857B97E.3684224F@interet.com> <386a582d.6762574@pipmail.dknet.dk> <386B6A70.3C9A0042@interet.com>
Message-ID: <386baec9.2867733@pipmail.dknet.dk>

[I wrote]

> - It can only add disk files. The ability to write data to a zip entry through
>   a file-like object or from a string would make it more like an API, IMHO

[JimA wrote]

>I could add a method
>     writestr(self, string, year, month, day, hour, minute, second, ...)
>There are a lot of fields required which usually come from the file.

Something like that seems fine to me. 

[I wrote]

> -  Some kind of access to the TOC entry fields (date, size, compressed
>   size etc) also seems like a nice feature.

[JimA answers]

>This access is provided directly by self.TOC, and the fields are
>documented.

Good enough. My bad, I was looking for getter methods. (me being a java dude)

[I wrote]

> I admit that I am fond of the api from java.util.zip.ZipFile and
> java.util.zip.ZipOutputStream.

[JimA asks]

>I don't know this API.  If writestr() is not sufficient, what
>API would you like?

This is only meant as a source of inspiration, certainly not as a request for
change. writestr would answer my complaint nicely. Below, only one ZipEntry can
be actively read or written to at a time. All the small details of performance
and implementation complexity are ignored. 

class ZipFile:
    def getEntry(self, name):
        ...
        self.activeentry = ZipEntry(name)
        return self.activeentry

class ZipEntry:
    # enough methods and fields to fake file-ness to casual users like me.
    def __init__(self, name): ...

    def write(self, str): ...
    def writelines(self, list): ...
    def read(self, size=None): ...
    def readlines(self, sizehint=-1): ...

    def seek(self, offset): ...
    def flush(self): ...
    def close(self): ...

    def getSize(self): ...
    def getCompressedSize(self): ...
    def getFlags(self): ...
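
For comparison only: the zipfile module in today's Python 3 standard
library covers the wishes discussed in this exchange, and the sketch below
uses it.  That is an assumption of the example, not a description of the
pylib code under review.  It writes an entry from a string, reads the entry
back through a file-like object, and looks at the size fields.  The archive
lives in memory and the entry name hello.txt is arbitrary.

```python
import io
import zipfile

buf = io.BytesIO()  # in-memory archive, so nothing touches the disk

# Write an entry straight from a string instead of from a disk file.
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("hello.txt", "hello, zip\n" * 100)

# Reopen the archive, inspect the entry's metadata, and read it like a file.
with zipfile.ZipFile(buf) as zf:
    info = zf.getinfo("hello.txt")
    with zf.open("hello.txt") as entry:
        first = entry.readline()

print(info.file_size, info.compress_size < info.file_size, first)
```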


regards,
finn


From tim_one at email.msn.com  Fri Dec 31 04:35:18 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 30 Dec 1999 22:35:18 -0500
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <386B4792.A551022A@lemburg.com>
Message-ID: <000001bf5340$0fb20300$e12d153f@tim>

[M.-A. Lemburg]
> What is QIO ?

See DejaNews (I don't save URLs).  "Quick" line-oriented text input adapted
from INN.  Someone rewrote that as a Python extension module.

>>     http://www.rebol.com/faq.html#11550948

> Looks nice indeed, but how does executable code fit into
> that definition ?

See the URL above I didn't save <wink>.  PARSE's "pattern" argument is a
block.  Blocks can be (& often are) nested.  Whether any given block is code
or data is all the same to REBOL, so passing nested code blocks in PARSE's
pattern argument is easy.  Because blocks are lexically scoped, assignments
(etc) inside a block are (well, can be) visible to its context; etc.  It's a
very Lispish approach.  REBOL is essentially Scheme under the covers, but
with syntax much more like Forth's (whitespace-separated strings of
arbitrary non-whitespace characters, with few pre-assigned meanings or
restrictions -- in fact, it's impossible for a compiler to determine where a
REBOL function call begins or ends!  can't be known until runtime).

> (mxTextTools allows you to write your own parsing elements
> in Python, BTW; it should be possible to use those mechanisms
> to achieve a similar integration.)

It can't capture the flavor -- although I don't know that it needs to
<wink>.  There's no distinction between "the pattern language" and "the
computational language" in REBOL or Icon, and it's hard to explain what a
maddening distinction that can be once you've lived without it.  mxTextTools
embedding would feel more like Icon, where the matching engine is fully
exposed to the programmer (REBOL hides it, allowing only "approved"
interactions).

>> OTOH, making lots of calls to analyze short strings is slow.

> That's why mxTextTools converts these search idioms into byte
> codes which it executes at C level. Some future version will
> even "precompile" the tuple input and then omit the type checks
> during the search...that should give another noticeable speedup.
> Note that recursion etc. can be done at C level too -- Python
> function calls are not needed.

That's also the curse of having distinct languages; e.g., Python already had
recursion, but you needed to reimplement it in a different way with
different syntax and different rules in your pattern language.  In Icon etc,
there's no difference between a recursive pattern and a recursive function,
except in *what* it computes.  The machinery is all the same, and both more
powerful and easier to learn because of that.
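
To make "a recursive pattern is just a recursive function" concrete, here
is a Python 3 sketch that writes the REBOL grammar quoted earlier as plain
functions.  It is only an approximation: the combinators lit, seq and alt
and the driver parses are invented for this example, alternatives are tried
in order with simple fallback (which may not match PARSE in every corner),
and blanks are stripped up front instead of being handled by the matcher.

```python
DIGITS = frozenset("0123456789")


def lit(text):
    """Matcher for a literal string: returns the new position or None."""
    def match(s, i):
        return i + len(text) if s.startswith(text, i) else None
    return match


def seq(*parts):
    """Matcher that requires every part in turn."""
    def match(s, i):
        for part in parts:
            i = part(s, i)
            if i is None:
                return None
        return i
    return match


def alt(*choices):
    """Matcher that returns the first alternative that succeeds."""
    def match(s, i):
        for choice in choices:
            j = choice(s, i)
            if j is not None:
                return j
        return None
    return match


def digit(s, i):
    return i + 1 if i < len(s) and s[i] in DIGITS else None


# Each grammar rule is an ordinary, possibly recursive, Python function.
def value(s, i):
    return alt(seq(digit, value), digit)(s, i)


def primary(s, i):
    return alt(value, seq(lit("("), expr, lit(")")))(s, i)


def factor(s, i):
    return alt(seq(primary, lit("**"), factor), primary)(s, i)


def term(s, i):
    return alt(seq(factor, alt(lit("*"), lit("/")), term), factor)(s, i)


def expr(s, i):
    return alt(seq(term, alt(lit("+"), lit("-")), expr), term)(s, i)


def parses(text):
    """True if the whole text, blanks ignored, is an expr."""
    s = text.replace(" ", "")
    return expr(s, 0) == len(s)


print(parses("1 + 2 ** 9"), parses("(1+2)*3"), parses("1 +"))
```

Because the rules are ordinary functions, any of them can do arbitrary
computation along the way, which is the flavor a separate pattern language
has to reinvent.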

> ...
> Just for kicks, here is the mysplit() function using mxTextTools:
>
> from mx.TextTools import *
>
> table = (
>     # Match all whitespace
>     (None,AllInSet,whitespace_set,+1),
>     # Match and tag all non-whitespace
>     ('text',AllInSet + AppendMatch,nonwhitespace_set,+1),
>     # Loop until EOF
>     (None,EOF,Here,-2),
>     )
>
> def mysplit(text):
>
>     return tag(text,table)[1]
>
> The timings:
>  mysplit: 5.84 sec.
>  string.split: 3.62 sec.
>
> Note that you can customize the above to split text at any
> character set you like, not just whitespace... without
> compiling or writing C code.

That's equally true of the example I posted <wink>.  Now what if I wanted to
stop splitting right after I find a keyword, recognized as such because it's
a key in some passed-in dictionary?  In my example, I make an obvious local
code change, from

    while s.notmany(white):  # consume non-whitespace
        result.append(s.get_match())
        s.many(white)

to

    while s.notmany(white):  # consume non-whitespace
        word = s.get_match()
        result.append(word)
        if dictionary.has_key(word):
            break
        s.many(white)

What does it do to your example?  Or what if the target string isn't "a
string" (the code I posted only assumes the "str" object responds to
indexing and slicing -- any buffer object is fine -- so my example doesn't
change at all)?  Or what if you need to pass the tokens on as they're found,
pipeline style?  Etc.  This is why I do complex string processing in Icon
<0.9 wink>.
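
The keyword-stopping and pipeline variations can be tried with the
following self-contained Python 3 sketch.  MiniSearcher is a cut-down
stand-in written for this example, not the Searcher class posted earlier,
and words_until_keyword is a hypothetical name.  Tokens are yielded as they
are found, and the loop stops right after the first one that is a key in
the passed-in dictionary.

```python
class MiniSearcher:
    """Cut-down stand-in for Searcher: just many, notmany and get_match."""

    def __init__(self, s):
        self.s, self.i, self.last = s, 0, (None, None)

    def _scan(self, charset, want):
        # Advance over a run of items whose membership in charset == want.
        j = self.i
        while j < len(self.s) and (self.s[j] in charset) == want:
            j += 1
        if j == self.i:
            return False
        self.last, self.i = (self.i, j), j
        return True

    def many(self, charset):
        return self._scan(charset, True)

    def notmany(self, charset):
        return self._scan(charset, False)

    def get_match(self):
        i, j = self.last
        return self.s[i:j]


WHITE = frozenset(" \t\n\r\f\v")


def words_until_keyword(text, keywords):
    """Yield words one at a time, stopping after the first keyword."""
    s = MiniSearcher(text)
    s.many(WHITE)                      # skip leading whitespace
    while s.notmany(WHITE):            # consume non-whitespace
        word = s.get_match()
        yield word
        if word in keywords:
            break
        s.many(WHITE)


print(list(words_until_keyword("  a b stop c d", {"stop": 1})))
```

The change from a plain split is local to the loop body, which is the
property being argued for above.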

OTOH, at what it does well, mxTextTools runs quicker than Icon.  Its biggest
problem has always been that e.g. nobody knows what the hell

     (None,EOF,Here,-2),

*means* at first glance -- or third <wink>.

an-extreme-on-the-transparency-vs-speed-curve-ly y'rs  - tim


From mal at lemburg.com  Fri Dec 31 12:18:57 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 31 Dec 1999 12:18:57 +0100
Subject: [Python-Dev] Better text processing support in py2k?
References: <000001bf5340$0fb20300$e12d153f@tim>
Message-ID: <386C9121.E9D9DC01@lemburg.com>

Tim Peters wrote:
> 
> [M.-A. Lemburg]
> > What is QIO ?
> 
> See DejaNews (I don't save URLs).  "Quick" line-oriented text input adapted
> from INN.  Someone rewrote that as a Python extension module.

Ok, thanks.
 
> >>     http://www.rebol.com/faq.html#11550948
> 
> > Looks nice indeed, but how does executable code fit into
> > that definition ?
> 
> See the URL above I didn't save <wink>.  PARSE's "pattern" argument is a
> block.  Blocks can be (& often are) nested.  Whether any given block is code
> or data is all the same to REBOL, so passing nested code blocks in PARSE's
> pattern argument is easy.  Because blocks are lexically scoped, assignments
> (etc) inside a block are (well, can be) visible to its context; etc.  It's a
> very Lispish approach.  REBOL is essentially Scheme under the covers, but
> with syntax much more like Forth's (whitespace-separated strings of
> arbitrary non-whitespace characters, with few pre-assigned meanings or
> restrictions -- in fact, it's impossible for a compiler to determine where a
> REBOL function call begins or ends!  can't be known until runtime).

If I understand the concept correctly, I think Python could do
pretty much the same thing. The bummer is of course the need
for new keywords and byte codes (although these could be
split out into a separate text scanning engine). Using Python
function calls would slow down things to an extent that would
render the added functionality useless, well IMHO anyways ;-)

> > (mxTextTools allows you to write your own parsing elements
> > in Python, BTW; it should be possible to use those mechanisms
> > to achieve a similar integration.)
> 
> It can't capture the flavor -- although I don't know that it needs to
> <wink>.  There's no distinction between "the pattern language" and "the
> computational language" in REBOL or Icon, and it's hard to explain what a
> maddening distinction that can be once you've lived without it.  mxTextTools
> embedding would feel more like Icon, where the matching engine is fully
> exposed to the programmer (REBOL hides it, allowing only "approved"
> interactions).

Of course it's hard for a Turing Machine to capture the flavor
of any high level language :-) When you're programming
the mxTextTools Tagging Engine directly you feel like writing
assembler... but things are moving in the right direction:
Tony Ibbs has a nice meta-language and M.C. Fletcher his
SimpleParse to cover up these insufficiencies.
 
> >> OTOH, making lots of calls to analyze short strings is slow.
> 
> > That's why mxTextTools converts these search idioms into byte
> > codes which it executes at C level. Some future version will
> > even "precompile" the tuple input and then omit the type checks
> > during the search...that should give another noticeable speedup.
> > Note that recursion etc. can be done at C level too -- Python
> > function calls are not needed.
> 
> That's also the curse of having distinct languages; e.g., Python already had
> recursion, but you needed to reimplement it in a different way with
> different syntax and different rules in your pattern language.  In Icon etc,
> there's no difference between a recursive pattern and a recursive function,
> except in *what* it computes.  The machinery is all the same, and both more
> powerful and easier to learn because of that.

Agreed.
 
> > ...
> > Just for kicks, here is the mysplit() function using mxTextTools:
> >
> > from mx.TextTools import *
> >
> > table = (
> >     # Match all whitespace
> >     (None,AllInSet,whitespace_set,+1),
> >     # Match and tag all non-whitespace
> >     ('text',AllInSet + AppendMatch,nonwhitespace_set,+1),
> >     # Loop until EOF
> >     (None,EOF,Here,-2),
> >     )
> >
> > def mysplit(text):
> >
> >     return tag(text,table)[1]
> >
> > The timings:
> >  mysplit: 5.84 sec.
> >  string.split: 3.62 sec.
> >
> > Note that you can customize the above to split text at any
> > character set you like, not just whitespace... without
> > compiling or writing C code.
> 
> That's equally true of the example I posted <wink>.  Now what if I wanted to
> stop splitting right after I find a keyword, recognized as such because it's
> a key in some passed-in dictionary?  In my example, I make an obvious local
> code change, from
> 
>     while s.notmany(white):  # consume non-whitespace
>         result.append(s.get_match())
>         s.many(white)
> 
> to
> 
>     while s.notmany(white):  # consume non-whitespace
>         word = s.get_match()
>         result.append(word)
>         if dictionary.has_key(word):
>             break
>         s.many(white)
> 
> What does it do to your example? 

You'd replace the 'text' tagobj with a callable object and
write AllInSet + CallTag as command. The Tagging Engine will
then call the object with arguments (taglist,text,l,r,subtags)
and let it decide what to do.

In your example it would check the dictionary and raise an
exception in case a keyword is found to stop any further
scanning. If it's not a keyword, it would simply append
the found string to the taglist and return None.

Here's the code:

from mx.TextTools import *

import exceptions

stoplist = {'abc':1, 'def':1}

class KeywordFound(exceptions.StandardError):
    def __init__(self, taglist):
        self.taglist = taglist

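def callable(taglist,text,l,r,subtags):

    # Called by the Tagging Engine for each non-whitespace match text[l:r].
    taglist.append(text[l:r])
    # Stop scanning at the first keyword (the keyword itself stays in the list).
    if stoplist.has_key(text[l:r]):
        raise KeywordFound(taglist)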

table = (
    # Match all whitespace
    (None,AllInSet,whitespace_set,+1),
    # Match and tag all non-whitespace
    (callable,AllInSet + CallTag,nonwhitespace_set,+1),
    # Loop until EOF
    (None,EOF,Here,-2),
    )

def mysplitex(text):

    try:
        return tag(text,table)[1]
    except KeywordFound,data:
        return data.taglist

> Or what if the target string isn't "a
> string" (the code I posted only assumes the "str" object responds to
> indexing and slicing -- any buffer object is fine -- so my example doesn't
> change at all)? 

The current version only handles string objects, but I am
already beginning to convert all the APIs in mxTextTools to
"s#" or "t#" style (can't decide which to use... "s#" is great
for processing raw data, while "t#" more closely refers to
text processing).

> Or what if you need to pass the tokens on as they're found,
> pipeline style?  Etc.  This is why I do complex string processing in Icon
> <0.9 wink>.

You can have all that extra magic via callable tag objects
or callable matching functions. It's not exactly nice to
write, but I'm sure that a meta-language could do the 
conversions for you.
 
> OTOH, at what it does well, mxTextTools runs quicker than Icon.  Its biggest
> problem has always been that e.g. nobody knows what the hell
> 
>      (None,EOF,Here,-2),
> 
> *means* at first glance -- or third <wink>.

The structure of those tag tables is very simple:

(tagobject, command, argument[, jump offset in case of failure
                             [, jump offset in case of success]])
                               
Please remember that this is byte code, not some higher level
abstraction. The design is very much inverted from what you'd
usually do: design a nice language and then try to find a suitable
set of byte codes to make it work as intended.
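
To make the jump offsets concrete, here is a toy interpreter for tables
of that shape, written in present-day Python. It is only a sketch under
stated assumptions: the command names (ALL_IN_SET, AT_EOF), the use of
predicate functions as arguments, and the defaults (failure offset 0
meaning "give up", success offset +1) are invented for illustration and
are not the Tagging Engine's actual implementation.

```python
# Toy model of a tag-table interpreter. All names and semantics here are
# assumptions for illustration, not the mxTextTools implementation.
ALL_IN_SET = "all_in_set"   # match one or more characters accepted by a predicate
AT_EOF = "eof"              # succeed only at the end of the text

WHITESPACE = set(" \t\n\r")


def is_white(ch):
    return ch in WHITESPACE


def not_white(ch):
    return ch not in WHITESPACE


def run_table(text, table):
    """Run a tag table over text and return the list of tagged substrings."""
    taglist = []
    pos = 0      # current position in the text
    index = 0    # current entry in the table
    while 0 <= index < len(table):
        entry = table[index]
        tagobj, command, argument = entry[:3]
        on_failure = entry[3] if len(entry) > 3 else 0   # 0 means "give up"
        on_success = entry[4] if len(entry) > 4 else 1
        start = pos
        if command == ALL_IN_SET:
            while pos < len(text) and argument(text[pos]):
                pos += 1
            matched = pos > start
        elif command == AT_EOF:
            matched = pos >= len(text)
        else:
            raise ValueError("unknown command: %r" % (command,))
        if matched:
            if tagobj is not None:
                taglist.append(text[start:pos])   # tag the matched slice
            index += on_success
        else:
            if on_failure == 0:
                raise ValueError("no match at position %d" % pos)
            index += on_failure
    return taglist


table = (
    (None, ALL_IN_SET, is_white, +1),      # skip whitespace; move on either way
    ("text", ALL_IN_SET, not_white, +1),   # tag a run of non-whitespace
    (None, AT_EOF, None, -2),              # not at EOF yet: jump back two entries
)

words = run_table("  a bc  d ", table)
```

Read this way, the last entry says: test for end of text, and on
failure jump back two entries to the whitespace matcher; on success the
default offset of +1 runs off the end of the table, which ends the scan.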

Anyway, I'll keep focussing on the speed aspect of mxTextTools;
others can focus on abstractions, so that eventually everybody
will be happy :-)

Happy New Year,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                            Get ready to party !
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From tim_one at email.msn.com  Fri Dec 31 23:53:49 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Fri, 31 Dec 1999 17:53:49 -0500
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <012d01bf52b5$d6a3cb00$f29b12c2@secret.pythonware.com>
Message-ID: <000701bf53e1$e7119760$472d153f@tim>

[Fredrik Lundh, whose very nice eMatter book is on sale until
  the end of the 20th century (as real people think of it),
  although the eMatter distribution scheme has lots of problems
  [just an editorial note from a bot who has to-- for unknown
   reasons Fatbrain "is working on" --delete the Fatbrain
   registry tree and reregister the book almost every time he
   tries to open it <wink>
  ]
]

> we have something called SIO which uses memory mapping
> where possible, and just a more aggressive read-ahead for
> other cases.  on a windows box, a traditional while/readline
> loop runs 3-5 times faster than before.  with SRE instead of
> re, a while/readline/match loop runs up to 10 times faster
> than before.
>
> note that this is without *any* changes to the Python
> source code...

If so, there's potential for significantly more speed.  Python does its
line-at-a-time input with a character-at-a-time macro-in-a-loop, the same
way naive vendors (read "almost all vendors") implement fgets.  It's
replacing that inner loop with direct peeking into the FILE buffer that gets
Perl its dramatic speed -- despite that Perl has fancier input functionality
(the oft-requested automagical "input record separator").  So it sounds like
the Perl trick is orthogonal to SIO's tricks; Perl isn't doing mmaps or
read-aheads or anything else fancy under the covers -- it only optimizes the
inner loop!
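
The difference between the two loop shapes can be sketched in
present-day Python. This is only an illustration of the idea, not
Perl's or Python's actual C code: the function names are made up, and
io.BufferedReader.peek() stands in for "direct peeking into the FILE
buffer".

```python
import io


def readline_char_at_a_time(stream):
    """Build one line by fetching a single byte per call (the slow shape)."""
    chunks = []
    while True:
        c = stream.read(1)
        if not c:              # end of input
            break
        chunks.append(c)
        if c == b"\n":
            break
    return b"".join(chunks)


def readline_buffer_scan(stream):
    """Find the newline inside the buffered data, then take the line in one step."""
    parts = []
    while True:
        buffered = stream.peek()       # look at buffered bytes without consuming
        if not buffered:               # end of input
            break
        end = buffered.find(b"\n")
        if end >= 0:
            parts.append(stream.read(end + 1))    # consume through the newline
            break
        parts.append(stream.read(len(buffered)))  # no newline yet: take it all
    return b"".join(parts)


data = b"one\ntwo\nthree"
slow = io.BufferedReader(io.BytesIO(data))
fast = io.BufferedReader(io.BytesIO(data))
slow_lines = [readline_char_at_a_time(slow) for _ in range(4)]
fast_lines = [readline_buffer_scan(fast) for _ in range(4)]
```

Both functions return the same lines; the second does one search per
buffer refill instead of one call per character.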

> ...
> with a little luck, the new module will replace both pcre
> and regex...

If something more tangible than luck would help to make this come true, feel
free to mention it <wink>.

> not to mention that it's fairly easy to write your own front-
> end to the matching engine -- the expression parser and the
> compiler are both written in good old python.

Ah, good news / bad news.  Perl refugees aren't accustomed to "precompiling"
regexp objects, so write code that will cause regexps to get recompiled over
& over.  Even if you cache the results under the covers, the overhead of the
Python call to the regexp compiler will likely take as long as the engine
takes to search.
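
A small sketch of the two habits, in present-day Python with the
standard re module. The pattern and the function names are made up for
illustration, and no timings are claimed here.

```python
import re

# Compiled once, at import time, and reused for every line.
WORD = re.compile(r"[A-Za-z_]\w*")


def count_words_precompiled(lines):
    """Search each line with the already-compiled pattern object."""
    total = 0
    for line in lines:
        total += len(WORD.findall(line))
    return total


def count_words_recompiled(lines):
    """Pass the pattern string on every call, Perl-refugee style.

    Each call goes back through the module-level function, which has to
    look up (or rebuild) the compiled pattern before it can search.
    """
    total = 0
    for line in lines:
        total += len(re.findall(r"[A-Za-z_]\w*", line))
    return total


lines = ["foo bar", "x1 = y2"]
```

Both functions give the same answer; the difference is only in how much
work is repeated per line.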

Personally, in such cases, I think they should learn how to use the language
<0.5 wink>.


From tim_one at email.msn.com  Fri Dec 31 23:53:56 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Fri, 31 Dec 1999 17:53:56 -0500
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <386C9121.E9D9DC01@lemburg.com>
Message-ID: <000901bf53e1$eb4248c0$472d153f@tim>

>> This is why I do complex string processing in Icon <0.9 wink>.

[MAL]
> You can have all that extra magic via callable tag objects
> or callable matching functions. It's not exactly nice to
> write, but I'm sure that a meta-language could do the
> conversions for you.

That wasn't my point:  I do it in Icon because it *is* "exactly nice to
write", and doesn't require any yet-another meta-language.  It's all
straightforward, in a way that separate schemes pasted together can never be
(simply because they *are* "separate schemes pasted together" <wink>).

The point of my Python examples wasn't that they could do something
mxTextTools can't do, but that they were *Python* examples:  every variation
I mentioned (or that you're likely to think of) was easy to handle for any
Python programmer because the "control flow" and "data type" etc aspects
could be handled exactly the way they always are in *non* pattern-matching
Python code too, rather than recoded in pattern-scheme-specific different
ways (e.g., where I had a vanilla "if/break", you set up a special
exception to tickle the matching engine).
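
For readers without the earlier message at hand, here is a minimal
sketch of what such a scanner could look like in present-day Python.
The Scanner class below is a hypothetical reconstruction: only the
method names many, notmany and get_match come from the quoted loop, and
everything else (the internals, the split_until_keyword wrapper, the
default whitespace set) is assumed for illustration.

```python
class Scanner:
    """Hypothetical scanner over any object that supports indexing and slicing."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.match = None

    def _scan(self, wanted, charset):
        # Advance while membership in charset equals `wanted`.
        start = self.pos
        while (self.pos < len(self.text)
               and (self.text[self.pos] in charset) == wanted):
            self.pos += 1
        if self.pos == start:
            return False                      # nothing consumed
        self.match = self.text[start:self.pos]
        return True

    def many(self, charset):
        """Consume one or more characters that are in charset."""
        return self._scan(True, charset)

    def notmany(self, charset):
        """Consume one or more characters that are not in charset."""
        return self._scan(False, charset)

    def get_match(self):
        """Return the text consumed by the last successful scan."""
        return self.match


def split_until_keyword(text, keywords, white=" \t\n\r"):
    """Split on whitespace, stopping right after the first keyword."""
    s = Scanner(text)
    result = []
    s.many(white)                             # skip leading whitespace
    while s.notmany(white):                   # consume non-whitespace
        word = s.get_match()
        result.append(word)
        if word in keywords:                  # ordinary control flow
            break
        s.many(white)
    return result


tokens = split_until_keyword("  alpha beta def gamma", {"abc": 1, "def": 1})
```

The early exit is a plain "if/break" in the caller's own loop; the
scanner needs no special hook for it.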

I'm not attacking mxTextTools, so don't feel compelled to defend it --
people using regexps in those examples are dead in the water.  mxTextTools
is very good at what it does; if we have a real disagreement, it's probably
that I'm less optimistic about the prospects for higher-level wrappers
(e.g., MikeF's SimpleParse is much slower than "a real" BNF parsing system
(ARBNFPS), in part because he isn't doing all the optimizations ARBNFPS
does, but also in part because ARBNFPS uses an underlying engine more
optimized to its specific task than mxTextTools' more-general engine *can*
be).  So I don't see mxTextTools as being the answer to everything -- and if
you hadn't written it, you would agree with that on first glance <wink>.

> Anyway, I'll keep focussing on the speed aspect of mxTextTools;
> others can focus on abstractions, so that eventually everybody
> will be happy :-)

You and I will be, anyway <wink>.