From guido@CNRI.Reston.VA.US Wed Dec 1 17:32:08 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 01 Dec 1999 12:32:08 -0500 Subject: [Python-Dev] Another 1.6 wish In-Reply-To: Your message of "Fri, 19 Nov 1999 14:59:11 CST." <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com> References: <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com> Message-ID: <199912011732.MAA10419@eric.cnri.reston.va.us> > My first Python-Dev post. :-) Welcome! > >We had some discussion a while back about enabling thread support by > >default, if the underlying OS supports it obviously. I agree with this. MacOS seems to be the only OS without threads these days. > What's the consensus about Python microthreads -- a likely candidate > for incorporation in 1.6 (or later)? What are microthreads? If you think about threads implemented in the Python VM instead of in the OS, forget it. > Also, we have a couple minor convenience functions for Python in an > MSDEV environment, an exposure of OutputDebugString for writing to > the DevStudio log window and a means of tripping DevStudio C/C++ layer > breakpoints from Python code (currently experimental). The msvcrt > module seems like a likely candidate for these, would these be > welcome additions? Sure -- send patches. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From petrilli@amber.org Wed Dec 1 17:39:00 1999 From: petrilli@amber.org (Christopher Petrilli) Date: Wed, 1 Dec 1999 12:39:00 -0500 Subject: [Python-Dev] Another 1.6 wish In-Reply-To: <199912011732.MAA10419@eric.cnri.reston.va.us>; from guido@CNRI.Reston.VA.US on Wed, Dec 01, 1999 at 12:32:08PM -0500 References: <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com> <199912011732.MAA10419@eric.cnri.reston.va.us> Message-ID: <19991201123900.A7419@trump.amber.org> Guido van Rossum [guido@CNRI.Reston.VA.US] wrote: > > >We had some discussion a while back about enabling thread support by > > >default, if the underlying OS supports it obviously. > > I agree with this. MacOS seems to be the only OS without threads > these days. I believe the new GUISI package has pthread-API compatible threads implemented, which talk to the underlying ThreadManager. With MacOSX being impending before 1.6 (i.e. early 2000), I'd say this is a good way to go. Threads are VERY useful for a lot of problem domains. Chris -- | Christopher Petrilli | petrilli@amber.org From guido@CNRI.Reston.VA.US Wed Dec 1 17:54:53 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 01 Dec 1999 12:54:53 -0500 Subject: [Python-Dev] Another 1.6 wish In-Reply-To: Your message of "Wed, 01 Dec 1999 12:39:00 EST." <19991201123900.A7419@trump.amber.org> References: <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com> <199912011732.MAA10419@eric.cnri.reston.va.us> <19991201123900.A7419@trump.amber.org> Message-ID: <199912011754.MAA10465@eric.cnri.reston.va.us> > > I agree with this. MacOS seems to be the only OS without threads > > these days. > > I believe the new GUISI package has pthread-API compatible threads > implemented, which talk to the underlying ThreadManager. With MacOSX > being impending before 1.6 (i.e. early 2000), I'd say this is a good > way to go. Threads are VERY useful for a lot of problem domains. 
What's GUISI? The son of GUSI? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Wed Dec 1 17:55:19 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 01 Dec 1999 12:55:19 -0500 Subject: [Python-Dev] Another 1.6 wish In-Reply-To: Your message of "Wed, 01 Dec 1999 12:32:08 EST." <199912011732.MAA10419@eric.cnri.reston.va.us> References: <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com> <199912011732.MAA10419@eric.cnri.reston.va.us> Message-ID: <199912011755.MAA10476@eric.cnri.reston.va.us> > > Also, we have a couple minor convenience functions for Python in an > > MSDEV environment, an exposure of OutputDebugString for writing to > > the DevStudio log window and a means of tripping DevStudio C/C++ layer > > breakpoints from Python code (currently experimental). The msvcrt > > module seems like a likely candidate for these, would these be > > welcome additions? > > Sure -- send patches. I hadn't seen Mark Hammond's response -- I take it back. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Wed Dec 1 18:15:26 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 01 Dec 1999 13:15:26 -0500 Subject: [Python-Dev] Another 1.6 wish In-Reply-To: Your message of "Sat, 20 Nov 1999 11:04:28 +1100." <005f01bf32ea$d0b82b90$0501a8c0@bobcat> References: <005f01bf32ea$d0b82b90$0501a8c0@bobcat> Message-ID: <199912011815.NAA10506@eric.cnri.reston.va.us> > This is really a pointer to the fact that some or all of the win32api > should be moved into the core - registry access is the thing people > most want, but there are plenty of other useful things that people > reguarly use... > > Guido objects to the coding style, but hopefully that wont be a big > issue. 
IMO, the coding style isnt "bad" - it is just more an "MS" > flavour than a "Python" flavour - presumably people reading the code > will have some experience with Windows, so it wont look completely > foreign to them. The good thing about taking it "as-is" is that it > has been fairly well bashed on over a few years, so is really quite > stable. The final "coding style" issue is that there are no "doc > strings" - all documentation is embedded in C comments, and extracted > using a tool called "autoduck" (similar to "autodoc"). However, Im > sure we can arrange something there, too. That's a good summary of the status quo. I would appreciate it if win32all could become part of the core. However the coding style issues need to be addressed (I also believe that it needs to be compiled in C++ mode). One concern that Mark doesn't mention is that there are some safety issues -- you can abuse some of the calls to cause segfaults, whether intentional or by mistake, and that's not a good thing. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Wed Dec 1 18:55:40 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 01 Dec 1999 13:55:40 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: Your message of "Wed, 24 Nov 1999 09:43:57 EST." <383BF9AD.E183FB98@interet.com> References: <383BF9AD.E183FB98@interet.com> Message-ID: <199912011855.NAA10662@eric.cnri.reston.va.us> > I would like to argue that on Windows, import of dynamic libraries is > broken. If a file something.pyd is imported, then sys.path is searched > to find the module. If a file something.dll is imported, the same thing > happens. But Windows defines its own search order for *.dll files which > Python ignores. I would suggest that this is wrong for files named > *.dll, > but OK for files named *.pyd. I think you misunderstand some of the issues. Python cannot import every .dll file. 
Only .dll files that conform to the convention for Python extension
modules can be imported. (The convention is that it must export an
init function.)

On most other platforms, shared libraries must have a specific
extension (e.g. .so on most Unix). Python allows you to drop such a
file into any directory where it looks for modules, and it will then
direct the dynamic load support to load that specific file.

This seems logical -- Python extensions must live in directories that
Python searches (Python must do its own search because the search
order is significant).

On Windows, Python uses the same strategy. The only modification is
that it is allowed to give the file a different extension, namely
.pyd, to indicate that this really is a Python extension and not a
regular DLL. This was mostly introduced because it is apparently
common to have an existing DLL "foo.dll" and write a Python wrapper
for it that is also called "foo". Clearly, two files foo.dll are too
confusing, so we let you name the wrapper foo.pyd. But because the
file format is essentially that of a DLL, we don't *require* this
renaming; some ways of creating DLLs in the first place may make it
difficult to do.

> A SysAdmin should be able to install and maintain *.dll as she has
> been trained to do. This makes maintaining Python installations
> simpler and more un-surprising.

I don't see that a SysAdmin needs to do much DLL management. This is
up to installer scripts. Anyway how hard can it be for a SysAdmin to
leave DLLs in specific directories alone?

> I have no solution to the backward compatibility problem. But the
> code is only a couple lines. A LoadLibrary() call does its own
> path searching.

But at what point should this LoadLibrary() call be called? The
import statement contains no clue that a DLL is requested -- the
sys.path search reveals that.

I claim that there is nothing wrong with the current strategy.
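
[Illustrative sketch, not part of the original message. It shows why
the position of a directory on the search path, rather than the kind
of file, decides what "import foo" picks up. It is written in
present-day Python, not the 1.5.2 of this thread; the names
find_module_file, SUFFIXES and demo_override, and the suffix order,
are assumptions made for the example and are not taken from the
interpreter's import code.]

```python
import os
import tempfile

# Hypothetical suffix list for illustration; the real interpreter's
# list and its ordering within a single directory may differ.
SUFFIXES = (".pyd", ".dll", ".py")


def find_module_file(name, search_path, suffixes=SUFFIXES):
    """Return the first file for module `name` found along `search_path`.

    Directories are tried in order, so an earlier directory always
    wins over a later one, whatever kind of file it holds.
    """
    for directory in search_path:
        for suffix in suffixes:
            candidate = os.path.join(directory, name + suffix)
            if os.path.isfile(candidate):
                return candidate
    return None


def demo_override():
    """Put foo.py in A and foo.dll in B, then search in both orders."""
    with tempfile.TemporaryDirectory() as root:
        dir_a = os.path.join(root, "A")
        dir_b = os.path.join(root, "B")
        os.mkdir(dir_a)
        os.mkdir(dir_b)
        # Empty placeholder files are enough to exercise the search.
        with open(os.path.join(dir_a, "foo.py"), "w"):
            pass
        with open(os.path.join(dir_b, "foo.dll"), "w"):
            pass
        a_first = os.path.basename(find_module_file("foo", [dir_a, dir_b]))
        b_first = os.path.basename(find_module_file("foo", [dir_b, dir_a]))
        return a_first, b_first
```

[With the path ordered A, B the search finds foo.py; with B, A it
finds foo.dll. A separate LoadLibrary() step that ran before or after
this loop could not offer both outcomes.]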
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Wed Dec 1 19:01:12 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw)
Date: Wed, 1 Dec 1999 14:01:12 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS log messages with diffs
References: <199911161700.MAA02716@eric.cnri.reston.va.us> <14389.31511.706588.20840@anthem.cnri.reston.va.us>
Message-ID: <14405.28792.184298.298597@anthem.cnri.reston.va.us>

>>>>> "BAW" == Barry A Warsaw writes:

    BAW> There was a suggestion to start augmenting the checkin emails
    BAW> to include the diffs of the checkin. This would let you keep
    BAW> a current snapshot of the tree without having to do a direct
    BAW> `cvs update'.

The voting has stopped, with the "yeah" vote slightly ahead of the
"nay" vote. We'll go with context diffs, and we'll be implementing
Greg Stein's approach with the xml-checkins list: truncating diffs to
H number of lines at the top and T number of lines at the bottom, so
as not to overwhelm incoming email.

I'll try to get this going sometime today (no promises). You'll
likely see a number of tests coming through python-checkins in the
meantime. I'll send a message out when it's done.

-Barry

From da@ski.org Wed Dec 1 19:34:56 1999
From: da@ski.org (David Ascher)
Date: Wed, 1 Dec 1999 11:34:56 -0800 (Pacific Standard Time)
Subject: [Python-Dev] Re: [C++-SIG] Python calling C++ issues
In-Reply-To: <14405.25141.297349.76968@gargle.gargle.HOWL>
Message-ID:

On Wed, 1 Dec 1999, Geoffrey Furnish wrote:
[...]
> Well, like I said above, I haven't analyzed your posts for technical
> details, so I can't say whether you made avoidable mistakes. But I
> definitely do agree with you that it is roughly 100 times harder than
> it needs to be, to use Python from C++. The charter of this sig is to
> fix that, by developing the additional software that would allow
> Python's compiled interface to be exploited from C++ "with ease".
> > The first and most basic issue, is compiling Python so it initializes > C++ global objects correctly. There is a patch on the sig's www site > to help with that. Any opinions from this esteemed body re: integrating said patch in the main tree? --david From jim@interet.com Wed Dec 1 19:47:14 1999 From: jim@interet.com (James C. Ahlstrom) Date: Wed, 01 Dec 1999 14:47:14 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <383BF9AD.E183FB98@interet.com> <199912011855.NAA10662@eric.cnri.reston.va.us> Message-ID: <38457B42.85552AC@interet.com> Guido van Rossum wrote: > > > I would like to argue that on Windows, import of dynamic libraries is > > broken. If a file something.pyd is imported, then sys.path is searched > > to find the module. If a file something.dll is imported, the same thing > > happens. But Windows defines its own search order for *.dll files which > > Python ignores. I would suggest that this is wrong for files named > > *.dll, > > but OK for files named *.pyd. > > I think you misunderstand some of the issues. > > Python cannot import every .dll file. Only .dll files that conform to > the convention for Python extension modules can be imported. (The > convention is that it must export an init function.) Of course I meant that the test is LoadLibrary(module) followed by GetProcAddress(h, "init" + module). Both must succeed. > This seems logical -- Python extensions must live in directories that > Python searches (Python must do its own search because the search > order is significant). The PYTHONPATH search path is what I am trying to get away from. If I eliminate PYTHONPATH I still can not use the Windows DLL search path (which is superior) because DLLs are searched on PYTHONPATH too; thus my post. I don't believe it is important for Python module.dll to be located on PYTHONPATH. > > A SysAdmin should be able to install and maintain *.dll as she has > > been trained to do. 
This makes maintaining Python installations > > simpler and more un-surprising. > > I don't see that a SysAdmin needs to do much DLL management. This is > up to installer scripts. Anyway how hard can it be for a SysAdmin to > leave DLLs in specific directories alone? The problem is maintaining PYTHONPATH plus having DLL's on a non-standard search path. Yes, PythonDev[:] and professional SysAdmins can do it. But it is not as simple as it could be. Someone has to write the install scripts. And what if something doesn't work? Think of Python being used as a teaching language for the 8th grade. Think of the 8th grade teacher trying to get all this right. The only thing that works is simplicity. > But at what point should this LoadLibrary() call be called? The > import statement contains no clue that a DLL is requested -- the > sys.path search reveals that. Just after built-in and frozen modules. > I claim that there is nothing with the current strategy. Thank you for thoughtfully considering and commenting at length on this issue. Lets ignore it for the moment. The other problems with PYTHONPATH are more pressing. But if those issues are solved, this one will stick out. JimA From da@ski.org Wed Dec 1 19:59:44 1999 From: da@ski.org (David Ascher) Date: Wed, 1 Dec 1999 11:59:44 -0800 (Pacific Standard Time) Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: <38457B42.85552AC@interet.com> Message-ID: On Wed, 1 Dec 1999, James C. Ahlstrom wrote: > > This seems logical -- Python extensions must live in directories that > > Python searches (Python must do its own search because the search > > order is significant). > > The PYTHONPATH search path is what I am trying to get away > from. If I eliminate PYTHONPATH I still can not use the > Windows DLL search path (which is superior) because DLLs > are searched on PYTHONPATH too; thus my post. I don't believe > it is important for Python module.dll to be located on PYTHONPATH. Why is the DLL search path superior? 
In my experience, the DLL search path (PATH for short) is problematic
because it requires either using the System control panel or modifying
autoexec.bat, both of which can have massive systemic effects completely
unrelated to Python if a mistake is made during the modification.

On UNIX, the equivalent to Windows' PATH is typically LD_LIBRARY_PATH,
although I think there are significant variations in how that works across
platforms. Most beginning unix users have no idea how to modify their
LD_LIBRARY_PATH, as they typically don't understand the configuration
mechanisms on Unix (system vs. user-specific, login vs. shell-specific,
different shell configuration languages, etc.).

I know it's not what you had in mind, but have you tried doing something
like:

  import sys, os, string
  sys.path.extend(string.split(os.environ['PATH'], ';'))

--david

From gmcm@hypernet.com Wed Dec 1 20:19:13 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Wed, 1 Dec 1999 15:19:13 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To:
References: <38457B42.85552AC@interet.com>
Message-ID: <1268042932-41354568@hypernet.com>

David Ascher wrote:

> On Wed, 1 Dec 1999, James C. Ahlstrom wrote:
>
> > > This seems logical -- Python extensions must live in
> > > directories that Python searches (Python must do its own
> > > search because the search order is significant).
> >
> > The PYTHONPATH search path is what I am trying to get away
> > from. If I eliminate PYTHONPATH I still can not use the
> > Windows DLL search path (which is superior) because DLLs are
> > searched on PYTHONPATH too; thus my post. I don't believe it
> > is important for Python module.dll to be located on PYTHONPATH.
>
> Why is the DLL search path superior?
>
> In my experience, the DLL search path (PATH for short)

Make that:
  [ os.path.dirname(sys.executable),
    os.getcwd(),
    win32api.GetSystemDirectory(),
    os.path.join(win32api.GetSystemDirectory(), '../SYSTEM'),
    win32api.GetWindowsDirectory()
  ] + string.split(os.environ['PATH'], ';')

> is
> problematic because it requires either using the System control
> panel or modifying autoexec.bat, both of which can have massive
> systemic effects completely unrelated to Python if a mistake is
> made during the modification.

Hear, hear!

[snip]

- Gordon

From jim@interet.com Wed Dec 1 20:36:04 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Wed, 01 Dec 1999 15:36:04 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References:
Message-ID: <384586B4.48905B32@interet.com>

David Ascher wrote:
> Why is the DLL search path superior?
>
> In my experience, the DLL search path (PATH for short) is problematic
> because it requires either using the System control panel or modifying
> autoexec.bat, both of which can have massive systemic effects completely
> unrelated to Python if a mistake is made during the modification.

I agree that altering PATH is problematic. So is altering PYTHONPATH
and for exactly the same reason. That is why I think PYTHONPATH is
a bad idea.

The reason the DLL search path is superior is that it is not just
PATH. It defines a path which includes the install directory of the
application plus the system directories, and this path is discovered
at runtime. So it is not necessary to set a global PYTHONPATH, nor
make registry entries, nor do anything at all. It Just Works.

The Windows DLL search path is:

1) The directory of the executable program.
   That means you can just throw all your DLL's in with the
   *.exe's, and it all Just Works.
2) The current directory. Also useful.
3) The Windows system directory (call GetSystemDirectory() to get this).
4) The Windows directory (call GetWindowsDirectory() to get this).
   These two directories are used for system files. Think of /sbin,
   /bin. Windows apps usually throw some of their DLL's here,
   especially if they are of general interest.
5) The directories in PATH. This is relatively useless, and AFAIK it
   is seldom used in a real installation. It is a left-over from DOS.
   That is also why it appears last.

> On UNIX, the equivalent to Windows' PATH is typically LD_LIBRARY_PATH,
> although I think there are significant variations in how that works across
> platforms. Most beginning unix users have no idea how to modify their
> LD_LIBRARY_PATH, as they typically don't understand the configuration
> mechanisms on Unix (system vs. user-specific, login vs. shell-specific,
> different shell configuration languages, etc.).

I agree.

>
> I know it's not what you had in mind, but have you tried doing something
> like:
>
> import sys, os, string
> sys.path.extend(string.split(os.environ['PATH'], ';'))

Adding PATH (or anything else) to PYTHONPATH is making it worse. Have
you tried "import sys; print sys.path" on Windows? It is junk.

JimA

From jim@interet.com Wed Dec 1 20:44:00 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Wed, 01 Dec 1999 15:44:00 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <38457B42.85552AC@interet.com> <1268042932-41354568@hypernet.com>
Message-ID: <38458890.BCB36FE2@interet.com>

Gordon McMillan wrote:
> Make that:
> [ os.path.dirname(sys.executable),
>   os.getcwd(),
>   win32api.GetSystemDirectory(),
>   os.path.join(win32api.GetSystemDirectory(), '../SYSTEM'),
>   win32api.GetWindowsDirectory()
> ] + string.split(os.environ['PATH'], ';')

Very nice! "../SYSTEM" needed on NT I guess.

JimA

From fredrik@pythonware.com Wed Dec 1 20:56:16 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 1 Dec 1999 21:56:16 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <384586B4.48905B32@interet.com>
Message-ID: <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com>

James C.
Ahlstrom wrote: > Adding PATH (or anything else) to PYTHONPATH is making it worse. Have > you tried "import sys; print sys.path" on Windows? It is junk. not on my machine. it would help if you stopped assuming that every- one have the same problems as you have. we've distributed several python apps on windows, and frankly, I don't understand what you're talking about. From jim@interet.com Wed Dec 1 21:26:37 1999 From: jim@interet.com (James C. Ahlstrom) Date: Wed, 01 Dec 1999 16:26:37 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> Message-ID: <3845928D.C0462322@interet.com> Fredrik Lundh wrote: > > you tried "import sys; print sys.path" on Windows? It is junk. > > not on my machine. On my Windows machine I get: ['', '.', 'N:/prd/winlease/vest', '.\\DLLs', '.\\lib', '.\\lib\\plat-win', '.\\lib\\lib-tk', 'f:\\bin'] PYTHONPATH is N:/prd/winlease/vest. os.path.dirname(sys.executable) is F:/bin. The others are junk. What do you get? Did you change sys.path from the default? > it would help if you stopped assuming that every- > one have the same problems as you have. we've > distributed several python apps on windows, and > frankly, I don't understand what you're talking > about. We distribute our app by freezing all *.py files into a DLL, and we don't set PYTHONPATH on the target machine. The files are located with the executable file and are found there. This works fine and we don't have a problem with it. It would help me a lot if you could describe how you distribute your app. Do you set PYTHONPATH on the target machine? JimA From da@ski.org Wed Dec 1 21:41:31 1999 From: da@ski.org (David Ascher) Date: Wed, 1 Dec 1999 13:41:31 -0800 (Pacific Standard Time) Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: <384586B4.48905B32@interet.com> Message-ID: On Wed, 1 Dec 1999, James C. 
Ahlstrom wrote:

> > In my experience, the DLL search path (PATH for short) is problematic
> > because it requires either using the System control panel or modifying
> > autoexec.bat, both of which can have massive systemic effects completely
> > unrelated to Python if a mistake is made during the modification.
>
> I agree that altering PATH is problematic. So is altering PYTHONPATH
> and for exactly the same reason. That is why I think PYTHONPATH is
> a bad idea.

I see. Thanks for the explanation. I didn't know the complete story of
the "Windows DLL search path". BTW, I think a huge difference b/w
PYTHONPATH and PATH is the system-wide nature of PATH, vs. the
Python-restricted nature of PYTHONPATH.

--david

From mhammond@skippinet.com.au Wed Dec 1 22:29:38 1999
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Thu, 2 Dec 1999 09:29:38 +1100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To:
Message-ID: <009c01bf3c4b$8f119090$0501a8c0@bobcat>

> I see. Thanks for the explanation. I didn't know the
> complete story of
> the "Windows DLL search path". BTW, I think a huge difference b/w
> PYTHONPATH and PATH is the system-wide nature of PATH, vs. the
> Python-restricted nature of PYTHONPATH.

And more to the point - and the critical distinction - is that
PYTHONPATH is actually specific to the Python _app_, not just Python
on the machine. Sure - the standard Python installation puts a
"default" PYTHONPATH suitable for general purpose development - but
any distributed application _can_ define their own PYTHONPATH that is
independent of any other Python systems or applications. People have
been doing this for years, including MS :-)

Sorry Jim, but count this as another vote against it - which isn't to
argue that the current system is perfect, simply (IMO) better than the
Windows path and DLL search order.

Mark.
From guido@CNRI.Reston.VA.US Wed Dec 1 23:00:21 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 01 Dec 1999 18:00:21 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: Your message of "Wed, 01 Dec 1999 16:26:37 EST." <3845928D.C0462322@interet.com> References: <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> Message-ID: <199912012300.SAA10861@eric.cnri.reston.va.us> > Fredrik Lundh wrote: > > > > you tried "import sys; print sys.path" on Windows? It is junk. > > > > not on my machine. > > On my Windows machine I get: > > ['', '.', 'N:/prd/winlease/vest', '.\\DLLs', '.\\lib', > '.\\lib\\plat-win', '.\\lib\\lib-tk', 'f:\\bin'] > > PYTHONPATH is N:/prd/winlease/vest. > os.path.dirname(sys.executable) is F:/bin. > The others are junk. What do you get? Did > you change sys.path from the default? You must not have used the standard Python installer; if you had used it you wouldn't have had this problem (and perhaps we wouldn't have had this discussion). The problem is that you apparently have installed python.exe in f:\bin. "Modern" Python versions execute some code at startup that comes up with a suitable value for sys.path; the Windows version of this code is in PC/getpathp.c -- I recommend that you study it. This code tries to find the Python install directory by looking for a "landmark" file relative to the executable path, and then adds a bunch of directory entries to the path relative to the install directory. If it fails, it defaults to "." for the install directory. The entries '.\\DLLs', '.\\lib', '.\\lib\\plat-win', '.\\lib\\lib-tk' are all a result of this failing. As long as this works, there is no need for the user (or anyone) to ever set the PYTHONPATH variable -- that variable is only needed to add directories in front of sys.path for stuff that getpathp.c doesn't know about (e.g. PIL, Numeric, etc.). 
With packagized versions of those modules, even that won't be necessary, because the packages will be dropped in the Python install directory (typically C:\Program Files\Python). I believe that most of your desire to get rid of PYTHONPATH comes from your insistence to bypass the default installer. There's probably a way to install your app in such a way that the getpathp.c algorithm actually succeeds? There's also a separate env variable, PYTHONHOME, which overrides the Python install directory; if getpathp.c sees that it is set, it will bypass the search relative to the executable's path. I take blame for not documenting all this well enough. However I wish you stopped criticizing the design -- I think the design is quite solid. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Wed Dec 1 23:09:43 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 01 Dec 1999 18:09:43 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: Your message of "Wed, 01 Dec 1999 14:47:14 EST." <38457B42.85552AC@interet.com> References: <383BF9AD.E183FB98@interet.com> <199912011855.NAA10662@eric.cnri.reston.va.us> <38457B42.85552AC@interet.com> Message-ID: <199912012309.SAA10873@eric.cnri.reston.va.us> > > This seems logical -- Python extensions must live in directories that > > Python searches (Python must do its own search because the search > > order is significant). > > The PYTHONPATH search path is what I am trying to get away > from. If I eliminate PYTHONPATH I still can not use the > Windows DLL search path (which is superior) because DLLs > are searched on PYTHONPATH too; thus my post. I don't believe > it is important for Python module.dll to be located on PYTHONPATH. But I do. First of all, I'm not sure whether you're talking here about sys.path or PYTHONPATH. As I explained in a previous post, you should normally not have to set PYTHONPATH at all. Let's assume you really meant sys.path. 
Let's assume sys.path is [A, B]. Let's assume there's a foo.py and a foo.dll. If foo.py lives in A and foo.dll lives in B, then import foo should load foo.py. If it's the other way around, it should load foo.dll. If we were to use the default DLL search path, there's no way that we can get this behavior: either you have to look for a DLL first, which means there's no way for foo.py to override foo.dll, or you have to look for a DLL last, and then there's no way for a foo.dll to override foo.py. It is desirable that both overrides are possible: we want to be able to have foo.dll override foo.py, because perhaps foo.py should only be used when for some reason foo.dll can't be loaded (say foo.py does the same thing only slower); but we also want to be able to have foo.py override foo.dll (by simply placing it in a directory that's earlier on the path) e.g. in a situation where the dll version does something undesirable and we want to create a safe substitute. (Deleting files is not always an option.) > The problem is maintaining PYTHONPATH plus having DLL's on a > non-standard search path. I've commented already that PYTHONPATH maintenance is probably a red herring due to your non-standard install. I'm not sure what the problem is with having a DLL on a non-std path? > Yes, PythonDev[:] and professional > SysAdmins can do it. But it is not as simple as it could be. > Someone has to write the install scripts. The distutil-sig (a.k.a. Greg Ward :-) is taking care of this as we speak. > And what if something > doesn't work? Think of Python being used as a teaching language > for the 8th grade. Think of the 8th grade teacher trying to get > all this right. The only thing that works is simplicity. We will provide an installer that Just Works [tm]. > > But at what point should this LoadLibrary() call be called? The > > import statement contains no clue that a DLL is requested -- the > > sys.path search reveals that. > > Just after built-in and frozen modules. 
See my long comment above. > > I claim that there is nothing with the current strategy. > > Thank you for thoughtfully considering and commenting at length > on this issue. Lets ignore it for the moment. The other > problems with PYTHONPATH are more pressing. But if those > issues are solved, this one will stick out. And those other issues should be resolved in a different way than what you have been proposing. See other post. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Wed Dec 1 23:11:28 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 01 Dec 1999 18:11:28 -0500 Subject: [Python-Dev] Re: [C++-SIG] Python calling C++ issues In-Reply-To: Your message of "Wed, 01 Dec 1999 11:34:56 PST." References: Message-ID: <199912012311.SAA10888@eric.cnri.reston.va.us> > > The first and most basic issue, is compiling Python so it initializes > > C++ global objects correctly. There is a patch on the sig's www site > > to help with that. > > Any opinions from this esteemed body re: integrating said patch in the > main tree? I presume you meant me :-) I'll give it a try tonight. --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@cnri.reston.va.us Wed Dec 1 23:24:06 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Wed, 1 Dec 1999 18:24:06 -0500 (EST) Subject: [Python-Dev] copies of python-dev messages between 11/24 and 12/01 Message-ID: <14405.44566.832799.96438@goon.cnri.reston.va.us> It looks like there has been some mail glitch that result in no digests being sent between 11/26 and 12/01 and no messages being archived between 11/24 and 12/01. Does anyone keep a personal archive that has those messages? I'd like to read them. 
Jeremy

From guido@CNRI.Reston.VA.US Wed Dec 1 23:28:14 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 18:28:14 -0500
Subject: [Python-Dev] copies of python-dev messages between 11/24 and 12/01
In-Reply-To: Your message of "Wed, 01 Dec 1999 18:24:06 EST." <14405.44566.832799.96438@goon.cnri.reston.va.us>
References: <14405.44566.832799.96438@goon.cnri.reston.va.us>
Message-ID: <199912012328.SAA12879@eric.cnri.reston.va.us>

> It looks like there has been some mail glitch that result in no
> digests being sent between 11/26 and 12/01 and no messages being
> archived between 11/24 and 12/01. Does anyone keep a personal archive
> that has those messages? I'd like to read them.

I do :-) I'll provide Jeremy with an archive.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Thu Dec 2 04:24:03 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw)
Date: Wed, 1 Dec 1999 23:24:03 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS log messages with diffs
References: <199911161700.MAA02716@eric.cnri.reston.va.us> <14389.31511.706588.20840@anthem.cnri.reston.va.us>
Message-ID: <14405.62563.345566.500106@anthem.cnri.reston.va.us>

Okay folks, I think I've got the diff thing working now. The trick
(for you CVS heads) was that you can't do a `cvs diff' while you're
executing a loginfo script. Lock contention (repeat after me: "I Love
CVS!").

Anyway, let's see how you all like it. Note that based on a
suggestion by Greg Stein, seconded by GvR, I do not send out the
entire diff of every file (which could potentially be huge). I send
out 20 lines from the head of the diff and 20 lines from the tail, and
suppress everything in between. Those numbers can be easily tweaked,
and I'm not sure what the ideal is. Let's see what the emails look
like when stuff starts getting checked in.
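
[Illustrative sketch, not part of the original message and not the
actual loginfo script. It shows one way the head-and-tail truncation
described above could work, with the 20/20 defaults taken from the
message. It is written in present-day Python; the function name
truncate_diff and the wording of the suppression marker are
assumptions made for the example.]

```python
def truncate_diff(diff_text, head=20, tail=20):
    """Keep the first `head` and last `tail` lines of a diff.

    Diffs that already fit within head + tail lines are returned
    unchanged. Otherwise the middle is replaced by a single marker
    line (hypothetical wording) saying how many lines were dropped.
    """
    lines = diff_text.splitlines()
    if len(lines) <= head + tail:
        return diff_text
    omitted = len(lines) - head - tail
    marker = "[...%d lines suppressed...]" % omitted
    # Slice from an explicit index so that tail=0 keeps nothing,
    # rather than lines[-0:] keeping everything.
    kept = lines[:head] + [marker] + lines[len(lines) - tail:]
    return "\n".join(kept) + "\n"
```

[Both limits are ordinary keyword arguments, so tweaking the numbers
is a one-line change at the call site, e.g.
truncate_diff(text, head=50, tail=10).]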
Enjoy, -Barry From jack@oratrix.nl Thu Dec 2 11:00:45 1999 From: jack@oratrix.nl (Jack Jansen) Date: Thu, 02 Dec 1999 12:00:45 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: Message by Guido van Rossum , Wed, 01 Dec 1999 18:09:43 -0500 , <199912012309.SAA10873@eric.cnri.reston.va.us> Message-ID: <19991202110045.96F33370CF2@snelboot.oratrix.nl> On the Mac I've introduced "magic cookies" into sys.path, which allow you to do interesting searches (like searching for a DLL or PYC-resource in the application itself) at known places in the import process. There isn't a cookie for "search along the standard MacOS dll search path" (which is somewhat similar to the Windows dll search path) because I haven't seen a reason for it, but there's nothing to stop it. And if you'd insert that cookie it would be perfectly clear (at least, it should be) that only dll modules will be found in that step, not .py modules. Actually I'm so happy with the magic cookie scheme that I've advocated at various times in the past that something similar also be used for determining where builtin modules and frozen modules appear in sys.path... -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From guido@CNRI.Reston.VA.US Thu Dec 2 11:59:34 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 02 Dec 1999 06:59:34 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: Your message of "Thu, 02 Dec 1999 12:00:45 +0100." 
<19991202110045.96F33370CF2@snelboot.oratrix.nl> References: <19991202110045.96F33370CF2@snelboot.oratrix.nl> Message-ID: <199912021159.GAA13732@eric.cnri.reston.va.us> > On the Mac I've introduced "magic cookies" into sys.path, which > allow you to do interesting searches (like searching for a DLL or > PYC-resource in the application itself) at known places in the > import process. > There isn't a cookie for "search along the standard MacOS dll search > path" (which is somewhat similar to the Windows dll search path) > because I haven't seen a reason for it, but there's nothing to stop > it. And if you'd insert that cookie it would be perfectly clear (at > least, it should be) that only dll modules will be found in that > step, not .py modules. > Actually I'm so happy with the magic cookie scheme that I've > advocated at various times in the past that something similar also > be used for determining where builtin modules and frozen modules > appear in sys.path... I see the magic cookies as a poor man's (but more compatible!) version of a chain of importers as advocated by Greg Stein and other imputil fans. I like the idea, except that I think that the chain should be manipulatable more easily than the current imputil implementation. (I'll have more comments on Greg's comments later, when I've actually read them through.) --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Thu Dec 2 12:09:40 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 2 Dec 1999 04:09:40 -0800 (PST) Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: <199912021159.GAA13732@eric.cnri.reston.va.us> Message-ID: On Thu, 2 Dec 1999, Guido van Rossum wrote: >... > I see the magic cookies as a poor man's (but more compatible!) version > of a chain of importers as advocated by Greg Stein and other imputil > fans. I like the idea, except that I think that the chain should be > manipulatable more easily than the current imputil implementation. 
> (I'll have more comments on Greg's comments later, when I've actually > read them through.) Anything in sys.path that is not a string pointing to a directory is not very compatible. My current proposal keeps the existing semantics for sys.path (the proposal adds functionality thru other mechanisms, rather than changing/interfering with existing ones). I look forward to your comments! I'll definitely provide new solutions where you find problems :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From fredrik@pythonware.com Thu Dec 2 12:53:03 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 2 Dec 1999 13:53:03 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <19991202110045.96F33370CF2@snelboot.oratrix.nl> <199912021159.GAA13732@eric.cnri.reston.va.us> Message-ID: <005e01bf3cc4$2e338c50$f29b12c2@secret.pythonware.com> Guido van Rossum wrote: > > Actually I'm so happy with the magic cookie scheme that I've > > advocated at various times in the past that something similar also > > be used for determining where builtin modules and frozen modules > > appear in sys.path... > > I see the magic cookies as a poor man's (but more compatible!) version > of a chain of importers as advocated by Greg Stein and other imputil > fans. I like the idea, except that I think that the chain should be > manipulatable more easily than the current imputil implementation. I know this has been asked before, but cannot recall any of the arguments against it: how about replacing Jack's magic cookies with importer objects? (in other words, if a path item is a string, import as usual. otherwise, ask the importer for a code object or maybe better, a module object). 
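A rough sketch of the idea in the message above: string entries on the path are treated as directories, and any other entry is asked for a code object. Everything here is an assumption made for illustration; `DictImporter`, its `find_code` method and the free function `find_code` are hypothetical names and do not correspond to imputil or to any interface proposed in this thread.

```python
import os


class DictImporter:
    """Hypothetical importer object that serves module source from a dict."""

    def __init__(self, sources):
        self.sources = sources

    def find_code(self, name):
        # Return a code object, or None if this importer lacks the module.
        source = self.sources.get(name)
        if source is None:
            return None
        return compile(source, "<%s>" % name, "exec")


def find_code(name, path):
    """Walk `path`: strings are directories, anything else is an importer."""
    for item in path:
        if isinstance(item, str):
            # Usual behaviour: look for name.py in the directory.
            filename = os.path.join(item, name + ".py")
            if os.path.isfile(filename):
                with open(filename) as f:
                    return compile(f.read(), filename, "exec")
        else:
            # Non-string path item: delegate to the importer object.
            code = item.find_code(name)
            if code is not None:
                return code
    return None
```

A path such as `["/some/dir", DictImporter({"spam": "x = 42\n"})]` would then be searched in order, with a directory that does not exist simply yielding nothing.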
From jack@oratrix.nl Thu Dec 2 13:23:31 1999 From: jack@oratrix.nl (Jack Jansen) Date: Thu, 02 Dec 1999 14:23:31 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: Message by "Fredrik Lundh" , Thu, 2 Dec 1999 13:53:03 +0100 , <005e01bf3cc4$2e338c50$f29b12c2@secret.pythonware.com> Message-ID: <19991202132331.E3F8D370CF2@snelboot.oratrix.nl> > > I see the magic cookies as a poor man's (but more compatible!) version > > of a chain of importers as advocated by Greg Stein and other imputil > > fans. [...] > > I know this has been asked before, but cannot recall > any of the arguments against it: how about replacing > Jack's magic cookies with importer objects? For the record: I definitely agree with both comments here. The only thing that would need solving (but maybe it already is? Greg?) is the external representation of an importer, as I'd definitely want to be able to name them in PYTHONPATH (or the mac equivalent). -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From jim@interet.com Thu Dec 2 14:19:31 1999 From: jim@interet.com (James C. Ahlstrom) Date: Thu, 02 Dec 1999 09:19:31 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <009c01bf3c4b$8f119090$0501a8c0@bobcat> Message-ID: <38467FF3.D938EE4@interet.com> Mark Hammond wrote: > Sure - the standard Python installation puts a "default" PYTHONPATH > suitable for general purpose development - but any distributed > application _can_ define their own PYTHONPATH that is independent of > any other Python systems or applications. People have been doing this > for years, including MS :-) How is this done? > Sorry Jim, but count this as another vote against it - which isn't to > argue that the current system is perfect, simply (IMO) better than the > Windows path and DLL search order. Sigh.....
JimA From jim@interet.com Thu Dec 2 15:49:10 1999 From: jim@interet.com (James C. Ahlstrom) Date: Thu, 02 Dec 1999 10:49:10 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us> Message-ID: <384694F6.E5D74221@interet.com> Guido van Rossum wrote: > You must not have used the standard Python installer; if you had used > it you wouldn't have had this problem (and perhaps we wouldn't have > had this discussion). Correct, I did not use the standard Python installer. I compiled Python from the source distribution. There are good reasons for this in my case. First, my real issue is how to DISTRIBUTE Python programs, not to get Python working on my own machine. We have 12 machines on a network. It is not acceptable to run a Python installation script on every one of them just to run a simple Python program. OK, I guess I could do 12, but what about a larger company? And we ship to hundreds of customers. I can distribute simple C or C++ programs without a hassle, why not Python? It is not acceptable to ask our customers to run a separate Python installer. We have our own Wise installer to install our software. Every commercial vendor has Wise, Install Shield or other installer in place. No commercial vendor is going to abandon Wise et al. and move to The Official Python Installer because it will not have the features of Wise (such as binary patches across the network), and because what it does won't be documented, and because it is Just Different. Second, I can not run ANY installer on my development machine, Python or otherwise. This is a general Windows problem not specific to Python. Right now our help system is broken on every office machine except the one where the help system installer was run (where we develop help). If I run a Python installer, it may Just Work here. 
So testing is fine, but when I distribute the program to customers where the install program has not been run it fails. The installer made registry entries, installed files, etc. And what did it do?? No one knows. And how do I install at a customer site if I don't have documentation on what the Help installer or Python installer did?? No one knows. Who fixes it if something goes wrong?? Hours on the phone to Help System customer support. Does it work on Windows 2000?? No one knows. > f:\bin. "Modern" Python versions execute some code at startup that > comes up with a suitable value for sys.path; the Windows version of > this code is in PC/getpathp.c -- I recommend that you study it. This > [ Highly useful discussion of startup...] Thank you, I will study this. > know about (e.g. PIL, Numeric, etc.). With packagized versions of > those modules, even that won't be necessary, because the packages will > be dropped in the Python install directory (typically C:\Program > Files\Python). Yes, this is essential. Packages must be easily installed. I was hoping for single file package archive files. > I believe that most of your desire to get rid of PYTHONPATH comes from > your insistence to bypass the default installer. Correct, I refuse to execute the default installer. And I am a patient person who loves Python, so I will read getpathp.c to see what is happening. But other commercial developers, students, teachers, SysAdmins etc. are not so patient. In the interest of promoting Python, there should be documentation on the official way to easily install Python programs. > There's probably a > way to install your app in such a way that the getpathp.c algorithm > actually succeeds? There's also a separate env variable, PYTHONHOME, Perhaps, and if there is it should be prominently documented in the How to Distribute Your App section of the manual. I am worried about supporting versioning, but I will think about it. > I take blame for not documenting all this well enough. 
However I wish > you stopped criticizing the design -- I think the design is quite > solid. Thank you for the explanation. I will study the design again. I always wondered what PYTHONHOME did. JimA From guido@CNRI.Reston.VA.US Thu Dec 2 16:03:09 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 02 Dec 1999 11:03:09 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: Your message of "Thu, 02 Dec 1999 10:49:10 EST." <384694F6.E5D74221@interet.com> References: <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us> <384694F6.E5D74221@interet.com> Message-ID: <199912021603.LAA14455@eric.cnri.reston.va.us> > Perhaps, and if there is it should be prominently documented in the > How to Distribute Your App section of the manual. I > am worried about supporting versioning, but I will think about it. Join the distutil-SIG, they are discussing just this. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Thu Dec 2 15:48:40 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 02 Dec 1999 16:48:40 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <19991202110045.96F33370CF2@snelboot.oratrix.nl> <199912021159.GAA13732@eric.cnri.reston.va.us> <005e01bf3cc4$2e338c50$f29b12c2@secret.pythonware.com> Message-ID: <384694D8.DCA3D75E@lemburg.com> Fredrik Lundh wrote: > > Guido van Rossum wrote: > > > Actually I'm so happy with the magic cookie scheme that I've > > > advocated at various times in the past that something similar also > > > be used for determining where builtin modules and frozen modules > > > appear in sys.path... > > > > I see the magic cookies as a poor man's (but more compatible!) version > > of a chain of importers as advocated by Greg Stein and other imputil > > fans. 
I like the idea, except that I think that the chain should be > > manipulatable more easily than the current imputil implementation. > > I know this has been asked before, but cannot recall > any of the arguments against it: how about replacing > Jack's magic cookies with importer objects? > > (in other words, if a path item is a string, import as > usual. otherwise, ask the importer for a code object > or maybe better, a module object). Plus, for backward compatibility, make sure that str(importerobj) returns something which resembles a non-existing directory. Note that the builtin importer skips non-string entries in sys.path, so the above will only be needed for existing import hooks. Still, I would like to rephrase my 0.02EUR which I already posted twice... why not start to think about what these importers would do first ? If there are only a handful of wishes we could just add them to the builtin machinery and be done with it... -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 29 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido@CNRI.Reston.VA.US Thu Dec 2 16:28:28 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 02 Dec 1999 11:28:28 -0500 Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: Your message of "Fri, 19 Nov 1999 22:43:32 EST." <1269053086-27079185@hypernet.com> References: <1269053086-27079185@hypernet.com> Message-ID: <199912021628.LAA14506@eric.cnri.reston.va.us> > No success whatsoever in either direction across Samba. In > fact the mtime of my Linux home directory as seen from NT is > Jan 1, 1980. That's only the case for an NT mount point (something of the form \\host\name; I notice that os.stat() only believes it exists if you append a backslash: \\host\name\). For interior directories, at least with the Samba version that I'm using, os.stat() seems to give correct results. 
I think that this whole issue (that doing a stat on a directory to find out whether files in it were modified doesn't give usable results) is widely blown out of proportion. The only useful bit of info is that mtimes may have an up to 2 second granularity, and that anything as recent as 2 seconds should be considered as newer than the cache even if the cache is also less than 2 seconds. --Guido van Rossum (home page: http://www.python.org/~guido/) From jim@interet.com Thu Dec 2 16:28:50 1999 From: jim@interet.com (James C. Ahlstrom) Date: Thu, 02 Dec 1999 11:28:50 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <383BF9AD.E183FB98@interet.com> <199912011855.NAA10662@eric.cnri.reston.va.us> <38457B42.85552AC@interet.com> <199912012309.SAA10873@eric.cnri.reston.va.us> Message-ID: <38469E42.AF0A0D55@interet.com> Guido van Rossum wrote: > Let's assume sys.path is [A, B]. Let's assume there's a foo.py and a > foo.dll. If foo.py lives in A and foo.dll lives in B, then import foo > ... Thank you for the detailed discussion showing that sys.path is needed so a choice can be made whether to load foo.dll or foo.py. As you correctly point out, a separate search path defeats this behavior. But I don't think the usefulness of the feature compensates for its resultant complexity. Specifically, it will be hard to create this behavior in archive files. As I envision archive files (which of course is subject to change) they contain *.pyc files and not DLL's. The DLL's must be in a ./DLL directory since the OS can not load them from strings. So if every *.pyc is in an archive file, your only choice is whether to load all DLL's first or last. That is, archive.pyl is either before or after ./DLL. If a package (probably with lots of subdirectories) author depends on having a search path within a package which discriminates between pyc and DLL files with equal names, then that search path plus the existence of the DLL's must be recorded in the archive. 
This is much more complicated than just an archive with all *.pyc files entered in a dotted name space: foo foo.sub1 foo.sub2 foo.sub2.pkx I would question whether equally named foo.dll and foo.py is worth it. The alternative (which is IMHO more common) is to code the choice in Python in the module that cares about it. > > And what if something > > doesn't work? Think of Python being used as a teaching language > > for the 8th grade. Think of the 8th grade teacher trying to get > > all this right. The only thing that works is simplicity. > > We will provide an installer that Just Works [tm]. OK for this case. Not enough for Python program distribution. JimA From jim@interet.com Thu Dec 2 16:30:49 1999 From: jim@interet.com (James C. Ahlstrom) Date: Thu, 02 Dec 1999 11:30:49 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us> <384694F6.E5D74221@interet.com> <199912021603.LAA14455@eric.cnri.reston.va.us> Message-ID: <38469EB9.5EDB9617@interet.com> Guido van Rossum wrote: > > > Perhaps, and if there is it should be prominently documented in the > > How to Distribute Your App section of the manual. I > > am worried about supporting versioning, but I will think about it. > > Join the distutil-SIG, they are discussing just this. I already belong to the distutil-SIG and have seen no such discussion. Jim From guido@CNRI.Reston.VA.US Thu Dec 2 17:17:52 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 02 Dec 1999 12:17:52 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: Your message of "Thu, 02 Dec 1999 11:30:49 EST." 
<38469EB9.5EDB9617@interet.com> References: <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us> <384694F6.E5D74221@interet.com> <199912021603.LAA14455@eric.cnri.reston.va.us> <38469EB9.5EDB9617@interet.com> Message-ID: <199912021717.MAA14682@eric.cnri.reston.va.us> [Jim] > > > Perhaps, and if there is it should be prominently documented in the > > > How to Distribute Your App section of the manual. I > > > am worried about supporting versioning, but I will think about it. [me] > > Join the distutil-SIG, they are discussing just this. [Jim again] > I already belong to the distutil-SIG and have seen no such > discussion. Sorry, you're right (except for a brief exchange between you and Paul Dubois :-). But I think they should, it falls under their charter. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Thu Dec 2 17:30:02 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 2 Dec 1999 12:30:02 -0500 (EST) Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: <199912021717.MAA14682@eric.cnri.reston.va.us> References: <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us> <384694F6.E5D74221@interet.com> <199912021603.LAA14455@eric.cnri.reston.va.us> <38469EB9.5EDB9617@interet.com> <199912021717.MAA14682@eric.cnri.reston.va.us> Message-ID: <14406.44186.574647.651111@weyr.cnri.reston.va.us> Guido van Rossum writes: > Sorry, you're right (except for a brief exchange between you and Paul > Dubois :-). But I think they should, it falls under their charter. This was deliberately postponed until after extension packages are supported and in place. I know Greg is interested in application installation as well as package installation. -Fred -- Fred L. Drake, Jr.
Corporation for National Research Initiatives From gmcm@hypernet.com Thu Dec 2 17:53:03 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Thu, 2 Dec 1999 12:53:03 -0500 Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: <199912021628.LAA14506@eric.cnri.reston.va.us> References: Your message of "Fri, 19 Nov 1999 22:43:32 EST." <1269053086-27079185@hypernet.com> Message-ID: <1267965342-1446902@hypernet.com> [Gordon] > > No success whatsoever in either direction across Samba. In fact > > the mtime of my Linux home directory as seen from NT is Jan 1, > > 1980. [Guido] > That's only the case for an NT mount point (something of the form > \\host\name; I notice that os.stat() only believes it exists if > you append a backslash: \\host\name\). For interior directories, > at least with the Samba version that I'm using, os.stat() seems > to give correct results. Correct (as I discovered not long after I posted). (I find that from NT I have to stat some file _in_ the directory to get an updated mtime from the stat _of_ the directory). > I think that this whole issue (that doing a stat on a directory > to find out whether files in it were modified doesn't give usable > results) is widely blown out of proportion. This has come up twice: re caching importers and dircache.py (used only by dircmp). We've arrived at the fact that it _can_ be made to work on Windows boxes. NFS? Andrew (anyone still use that)? IOW, do we want to trust it? Do we want to document that it might not be trustworthy in some situations? Make it optional- for-wizards? Kill it? IOOW, what's the proper proportion ;-)? > The only useful bit of info is that mtimes may have an up to 2 > second granularity, and that anything as recent as 2 seconds > should be considered as newer than the cache even if the cache is > also less than 2 seconds. From NT, at least, stat'ing any file in the directory seems to remove this 2 second limitation. 
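The 2-second rule quoted above might be sketched along these lines. This is one reading of the rule, not a tested recipe for Samba or NT mounts; the function names and the `GRANULARITY` constant are hypothetical.

```python
import os
import time

GRANULARITY = 2.0  # seconds; assumed worst-case mtime resolution


def cache_is_stale(dir_mtime, cache_time, now, granularity=GRANULARITY):
    """Decide whether a cached directory listing must be rebuilt."""
    if now - dir_mtime <= granularity:
        # Modified too recently for the comparison to be trusted:
        # treat the directory as newer than the cache.
        return True
    return dir_mtime > cache_time


def directory_cache_is_stale(path, cache_time):
    """Same check against a real directory, using os.stat()."""
    return cache_is_stale(os.stat(path).st_mtime, cache_time, time.time())
```

Keeping the comparison in a pure function of three timestamps makes the rule easy to check without touching a filesystem.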
- Gordon From guido@CNRI.Reston.VA.US Thu Dec 2 20:43:46 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 02 Dec 1999 15:43:46 -0500 Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: Your message of "Fri, 19 Nov 1999 05:29:50 PST." References: Message-ID: <199912022043.PAA15108@eric.cnri.reston.va.us> Here's the promised response to Greg's response to my wishlist. > On Thu, 18 Nov 1999, Guido van Rossum wrote: > > Gordon McMillan wrote: > >... > > > I think imputil's emulation of the builtin importer is more of a > > > demonstration than a serious implementation. As for speed, it > > > depends on the test. > > > > Agreed. I like some of imputil's features, but I think the API > > needs to be redesigned. > > In what ways? It sounds like you've applied some thought. Do you have any > concrete ideas yet, or "just a feeling" :-) I'm working through some > changes from JimA right now, and would welcome other suggestions. I think > there may be some outstanding stuff from MAL, but I'm not sure (Marc?) I actually think that the way the PVM (Python VM) calls the importer ought to be changed. Assigning to __builtin__.__import__ is a crock. The API for __import__ is a crock. > >... > > So here's a challenge: redesign the import API from scratch. > > I would suggest starting with imputil and altering as necessary. I'll use > that viewpoint below. > > > Let me start with some requirements. > > > > Compatibility issues: > > --------------------- > > > > - the core API may be incompatible, as long as compatibility layers > > can be provided in pure Python > > Which APIs are you referring to? The "imp" module? The C functions? The > __import__ and reload builtins? > I'm guessing some of imp, the two builtins, and only one or two C > functions. All of those. > > - support for rexec functionality > > No problem. I can think of a number of ways to do this. Agreed, I think that imputil can do this. > > - support for freeze functionality > > No problem.
A function in "imp" must be exposed to Python to support this > within the imputil framework. Agreed. It currently exports init_frozen() which is about the right functionality. > > - load .py/.pyc/.pyo files and shared libraries from files > > No problem. Again, a function is needed for platform-specific loading of > shared libraries. Is it useful to expose the platform differences? The current imp.load_dynamic() should suffice. > > - support for packages > > No problem. Demo's in current imputil. > > > - sys.path and sys.modules should still exist; sys.path might > > have a slightly different meaning > > I would suggest that both retain their *exact* meaning. We introduce > sys.importers -- a list of importers to check, in sequence. The first > importer on that list uses sys.path to look for and load modules. The > second importer loads builtins and frozen code (i.e. modules not on > sys.path). This is looking like the redesign I was looking for. (Note that imputil's current chaining is not good since it's impossible to remove or reorder importers, which I think is a required feature; an explicit list would solve this.) Actually, the order is the other way around, but by now you should know that. It makes sense to have separate ones for builtin and frozen modules -- these have nothing in common. There's another issue, which isn't directly addressed by imputil, although with clever use of inheritance it might be doable. I'd like more support for this however. Quite orthogonally to the issue of having separate importers, I might want to recognize new extensions. Take the example of the ILU folks. They want to be able to drop a file "foo.isl" in any directory on sys.path and have the ILU stubber automatically run if you try to import foo (the client stubs) or foo__skel (the server skeleton). This doesn't fit in the sys.importers strategy, because they want to be able to drop their .isl files in any directory along sys.path. 
(Or, more likely, they want to have control over where in sys.modules the directory/directories with .isl files are placed.) This requires an ugly modification to the _fs_import() function. (Which should have been a method, by the way, to make overriding it in a subclass of PathImporter easier!) I've been thinking here along the lines of a strategy where the standard importer (the one that walks sys.path) has a set of hooks that define various things it could look for, e.g. .py files, .pyc files, .so or .dll files. This list of hooks could be changed to support looking for .isl files. There's an old, subtle issue that could be solved through this as well: whether or not a .pyc file without a .py file should be accepted or not. Long ago (in Python 0.9.8) a .pyc file alone would never be loaded. This was changed at the request of a small but vocal minority of Python developers who wanted to distribute .pyc files without .py files. It has occasionally caused frustration because sometimes developers move .py files around but forget to remove the .pyc files, and then the .pyc file is silently picked up if it occurs on sys.path earlier than where the .py was moved to. Having a set of hooks for various extensions would make it possible to have a default where lone .pyc files are ignored, but where one can insert a .pyc importer in the list of hooks that does the right thing here. (Of course, it may be possible that this whole feature of lone .pyc files should be replaced since the same need is easily taken care of by zip importers.) I also want to support (Jim A notwithstanding :-) a feature whereby different things besides directories can live on sys.path, as long as they are strings -- these could be added from the PYTHONPATH env variable.
Every piece of code that I've ever seen that uses sys.path doesn't care if a directory named in sys.path doesn't exist -- it may try to stat various files in it, which also don't exist, and as far as it is concerned that is just an indication that the requested module doesn't live there. Again, we would have to dissect imputil to support various hooks that deal with different kind of entities in sys.path. The default hook list would consist of a single item that interprets the name as a directory name; other hooks could support zip files or URLs. Jack's "magic cookies" could also be supported nicely through such a mechanism. > Users can insert/append new importers or alter sys.path as before. > > sys.modules continues to record name:module mappings. Yes. Note that the interpretation of __file__ could be problematic. To what value do you set __file__ for a module loaded from a zip archive? > > - $PYTHONPATH and $PYTHONHOME should still be supported > > No problem. > > > (I wouldn't mind a splitting up of importdl.c into several > > platform-specific files, one of which is chosen by the configure > > script; but that's a bit of a separate issue.) > > Easy enough. The standard importer can select the appropriate > platform-specific module/function to perform the load. i.e. these can move > to Modules/ and be split into a module-per-platform. Again: what's the advantage of exposing the platform specificity? > > New features: > > ------------- > > > > - Integrated support for Greg Ward's distribution utilities (i.e. a > > module prepared by the distutil tools should install painlessly) > > I don't know the specific requirements/functionality that would be > required here (does Greg? :-), but I can't imagine any problem with this. Probably more support is required from the other end: once it's common for modules to be imported from zip files, the distutil code needs to support the creation and installation of such zip files. 
Also, there is a need for the install phase of distutil to communicate the location of the zip file to the Python installation. > > - Good support for prospective authors of "all-in-one" packaging tool > > authors like Gordon McMillan's win32 installer or /F's squish. (But > > I *don't* require backwards compatibility for existing tools.) > > Um. *No* problem. :-) :-) > > - Standard import from zip or jar files, in two ways: > > > > (1) an entry on sys.path can be a zip/jar file instead of a directory; > > its contents will be searched for modules or packages Note that this is what I mention above for distutil support. > While this could easily be done, I might argue against it. Old > apps/modules that process sys.path might get confused. Above I argued that this shouldn't be a problem. > If compatibility is not an issue, then "No problem." > > An alternative would be an Importer instance added to sys.importers that > is configured for a specific archive (in other words, don't add the zip > file to sys.path, add ZipImporter(file) to sys.importers). This would be harder for distutil: where does Python get the initial list of importers? > Another alternative is an Importer that looks at a "sys.py_archives" list. > Or an Importer that has a py_archives instance attribute. OK, but again distutil needs to be able to add to this list when it installs a package. (Note that package deinstallation should also be supported!) (Of course I don't require this to affect Python processes that are already running; but it should be possible to easily change the default search path for all newly started instances of a given Python installation.) > > (2) a file in a directory that's on sys.path can be a zip/jar file; > > its contents will be considered as a package (note that this is > > different from (1)!) > > No problem. This will slow things down, as a stat() for *.zip and/or *.jar > must be done, in addition to *.py, *.pyc, and *.pyo. 
Fine, this is where the caching comes in handy.

> >   I don't particularly care about supporting all zip compression
> >   schemes; if Java gets away with only supporting gzip compression
> >   in jar files, so can we.
>
> I presume we would support whatever zlib gives us, and no more.

That's it. :-)

> > - Easy ways to subclass or augment the import mechanism along
> >   different dimensions. For example, while none of the following
> >   features should be part of the core implementation, it should be
> >   easy to add any or all:
> >
> >   - support for a new compression scheme to the zip importer
>
> Presuming ZipImporter is a class (derived from Importer), then this
> ability is wholly dependent upon the author of ZipImporter providing the
> hook.

Agreed. But since we're likely going to provide this as a standard
feature, we must ensure that it provides this hook.

> The Importer class is already designed for subclassing (and its interface
> is very narrow, which means delegation is also *very* easy; see
> imputil.FuncImporter).

But maybe it's *too* narrow; some of the hooks I suggest above seem to
require extra interfaces -- at least in some of the subclasses of the
Importer base class.

Note: I looked at the doc string for get_code() and I don't understand
what the difference is between the modname and fqname arguments. If I
write "import foo.bar", what are modname and fqname? Why are both
present? Also, while you claim that the API is narrow, the multiple
return values (also the different types for the second item) make it
complicated.

> >   - support for a new archive format, e.g. tar
>
> A cakewalk. Gordon, JimA, and myself each have archive formats. :-)
>
> >   - a hook to import from URLs or other data sources (e.g. a
> >     "module server" imported in CORBA) (this needn't be supported
> >     through $PYTHONPATH though)
>
> No problem at all.
>
> >   - a hook that imports from compressed .py or .pyc/.pyo files
>
> No problem at all.
>
> >   - a hook to auto-generate .py files from other filename
> >     extensions (as currently implemented by ILU)
>
> No problem at all.

See above -- I think this should be more integrated with sys.path than
you are thinking of. The more I think about it, the more I see that
the problem is that for you, the importer that uses sys.path is a
final subclass of Importer (i.e. it is itself not further subclassed).
Several of the hooks I want seem to require additional hooks in the
PathImporter rather than new importers.

> >   - a cache for file locations in directories/archives, to improve
> >     startup time
>
> No problem at all.
>
> >   - a completely different source of imported modules, e.g. for an
> >     embedded system or PalmOS (which has no traditional filesystem)
>
> No problem at all.
>
> In each of the above cases, the Importer.get_code() method just needs to
> grab the byte codes from the XYZ data source. That data source can be
> compressed, across a network, on-the-fly generated, or whatever. Each
> importer can certainly create a cache based on its concept of "location".
> In some cases, that would be a mapping from module name to filesystem
> path, or to a URL, or to a compiled-in, frozen module.

See above for sys.path integration remark.

> > - Note that different kinds of hooks should (ideally, and within
> >   reason) properly combine, as follows: if I write a hook to recognize
> >   .spam files and automatically translate them into .py files, and you
> >   write a hook to support a new archive format, then if both hooks are
> >   installed together, it should be possible to find a .spam file in an
> >   archive and do the right thing, without any extra action. Right?
>
> Ack. Very, very difficult.

Actually, I take most of this back. Importers that deal with new
extension types often have to go through a file system to transform
their data to .py files, and this is just too complicated.
However it would be still nice if there was code sharing between the code that looks for .py and .pyc files in a zip archive and the code that does the same in a filesystem. Hm, maybe even that shouldn't be necessary, the zip file probably should contain only .pyc files... (Unrelated remark: I should really try to release the set of modules we've written here at CNRI to deal with zip files. Unfortunately zip files are hairy and so is our code.) > The imputil scheme combines the concept of locating/loading into one step. > There is only one "hook" in the imputil system. Its semantic is "map this > name to a code/module object and return it; if you don't have it, then > return None." That's fine. I actually don't recall where the find-then-load API came from, I think it may be an artefact of the original implementation strategy. It is currently used as follows: we try to see if there's a .pyc and then we try to see if there's a .py; if both exist we compare the timestamps etc. to choose which one. But that's still a red herring. > Your compositing example is based on the capabilities of the > find-then-load paradigm of the existing "ihooks.py". One module finds > something (foo.spam) and the other module loads it (by generating a .py). I still don't understand why ihooks.py had to be so complicated. I guess I just had much less of an understanding of the issues. (It was also partly a compromise with an alternative design by Ken Manheimer, who basically forced me to support packages, originally through ni.py.) > All is not lost, however. I can easily envision the get_code() hook as > allowing any kind of return type. If it isn't a code or module object, > then another hook is called to transform it. > [ actually, I'd design it similarly: a *series* of hooks would be called > until somebody transforms the foo.spam into a code/module object. ] OK. This could be a feature of a subclass of Importer. 
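
[ The "series of hooks" idea quoted just above can be sketched roughly as
follows. This is only an illustration in present-day Python, not
imputil's actual interface: the class names, the (suffix, payload)
convention, the in-memory `sources` dict and the two example transforms
are all assumptions made for the example. ]

```python
import types


class Importer:
    """Hypothetical base class: one hook that maps a name to a result."""

    def get_code(self, fqname):
        # Return a code object, or None if this importer does not have it.
        return None


class TransformingImporter(Importer):
    """Hypothetical subclass: whatever is found for a name is passed
    through a series of transform hooks until it is a code object."""

    MAX_STEPS = 10  # guard against transforms that never converge

    def __init__(self, sources, transforms):
        self.sources = sources        # fqname -> (suffix, payload), illustrative
        self.transforms = transforms  # callables (suffix, payload) -> pair or None

    def get_code(self, fqname):
        found = self.sources.get(fqname)
        if found is None:
            return None               # "if you don't have it, return None"
        suffix, payload = found
        steps = 0
        while not isinstance(payload, types.CodeType):
            if steps == self.MAX_STEPS:
                raise ImportError("too many transform steps for %s" % fqname)
            steps += 1
            for transform in self.transforms:
                result = transform(suffix, payload)
                if result is not None:
                    suffix, payload = result
                    break
            else:
                # No hook recognised this kind of payload.
                raise ImportError("no hook can transform %s%s" % (fqname, suffix))
        return payload


def spam_to_py(suffix, payload):
    # Made-up ".spam" format: Python source with SPAM standing in for 42.
    if suffix == ".spam":
        return ".py", payload.replace("SPAM", "42")
    return None


def py_to_code(suffix, payload):
    # Compile Python source text into a code object.
    if suffix == ".py":
        return ".pyc", compile(payload, "<transformed>", "exec")
    return None


importer = TransformingImporter(
    {"foo": (".spam", "answer = SPAM\n")},
    [py_to_code, spam_to_py],
)
namespace = {}
exec(importer.get_code("foo"), namespace)
print(namespace["answer"])
```

[ The order of the transform list does not matter here, since each hook
only reacts to the suffix it knows about; a real design would still need
to decide who owns that list and how it is ordered. ]
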
> The compositing would be limited only by the (Python-based) Importer
> classes. For example, my ZipImporter might expect to zip up .pyc files
> *only*. Obviously, you would want to alter this to support zipping any
> file, then use the suffix to determine what to do at unzip time.
>
> > - It should be possible to write hooks in C/C++ as well as Python
>
> Use FuncImporter to delegate to an extension module.

Maybe not so great, since it sounds like the C code can't benefit from
any of the infrastructure that imputil offers. I'm not sure about
this one though.

> This is one of the benefits of imputil's single/narrow interface.

Plus its vague specs? :-)

> > - Applications embedding Python may supply their own implementations,
> >   default search path, etc., but don't have to if they want to piggyback
> >   on an existing Python installation (even though the latter is
> >   fraught with risk, it's cheaper and easier to understand).
>
> An application would have full control over the contents of sys.importers.
>
> For a restricted execution app, it might install an Importer that loads
> files from *one* directory only which is configured from a specific
> Win32 Registry entry. That importer could also refuse to load shared
> modules. The BuiltinImporter would still be present (although the app
> would certainly omit all but the necessary builtins from the build).
> Frozen modules could be excluded.

Actually there's little reason to exclude frozen modules or any
.py/.pyc modules -- by definition, bytecode can't be dangerous. It's
the builtins and extensions that need to be censored. We currently do
this by subclassing ihooks, where we mask the test for builtins with a
comparison to a predefined list of names.

> > Implementation:
> > ---------------
> >
> > - There must clearly be some code in C that can import certain
> >   essential modules (to solve the chicken-or-egg problem), but I don't
> >   mind if the majority of the implementation is written in Python.
> >   Using Python makes it easy to subclass.
>
> I posited once before that the cost of import is mostly I/O rather than
> CPU, so using Python should not be an issue. MAL demonstrated that a good
> design for the Importer classes is also required. Based on this, I'm a
> *strong* advocate of moving as much as possible into Python (to get
> Python's ease-of-coding with little relative cost).

Agreed. However, how do you explain the slowdown (from 9 to 13
seconds I recall) though? Are you a lousy coder? :-)

> The (core) C code should be able to search a path for a module and import
> it. It does not require dynamic loading or packages. This will be used to
> import exceptions.py, then imputil.py, then site.py.

It does, however, need to import builtin modules. imputil currently
imports imp, sys, strop and __builtin__, struct and marshal; note that
struct can easily be a dynamically loadable module, and so could strop
in theory. (Note that strop will be unnecessary in 1.6 if you use
string methods.)

I don't think that this chicken-or-egg problem is particularly
problematic though.

> The platform-specific module that performs dynamic-loading must be a
> statically linked module (in Modules/ ... it doesn't have to be in the
> Python/ directory).

See earlier comments.

> site.py can complete the bootstrap by setting up sys.importers with the
> appropriate Importer instances (this is where an application can define
> its own policy). sys.path was initially set by the import.c bootstrap code
> (from the compiled-in path and environment variables).

I think that algorithm (currently in getpath.c / getpathp.c) might
also be moved to Python code -- imported frozen. Sadly, rebuilding
with a new version of a frozen module might be more complicated than
rebuilding with a new version of a C module, but writing and
maintaining this code in Python would be *sooooooo* much easier that I
think it's worth it.

> Note that imputil.py would not install any hooks when it is loaded.
That > is up to site.py. This implies the core C code will import a total of > three modules using its builtin system. After that, the imputil mechanism > would be importing everything (site.py would .install() an Importer which > then takes over the __import__ hook). (Three not counting the builtin modules.) > Further note that the "import" Python statement could be simplified to use > only the hook. However, this would require the core importer to inject > some module names into the imputil module's namespace (since it couldn't > use an import statement until a hook was installed). While this > simplification is "neat", it complicates the run-time system (the import > statement is broken until a hook is installed). Same chicken-or-egg. We can be pragmatic. For a developer, I'd like a bit of robustness (all this makes it rather hard to debug a broken imputil, and that's a fair amount of code!). > Therefore, the core C code must also support importing builtins. "sys" and > "imp" are needed by imputil to bootstrap. > > The core importer should not need to deal with dynamic-load modules. Same question. Since that all has to be coded in C anyway, why not? > To support frozen apps, the core importer would need to support loading > the three modules as frozen modules. I'd like to see a description of how someone like Jim A would build a single-file application using the new mechanism. This could completely replace freeze. (Freeze currently requires a C compiler; that's bad.) > The builtin/frozen importing would be exposed thru "imp" for use by > imputil for future imports. imputil would load and use the (builtin) > platform-specific module to do dynamic-load imports. Sure. > > - In order to support importing from zip/jar files using compression, > > we'd at least need the zlib extension module and hence libz itself, > > which may not be available everywhere. > > Yes. I don't see this as a requirement, though. We wouldn't start to use > these by default, would we? 
Or insist on zlib being present? I see this as > more along the lines of "we have provided a standardized Importer to do > this, *provided* you have zlib support." Agreed. Zlib support is easy to get, but there are probably platforms where it's not. (E.g. maybe the Mac? I suppose that on the Mac, there would be some importer classes to import from a resource fork.) > > - I suppose that the bootstrap is solved using a mechanism very > > similar to what freeze currently used (other solutions seem to be > > platform dependent). > > The bootstrap that I outlined above could be done in C code. The import > code would be stripped down dramatically because you'll drop package > support and dynamic loading. Not the dynamic loading. But yes the package support. > Alternatively, you could probably do the path-scanning in Python and > freeze that into the interpreter. Personally, I don't like this idea as it > would not buy you much at all (it would still need to return to C for > accessing a number of scanning functions and module importing funcs). > > > - I also want to still support importing *everything* from the > > filesystem, if only for development. (It's hard enough to deal with > > the fact that exceptions.py is needed during Py_Initialize(); > > I want to be able to hack on the import code written in Python > > without having to rebuild the executable all the time. > > My outline above does not freeze anything. Everything resides in the > filesystem. The C code merely needs a path-scanning loop and functions to > import .py*, builtin, and frozen types of modules. Good. Though I think there's also a need for freezing everything. And when we go the route of the zip archive, the zip archive handling code needs to be somewhere -- frozen seems to be a reasonable choice. > If somebody nukes their imputil.py or site.py, then they return to Python > 1.4 behavior where the core interpreter uses a path for importing (i.e. no > packages). 
They lose dynamically-loaded module support. But if the path guessing is also done by site.py (as I propose) the path will probably be wrong. A warning should be printed. > > Let's first complete the requirements gathering. Are these > > requirements reasonable? Will they make an implementation too > > complex? Am I missing anything? > > I'm not a fan of the compositing due to it requiring a change to semantics > that I believe are very useful and very clean. However, I outlined a > possible, clean solution to do that (a secondary set of hooks for > transforming get_code() return values). As you may see from my responses, I'm a big fan of having several different sets of hooks. I do withdraw the composition requirement though. > The requirements are otherwise reasonable to me, as I see that they can > all be readily solved (i.e. they aren't burdensome). > > While this email may be long, I do not believe the resulting system would > be complex. From the user-visible side of things, nothing would be > changed. sys.path is still present and operates as before. They *do* have > new functionality they can grow into, though (sys.importers). The > underlying C code is simplified, and the platform-specific dynamic-load > stuff can be distributed to distinct modules, as needed > (e.g. BeOS/dynloadmodule.c and PC/dynloadmodule.c). > > > Finally, to what extent does this impact the desire for dealing > > differently with the Python bytecode compiler (e.g. supporting > > optimizers written in Python)? And does it affect the desire to > > implement the read-eval-print loop (the >>> prompt) in Python? > > If the three startup files require byte-compilation, then you could have > some issues (i.e. the byte-compiler must be present). Another chicken-or-egg. No biggie. > Once you hit site.py, you have a "full" environment and can easily detect > and import a read-eval-print loop module (i.e. why return to Python? just > start things up right there). You mean "why return to C?" 
I agree. It would be cool if somehow IDLE and Pythonwin would also be
bootstrapped using the same mechanisms. (This would also solve the
question "which interactive environment am I using?" that some modules
and apps want to see answered because they need to do things
differently when run under IDLE, for example.)

> site.py can also install new optimizers as desired, a new Python-based
> parser or compiler, or whatever... If Python is built without a parser or
> compiler (I hope that's an option!), then the three startup modules would
> simply be frozen into the executable.

More power to hooks!

--Guido van Rossum (home page: http://www.python.org/~guido/)

From fdrake@acm.org Thu Dec 2 21:22:33 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 2 Dec 1999 16:22:33 -0500 (EST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912022043.PAA15108@eric.cnri.reston.va.us>
References: <199912022043.PAA15108@eric.cnri.reston.va.us>
Message-ID: <14406.58137.359127.921135@weyr.cnri.reston.va.us>

Guido van Rossum writes:
 > variable. Every piece of code that I've ever seen that uses sys.path
 > doesn't care if a directory named in sys.path doesn't exist -- it may
 > try to stat various files in it, which also don't exist, and as far as

Not the case -- I know you've looked at some of my code in the KOE
that ensures only real directories are on the path, and each is only
there once (pathhack.py). Given that sys.path is often too long and
includes duplicate entries in a large system (often one entry with and
one without a trailing / for a given directory), it is useful to be
able to distinguish between things that should be interpretable as
paths and things that aren't. It should not be hard to declare that
"cookies" or whatever have some special form, like "".

 > (Unrelated remark: I should really try to release the set of modules
 > we've written here at CNRI to deal with zip files. Unfortunately zip
 > files are hairy and so is our code.)
It doesn't help that that code just plain stinks. I maintain that no one here understands the whole of it. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From jcw@equi4.com Thu Dec 2 21:41:46 1999 From: jcw@equi4.com (Jean-Claude Wippler) Date: Thu, 02 Dec 1999 22:41:46 +0100 Subject: [Python-Dev] Import redesign [LONG] References: <199912022043.PAA15108@eric.cnri.reston.va.us> Message-ID: <3846E79A.446EAFD5@equi4.com> Guido van Rossum wrote: [...] > Note that the interpretation of __file__ could be problematic. To > what value do you set __file__ for a module loaded from a zip archive? Makefiles use "archive(entry)" (this also supports nesting if needed). [...] > I'd like to see a description of how someone like Jim A would build a > single-file application using the new mechanism. This could > completely replace freeze. (Freeze currently requires a C compiler; > that's bad.) [...] This may be off-topic, but has anyone considered what it would take to load shared libs out of an archive? One way is to extract on-the-fly to a temporary area. A refinement is to leave extracted files there as cache, and perhaps even to extract to a file with a name derived from its MD5 digest (this way multiple users and even Python installations can share the cache). Would it be useful to define a "standard" area? -- Jean-Claude From gmcm@hypernet.com Thu Dec 2 23:15:50 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Thu, 2 Dec 1999 18:15:50 -0500 Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: <199912022043.PAA15108@eric.cnri.reston.va.us> References: Your message of "Fri, 19 Nov 1999 05:29:50 PST." Message-ID: <1267945992-2611810@hypernet.com> [Guido] big snip > Note that the interpretation of __file__ could be problematic. > To what value do you set __file__ for a module loaded from a zip > archive? I just left it alone (ie, as it was when I picked up the .pyc). 
Turns out OK, because then when the end user files a bug report, the
developer can track it down.

> Note: I looked at the doc string for get_code() and I don't
> understand what the difference is between the modname and fqname
> arguments. If I write "import foo.bar", what are modname and
> fqname?

As I recall:

import foo.bar
 -> get_code(None, 'foo', 'foo') # returns foo
 -> get_code(, 'bar', 'foo.bar')

> Why are both present?

I think so the importer can choose between being tree structured or
flat.

> I'd like to see a description of how someone like Jim A would
> build a single-file application using the new mechanism. This
> could completely replace freeze. (Freeze currently requires a C
> compiler; that's bad.)

I have something working for Linux now. I froze exceptions.py. I
hacked getpath.c so prefix = exec_prefix = executable's directory and
the starting path is [prefix]. Although I did it differently, you
could regard imputil.py and archive.py as frozen, too. (On Windows
it's somewhat different, because the result uses the stock
python15.dll.)

This somewhat oversimplifies; and I haven't really thought out all the
ways people might try to use sym links. I'm inclined to think the
starting path should contain both the executable's real directory and
the sym link's directory.

> .... I do withdraw the composition
> requirement though.

Hooray!

- Gordon

From gstein@lyra.org Fri Dec 3 00:19:14 1999
From: gstein@lyra.org (Greg Stein)
Date: Thu, 2 Dec 1999 16:19:14 -0800 (PST)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <384694D8.DCA3D75E@lemburg.com>
Message-ID:

On Thu, 2 Dec 1999, M.-A. Lemburg wrote:
>...
> Still, I would like to rephrase my 0.02EUR which I already
> posted twice... why not start to think about what these
> importers would do first ? If there are only a handful of
> wishes we could just add them to the builtin machinery and
> be done with it...
I'd rather see the builtin machinery move to Python, regardless of what
system is used and/or what features are added.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From gstein@lyra.org Fri Dec 3 03:19:40 1999
From: gstein@lyra.org (Greg Stein)
Date: Thu, 2 Dec 1999 19:19:40 -0800 (PST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912022043.PAA15108@eric.cnri.reston.va.us>
Message-ID:

On Thu, 2 Dec 1999, Guido van Rossum wrote:
>...
> Sometime, Greg Stein wrote:
>...
> > On Thu, 18 Nov 1999, Guido van Rossum wrote:
>...
> > > Agreed. I like some of imputil's features, but I think the API
> > > needs to be redesigned.
> >
> > In what ways? It sounds like you've applied some thought. Do you have any
> > concrete ideas yet, or "just a feeling" :-) I'm working through some
> > changes from JimA right now, and would welcome other suggestions. I think
> > there may be some outstanding stuff from MAL, but I'm not sure (Marc?)
>
> I actually think that the way the PVM (Python VM) calls the importer
> ought to be changed. Assigning to __builtin__.__import__ is a crock.
> The API for __import__ is a crock.

Something like sys.set_import_hook() ?

The other alternative that I see would be to have the C code scan
sys.importers, assuming each are callable objects, and call them with the
appropriate params (e.g. module name). Of course, to move this scanning
into Python would require something like sys.set_import_hook() unless
Python looks for a hard-coded module and entrypoint.

>...
> > Which APIs are you referring to? The "imp" module? The C functions? The
> > __import__ and reload builtins?
>
> > I'm guessing some of imp, the two builtins, and only one or two C
> > functions.
>
> All of those.

We can provide Python code to provide compatibility for "imp" and the two
hooks. Nothing we can do to the C code, though. I'm not sure what the
import API looks like from C, and whether they could all stay. A brief
glance looks like most could stay.
[ removing any would change Python's API version, which might be "okay" ] >... > > > - load .py/.pyc/.pyo files and shared libraries from files > > > > No problem. Again, a function is needed for platform-specific loading of > > shared libraries. > > Is it useful to expose the platform differences? The current > imp.load_dynamic() should suffice. This comes up several times throughout this message, and in some off-list mail Guido and I have exchanged. Namely, "should dynamic loading be part of the core, or performed via a module?" I would rather see it become a module, rather than inside the core (despite the fact that the module would have to be compiled into the interpreter). I believe this provides more flexibility for people looking to replace/augment/update/fix dynamic loading on various architectures. Rather than changing the core, a person can just drop in another module. The isolation between the core and modules is nicer, aesthetically, to me. The modules would also be exposing Just Another Importer Function, rather than a specialized API in the builtin imp module. Also note that it is easier to keep a module *out* of a Python-based application, than it is to yank functions out of the core of Python. Frozen apps, embedded apps, etc could easily leave out dynamic loading. Are there strict advantages? Not any that I can think of right now (beyond a bit of ease-of-use mentioned above). It just feels better to me. >... > > > - sys.path and sys.modules should still exist; sys.path might > > > have a slightly different meaning > > > > I would suggest that both retain their *exact* meaning. We introduce > > sys.importers -- a list of importers to check, in sequence. The first > > importer on that list uses sys.path to look for and load modules. The > > second importer loads builtins and frozen code (i.e. modules not on > > sys.path). > > This is looking like the redesign I was looking for. 
(Note that > imputil's current chaining is not good since it's impossible to remove > or reorder importers, which I think is a required feature; an explicit > list would solve this.) The chaining is an aspect of the current, singular import hook that Python uses. In the past, I've suggested the installation of a "manager" that maintains a list. sys.importers is similar in practice. Note that this Manager would be present with the sys.set_import_hook() scheme, while the Manager is implied if the core scans sys.importers. > Actually, the order is the other way around, but by now you should > know that. It makes sense to have separate ones for builtin and > frozen modules -- these have nothing in common. Yes, JimA pointed this out. The latest imputil has corrected this. I combined the builtin and frozen Importers because they were just so similar. I didn't want to iterate over two Importers when a single one sufficed quite well. *shrug* Could go either way, really. > There's another issue, which isn't directly addressed by imputil, > although with clever use of inheritance it might be doable. I'd like > more support for this however. Quite orthogonally to the issue of > having separate importers, I might want to recognize new extensions. Correct: while imputil doesn't address this, the standard/default Importer classes *definitely* can. >... > the directory/directories with .isl files are placed.) This requires > an ugly modification to the _fs_import() function. (Which should have > been a method, by the way, to make overriding it in a subclass of > PathImporter easier!) I yanked that code out of the DirectoryImporter so that the PathImporter could use it. I could see a reorg that creates a FileSystemImporter that defines the method, and the other two just subclass from that. > I've been thinking here along the lines of a strategy where the > standard importer (the one that walks sys.path) has a set of hooks > that define various things it could look for, e.g. 
.py files, .pyc
> files, .so or .dll files. This list of hooks could be changed to
> support looking for .isl files.

Agreed. It should be easy to have a mapping of extension to handler. One
issue: should there be an ordering to the extensions? Exercise for the
reader to alter the data structures...

> There's an old, subtle issue that could be solved through this as
> well: whether or not a .pyc file without a .py file should be accepted
> or not. Long ago (in Python 0.9.8) a .pyc file alone would never be
> loaded. This was changed at the request of a small but vocal minority
> of Python developers who wanted to distribute .pyc files without .py
> files. It has occasionally caused frustration because sometimes
> developers move .py files around but forget to remove the .pyc files,
> and then the .pyc file is silently picked up if it occurs on sys.path
> earlier than where the .py was moved to.

I think, "too bad for them." :-)

Having just a .pyc is a very nice feature. But how can you tell whether it
was meant to be a plain .pyc or a mis-ordered one? To truly resolve that,
you would need to scan the whole path, looking for a .py. However, maybe
somebody put the .pyc there on purpose, to override the .py!

--- begin slightly-off-topic ---

Here is a neat little Bash script that allows you to use a .pyc as a CGI
(to avoid parse overhead). Normally, you can't just drop a .pyc into the
cgi-bin directory because the OS doesn't know how to execute it. Not a
problem, I say... just append your .pyc to the following Bash script and
execute! :-)

#!/bin/bash
exec - 3< $0 ; exec python -c 'import os,marshal ; f = os.fdopen(3, "rb") ; f.readline() ; f.readline() ; f.seek(8, 1) ; _c = marshal.load(f) ; del os, marshal, f ; exec _c' $@

(the script should be two lines; and no... you can't use readlines(2))

The above script will preserve stdin, stdout, and stderr. If the caller
also uses 3< ...
well, that got overridden :-) The script doesn't work on Windows for two reasons, though: 1) Bash, 2) the "rb" mode followed by readline() Detailed info at the bottom of http://www.lyra.org/greg/python/ --- end of off-topic --- > Having a set of hooks for various extensions would make it possible to > have a default where lone .pyc files are ignored, but where one can > insert a .pyc importer in the list of hooks that does the right thing > here. (Of course, it may be possible that this whole feature of lone > .pyc files should be replaced since the same need is easily taken care > of by zip importers. Maybe. I'd still like to see plain .pyc files, but I know I can work around any change you might make here :-) (i.e. whatever you'd like to do... go for it) > I also want to support (Jim A notwithstanding :-) a feature whereby > different things besides directories can live on sys.path, as long as > they are strings -- these could be added from the PYTHONPATH env > variable. Every piece of code that I've ever seen that uses sys.path > doesn't care if a directory named in sys.path doesn't exist -- it may > try to stat various files in it, which also don't exist, and as far as > it is concerned that is just an indication that the requested module > doesn't live there. I'm not in favor of this, but it is more-than-doable. Again: your discretion... > Again, we would have to dissect imputil to support various hooks that > deal with different kind of entities in sys.path. The default hook > list would consist of a single item that interprets the name as a > directory name; other hooks could support zip files or URLs. Jack's > "magic cookies" could also be supported nicely through such a > mechanism. Specifically, the PathImporter would get "dissected" :-). No problem. > > Users can insert/append new importers or alter sys.path as before. > > > > sys.modules continues to record name:module mappings. > > Yes. > > Note that the interpretation of __file__ could be problematic. 
To > what value do you set __file__ for a module loaded from a zip archive? You don't (certainly in a way that is nice/compatible for modules that refer to it). This is why I don't like __file__ and __path__. They just don't make sense in archives or frozen code. Python code that relies on them will create problems when that code is placed into different packaging mechanisms. >... > > > (I wouldn't mind a splitting up of importdl.c into several > > > platform-specific files, one of which is chosen by the configure > > > script; but that's a bit of a separate issue.) > > > > Easy enough. The standard importer can select the appropriate > > platform-specific module/function to perform the load. i.e. these can move > > to Modules/ and be split into a module-per-platform. > > Again: what's the advantage of exposing the platform specificity? See above. >... > Probably more support is required from the other end: once it's common > for modules to be imported from zip files, the distutil code needs to > support the creation and installation of such zip files. Also, there > is a need for the install phase of distutil to communicate the > location of the zip file to the Python installation. I'm quite confident that something can be designed that would satisfy the needs here. Something akin to .pth files that a zip importer could read. >... > > > - Standard import from zip or jar files, in two ways: > > > > > > (1) an entry on sys.path can be a zip/jar file instead of a directory; > > > its contents will be searched for modules or packages > > Note that this is what I mention above for distutil support. > > > While this could easily be done, I might argue against it. Old > > apps/modules that process sys.path might get confused. > > Above I argued that this shouldn't be a problem. For most code, no, but as Fred mentioned (and I surmise), there are things out there assuming that sys.path contains strings which specify directories. 
Sure, we can do this (your discretion), but my feeling is to avoid it. > > If compatibility is not an issue, then "No problem." > > > > An alternative would be an Importer instance added to sys.importers that > > is configured for a specific archive (in other words, don't add the zip > > file to sys.path, add ZipImporter(file) to sys.importers). > > This would be harder for distutil: where does Python get the initial > list of importers? Default is just the two: BuiltinImporter and PathImporter. Adding ZipImporters (or anything else) at startup is TBD, but shouldn't pose a problem. >... > > > (2) a file in a directory that's on sys.path can be a zip/jar file; > > > its contents will be considered as a package (note that this is > > > different from (1)!) > > > > No problem. This will slow things down, as a stat() for *.zip and/or *.jar > > must be done, in addition to *.py, *.pyc, and *.pyo. > > Fine, this is where the caching comes in handy. IFF caching is enabled for the particular platform and installation. >... > > The Importer class is already designed for subclassing (and its interface > > is very narrow, which means delegation is also *very* easy; see > > imputil.FuncImporter). > > But maybe it's *too* narrow; some of the hooks I suggest above seem to > require extra interfaces -- at least in some of the subclasses of the > Importer base class. Correct -- the *subclasses*. I still maintain the imputil design of a single hook (get_code) is Right. I'll make a swipe at PathImporter in the next few weeks to add the capability for new extensions. > Note: I looked at the doc string for get_code() and I don't understand > what the difference is between the modname and fqname arguments. If I > write "import foo.bar", what are modname and fqname? Why are both > present? Also, while you claim that the API is narrow, the multiple > return values (also the different types for the second item) make it > complicated. Gordon detailed this in another note... 
Yes, the multiple return values make it a bit more complicated, but I can't think of any reasonable alternatives. A bit more doc should do the trick, I'd guess. >... > > > - a hook to auto-generate .py files from other filename > > > extensions (as currently implemented by ILU) > > > > No problem at all. > > See above -- I think this should be more integrated with sys.path than > you are thinking of. The more I think about it, the more I see that > the problem is that for you, the importer that uses sys.path is a > final subclass of Importer (i.e. it is itself not further subclassed). > Several of the hooks I want seem to require additional hooks in the > PathImporter rather than new importers. Correct -- I've currently designed/implemented PathImporter as "final". I don't foresee a problem turning it into something that can be hooked at run-time, or subclassed at code-time. A detailing of the features needed would be handy: * allow alternative file suffixes, with functions or subclasses to map the file into a code/module object. >... > > > - Note that different kinds of hooks should (ideally, and within > > > reason) properly combine, as follows: if I write a hook to recognize > > > .spam files and automatically translate them into .py files, and you > > > write a hook to support a new archive format, then if both hooks are > > > installed together, it should be possible to find a .spam file in an > > > archive and do the right thing, without any extra action. Right? > > > > Ack. Very, very difficult. > > Actually, I take most of this back. Importers that deal with new > extension types often have to go through a file system to transform > their data to .py files, and this is just too complicated. However it > would be still nice if there was code sharing between the code that > looks for .py and .pyc files in a zip archive and the code that does > the same in a filesystem.
Hm, maybe even that shouldn't be necessary, > the zip file probably should contain only .pyc files... Gordon replies to this... All of the archives that myself, Gordon, and JimA have been using only store .pyc files. I don't see much code sharing between the filesystem and archive import code. >... > > All is not lost, however. I can easily envision the get_code() hook as > > allowing any kind of return type. If it isn't a code or module object, > > then another hook is called to transform it. > > [ actually, I'd design it similarly: a *series* of hooks would be called > > until somebody transforms the foo.spam into a code/module object. ] > > OK. This could be a feature of a subclass of Importer. That would be my preference, rather than loading more into the Importer base class itself. >... > > > - It should be possible to write hooks in C/C++ as well as Python > > > > Use FuncImporter to delegate to an extension module. > > Maybe not so great, since it sounds like the C code can't benefit from > any of the infrastructure that imputil offers. I'm not sure about > this one though. There isn't any infrastructure that needs to be accessed. get_code() is the call-point, and there is no mechanism provided to the callee to call back into the imputil system. > > This is one of the benefits of imputil's single/narrow interface. > > Plus its vague specs? :-) Ouch. I thought I was actually doing quite a bit better than normal with that long doc-string on get_code :-( >... > > For a restricted execution app, it might install an Importer that loads > > files from *one* directory only which is configured from a specific > > Win32 Registry entry. That importer could also refuse to load shared > > modules. The BuiltinImporter would still be present (although the app > > would certainly omit all but the necessary builtins from the build). > > Frozen modules could be excluded. 
> > Actually there's little reason to exclude frozen modules or any > .py/.pyc modules -- by definition, bytecode can't be dangerous. It's > the builtins and extensions that need to be censored. > > We currently do this by subclassing ihooks, where we mask the test for > builtins with a comparison to a predefined list of names. True. My concern is an invader misusing one "type" of module for another. For example, let's say you've provided a selection of modules each exporting function FOO, and the user can configure which module to use. Can they do damage if some unrelated, frozen module also exports FOO? Minor issue, anyhow. All the functionality is there. >... > > I posited once before that the cost of import is mostly I/O rather than > > CPU, so using Python should not be an issue. MAL demonstrated that a good > > design for the Importer classes is also required. Based on this, I'm a > > *strong* advocate of moving as much as possible into Python (to get > > Python's ease-of-coding with little relative cost). > > Agreed. However, how do you explain the slowdown (from 9 to 13 > seconds I recall) though? Are you a lousy coder? :-) Heh :-) I have not spent *any* time working on optimization. Currently, each Importer in the chain redoes some work of the prior Importer. A bit of restructuring would split the common work out to a Manager, which then calls a method in the Importer (and passes all the computed work). Of course, a bit of profiling wouldn't hurt either. Some of the "imp" interfaces could possibly be refined to better support the BuiltinImporter or the dynamic load features. The question is still valid, though -- at the moment, I can't explain it because I haven't looked into it. > > The (core) C code should be able to search a path for a module and import > > it. It does not require dynamic loading or packages. This will be used to > > import exceptions.py, then imputil.py, then site.py. 
Note: after writing this, I realized there is really no need for the core to do the imputil import. site.py can easily do that. > It does, however, need to import builtin modules. imputil currently Correct. > imports imp, sys, strop and __builtin__, struct and marshal; note that > struct can easily be a dynamic loadable module, and so could strop in > theory. (Note that strop will be unnecessary in 1.6 if you use string > methods.) I knew about strop, but imputil would be harder to use today if it relied on the string methods. So... I've delayed that change. The struct module is used in a couple teeny cases, dealing with constructing a network-order, 4-byte, binary integer value. It would be easy enough to just do that with a bit of Python code instead. > I don't think that this chicken-or-egg problem is particularly > problematic though. Right. In my ideal world, the core couldn't do a dynamic load, so that would need to be considered within the bootstrap process. >... > > site.py can complete the bootstrap by setting up sys.importers with the > > appropriate Importer instances (this is where an application can define > > its own policy). sys.path was initially set by the import.c bootstrap code > > (from the compiled-in path and environment variables). > > I think that algorithm (currently in getpath.c / getpathp.c) might > also be moved to Python code -- imported frozen. Sadly, rebuilding > with a new version of a frozen module might be more complicated than > rebuilding with a new version of a C module, but writing and > maintaining this code in Python would be *sooooooo* much easier that I > think it's worth it. I think we can find a better way to freeze modules and to use them. Especially for the cases where we have specific "core" functions implemented in Python. (e.g. freezing parsers, compilers, and/or the read-eval loop) I don't foresee an issue that the build process becomes more complicated.
If we nuke "makesetup" in favor of a Python script, then we could create a stub Python executable which runs the build script which writes the Setup file and the getpath*.c file(s). > > Note that imputil.py would not install any hooks when it is loaded. That > > is up to site.py. This implies the core C code will import a total of > > three modules using its builtin system. After that, the imputil mechanism > > would be importing everything (site.py would .install() an Importer which > > then takes over the __import__ hook). > > (Three not counting the builtin modules.) Correct, although I'll modify my statement to "two plus the builtins". > > Further note that the "import" Python statement could be simplified to use > > only the hook. However, this would require the core importer to inject > > some module names into the imputil module's namespace (since it couldn't > > use an import statement until a hook was installed). While this > > simplification is "neat", it complicates the run-time system (the import > > statement is broken until a hook is installed). > > Same chicken-or-egg. We can be pragmatic. > > For a developer, I'd like a bit of robustness (all this makes it > rather hard to debug a broken imputil, and that's a fair amount of > code!). True. I threw that out as an alternative, and then presented the counter argument :-) >... > > Therefore, the core C code must also support importing builtins. "sys" and > > "imp" are needed by imputil to bootstrap. > > > > The core importer should not need to deal with dynamic-load modules. > > Same question. Since that all has to be coded in C anyway, why not? It simplifies the core's import code to not deal with that stuff at all. > > To support frozen apps, the core importer would need to support loading > > the three modules as frozen modules. > > I'd like to see a description of how someone like Jim A would build a > single-file application using the new mechanism. This could > completely replace freeze. 
(Freeze currently requires a C compiler; > that's bad.) The portable mechanism for freezing will always need a compiler. Platform specific mechanisms (e.g. append to the .EXE, or use the linker to create a new ELF section) can optimize the freeze process in different ways. I don't have a design in my head for the freeze issues -- I've been considering that the mechanism would remain about the same. However, I can easily see that different platforms may want to use different freeze processes... hmm... >... > > Yes. I don't see this as a requirement, though. We wouldn't start to use > > these by default, would we? Or insist on zlib being present? I see this as > > more along the lines of "we have provided a standardized Importer to do > > this, *provided* you have zlib support." > > Agreed. Zlib support is easy to get, but there are probably platforms > where it's not. (E.g. maybe the Mac? I suppose that on the Mac, > there would be some importer classes to import from a resource fork.) Exactly. And importer classes to load from a Win32 resources (modifying a .EXE's resources post-link is cleaner than the append solution) >... > > My outline above does not freeze anything. Everything resides in the > > filesystem. The C code merely needs a path-scanning loop and functions to > > import .py*, builtin, and frozen types of modules. > > Good. Though I think there's also a need for freezing everything. > And when we go the route of the zip archive, the zip archive handling > code needs to be somewhere -- frozen seems to be a reasonable choice. Sure. > > If somebody nukes their imputil.py or site.py, then they return to Python > > 1.4 behavior where the core interpreter uses a path for importing (i.e. no > > packages). They lose dynamically-loaded module support. > > But if the path guessing is also done by site.py (as I propose) the > path will probably be wrong. A warning should be printed. All right. Doesn't Python already print a warning if it can't find site.py? 
> > > Let's first complete the requirements gathering. Are these > > > requirements reasonable? Will they make an implementation too > > > complex? Am I missing anything? > > > > I'm not a fan of the compositing due to it requiring a change to semantics > > that I believe are very useful and very clean. However, I outlined a > > possible, clean solution to do that (a secondary set of hooks for > > transforming get_code() return values). > > As you may see from my responses, I'm a big fan of having several > different sets of hooks. Yes. However, I've only recognized one so far. Propose more... I'm confident we can update the PathImporter design to accommodate (and retain the underlying imputil paradigm). > I do withdraw the composition requirement > though. :-) >... > > Once you hit site.py, you have a "full" environment and can easily detect > > and import a read-eval-print loop module (i.e. why return to Python? just > > start things up right there). > > You mean "why return to C?" I agree. It would be cool if somehow Heh. Yah, that's what I meant :-) > IDLE and Pythonwin would also be bootstrapped using the same > mechanisms. (This would also solve the question "which interactive > environment am I using?" that some modules and apps want to see > answered because they need to do things differently when run under > IDLE, for example.) Haven't thought on this. Should be doable, I'd think. > > site.py can also install new optimizers as desired, a new Python-based > > parser or compiler, or whatever... If Python is built without a parser or > > compiler (I hope that's an option!), then the three startup modules would > > simply be frozen into the executable. > > More power to hooks! :-) You betcha! I believe my next order of business: * update PathImporter with the file-extension hook * dynload C code reorg, per the other email * create new-model site.py and trash import.c * review freeze mechanisms and process * design mechanism for frozen core functionality (eg.
getpath*.c) (coding and building design) * shift core functions to Python, using above design I'll just plow ahead, but also recognize that any/all may change. ie. I'll build examples/finals/prototypes and Guido can pick/choose/reimplement/etc as needed. I'm out next week, but should start on the above items by the end of the month (will probably do another mod_dav release in there somewhere). Cheers, -g -- Greg Stein, http://www.lyra.org/ From fredrik@pythonware.com Fri Dec 3 10:10:10 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 3 Dec 1999 11:10:10 +0100 Subject: [Python-Dev] Import redesign [LONG] References: <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com> Message-ID: <023601bf3d78$0ec3dc30$f29b12c2@secret.pythonware.com> Jean-Claude Wippler wrote: > This may be off-topic, but has anyone considered what it would take to > load shared libs out of an archive? well, we do that in a number of applications. (lazy installers are really cool... if you've installed works, you've seen some weird stuff -- for example, when the application starts the first time, it's loading everything from inside the installer. the rest of the installation is done from within the application itself, using archives in the installation executable) I think things like this are better left for the application designers, though... From mal@lemburg.com Fri Dec 3 10:03:31 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 03 Dec 1999 11:03:31 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: Message-ID: <38479573.B2CFDD2B@lemburg.com> Greg Stein wrote: > > On Thu, 2 Dec 1999, M.-A. Lemburg wrote: > >... > > Still, I would like to rephrase my 0.02EUR which I already > > posted twice... why not start to think about what these > > importers would do first ? If there are only a handful of > > wishes we could just add them to the builtin machinery and > > be done with it... 
> > I'd rather see the builtin machinery move to Python, regardless of what > system is used and/or what features are added. In the long run that's probably the right direction, but right now we are only talking a very small set of additional features, which can easily be added to the existing code without too much fuzz. Plus it won't slow things down, which is important since Python startup time is already an issue all by itself. The imputil.py approach of doing (a whole bunch of) recursive Python function calls to all kinds of importers will not speed this up, I'm afraid. An on-disk lookup table would speed this up, but it would also break the current logic in imputil.py, which puts importer independence above all. -- IMHO, we should retreat to a more centralized interface, one which more resembles a manager rather than the agent interface implemented in imputil.py. Add-ons can then register themselves to say "hey, I can handle pyz-archives" or "I know how to import .so modules" or "I provide a search function which you can call to have me scan my module container (directory, web-site, archive)". The manager would take care of what to call and in which order, plus delegate requests to add-ons which implement the needed logic, e.g. add-ons for signature checking, unzipping archives, file system lookup tables, etc. It could also trace its actions and then keep an on-disk knowledge base for what it did in the past to find certain modules under certain conditions. Anyway, all this is extra magic for some future version of Python.
-- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 28 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido@CNRI.Reston.VA.US Fri Dec 3 13:45:07 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 03 Dec 1999 08:45:07 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: Your message of "Fri, 03 Dec 1999 11:03:31 +0100." <38479573.B2CFDD2B@lemburg.com> References: <38479573.B2CFDD2B@lemburg.com> Message-ID: <199912031345.IAA16376@eric.cnri.reston.va.us> [Greg] > > I'd rather see the builtin machinery move to Python, regardless of what > > system is used and/or what features are added. [Marc] > In the long run that's probably the right direction, but right now > we are only talking a very small set of additional features, > which can easily be added to the existing code without too much > fuzz. I disagree. We should do the redesign right rather than tweaking the existing code. > Plus it won't slow things down, which is important since > Python startup time is already an issue all by itself. The > imputil.py approach of doing (a whole bunch of) recursive Python > function calls to all kinds of importers will not speed this up, > I'm afraid. An on-disk lookup table would speed this up, but > it would also break the current logic in imputil.py, which > puts importer independence above all. I don't care about the current logic in imputil. It's only a prototype! > IMHO, we should retreat to a more centralized interface, > one which more resembles a manager rather than the agent > interface implemented in imputil.py. Add-ons can then > register themselves to say "hey, I can handle pyz-archives" > or "I know how to import .so modules" or "I provide a > search function which you can call to have me scan > my module container (directory, web-site, archive)". This makes sense.
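[Editor's illustration, not part of the original message.] The registration scheme quoted above might look roughly like the sketch below. It is written in present-day Python rather than the 1.5-era dialect of the thread, and every name in it (ImportManager, register, find, the two toy containers) is hypothetical; none of it is imputil's API or any design actually proposed in this thread.

```python
class ImportManager:
    """Central registry that decides which add-on handles a module name."""

    def __init__(self):
        self._handlers = []   # (label, search function), tried in registration order
        self.trace = []       # (module name, handler label) for every successful lookup

    def register(self, label, search):
        """Register a search function: search(name) -> source text, or None."""
        self._handlers.append((label, search))

    def find(self, name):
        """Ask each registered add-on in turn and record which one answered."""
        for label, search in self._handlers:
            found = search(name)
            if found is not None:
                self.trace.append((name, label))
                return found
        raise ImportError("no registered handler found %r" % name)


# Two toy "module containers" standing in for a directory and a pyz archive.
directory = {"spam": "x = 1"}
archive = {"eggs": "y = 2"}

manager = ImportManager()
manager.register("directory", directory.get)
manager.register("pyz-archive", archive.get)

print(manager.find("eggs"))
print(manager.trace)
```

The point of the sketch is only the division of labour: add-ons supply a search function, while the manager owns the ordering and keeps the trace that an on-disk knowledge base could later be built from.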
> The manager would take care of what to call and in which > order, plus delegate requests to add-ons which implement > the needed logic, e.g. add-ons for signature checking, unzipping > archives, file system lookup tables, etc. > > It could also trace its actions and then keep an on-disk > knowledge base for what it did in the past to find certain > modules under certain conditions. > > Anyway, all this is extra magic for some future version of > Python. I would say the manager API design and a basic set of specific handlers should go into 1.6. --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik@pythonware.com Fri Dec 3 14:14:00 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 3 Dec 1999 15:14:00 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us> Message-ID: <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> MAL wrote: > > IMHO, we should retreat to a more centralized interface, > > one which more resembles a manager rather than the agent > > interface implemented in imputil.py. Add-ons can then > > register themselves to say "hey, I can handle pyz-archives" > > or "I know how to import .so modules" or "I provide a > > search function which you can call to have me scan > > my module container (directory, web-site, archive)". but why? in my small-minded view of how python works, an importer carries out a very simple task: given a name, check if you have a module with that name, and install it. if you cannot, fail (in which case python asks the next importer along the path). why do you have to complicate things beyond that? why not just let Python provide a few base classes and mixins for people who want to create custom importers, and be done with it? rationale, please. From jim@interet.com Fri Dec 3 14:34:40 1999 From: jim@interet.com (James C. 
Ahlstrom) Date: Fri, 03 Dec 1999 09:34:40 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> Message-ID: <3847D500.53833D06@interet.com> "M.-A. Lemburg" wrote: > > Greg Stein wrote: > > I'd rather see the builtin machinery move to Python, regardless of what > > system is used and/or what features are added. > > In the long run that's probably the right direction, but right now > we are only talking a very small set of additional features, > which can easily be added to the existing code without too much > fuzz. I volunteer to write a Python archive in either Python or C. In fact I currently have prototypes for both. But I have to agree with Greg here. I think a Python importer is the way to go. The C code is 300 lines mostly in import.c and parallel to existing code. The Python archive is about 100 lines and is prettier, easy to read, alter and re-use (obviously). > Plus it won't slow things down, which is important since > Python startup time is already an issue all by itself. The I think archive files should be able to be fast, and should help, not hurt, startup time. Provided that the use of sys.path is curtailed, os.readdir() is not needed, and the specifications are not complicated. Although archive files are my special concern, I realize that imputil is not just about archives. JimA From guido@CNRI.Reston.VA.US Fri Dec 3 14:39:25 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 03 Dec 1999 09:39:25 -0500 Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: Your message of "Thu, 02 Dec 1999 19:19:40 PST." References: Message-ID: <199912031439.JAA16524@eric.cnri.reston.va.us> Greg, Great response. I think we know where we each stand. Please go ahead with a new design. (That's trust, not carte blanche.) 
Just one thought: the more I think about it, the less I like sys.importers: functionality which is implemented through sys.importers must necessarily be placed either in front of all of sys.path or after it. While this is helpful for "canned" apps that want *everything* to be imported from a fixed archive, I think that for regular Python installations sys.path should remain the point of attack. In particular, installing a new package (e.g. PIL) should affect sys.path, regardless of the way of delivery of the modules (shared libs, .py files, .pyc files, or a zip archive). I'm not too worried about code that inspects sys.path and expects certain invariants; that code is most likely interfering with the import mechanism so should be revisited anyway. On the lone .pyc issue: I'd like to see this disappear when using the filesystem, I see no use for it there if we support .pyc files in zip archives. --Guido van Rossum (home page: http://www.python.org/~guido/) From jim@interet.com Fri Dec 3 14:44:54 1999 From: jim@interet.com (James C. Ahlstrom) Date: Fri, 03 Dec 1999 09:44:54 -0500 Subject: [Python-Dev] Import redesign [LONG] References: <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com> Message-ID: <3847D766.1E5FFAF3@interet.com> Jean-Claude Wippler wrote: > > Guido van Rossum wrote: > > [...] > > Note that the interpretation of __file__ could be problematic. To > > what value do you set __file__ for a module loaded from a zip archive? > > Makefiles use "archive(entry)" (this also supports nesting if needed). I discovered the hard way this entry is not optional. I just used the archive file name for __file__. > This may be off-topic, but has anyone considered what it would take to > load shared libs out of an archive? One way is to extract on-the-fly to > a temporary area. 
A refinement is to leave extracted files there as > cache, and perhaps even to extract to a file with a name derived from > its MD5 digest (this way multiple users and even Python installations > can share the cache). Would it be useful to define a "standard" area? IMHO putting shared libs in an archive is a bad idea because the OS can not use them there. They must be extracted as you say. But then storage is wasted by using space in the archive and the external file. Deleting them after use wastes time. Better to leave them out of the archive and provide for them in the installer. IMHO the archive is a basic simple feature, and people make installers on top of that. Archives shouldn't try to do it all. JimA From mal@lemburg.com Fri Dec 3 14:14:09 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 03 Dec 1999 15:14:09 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us> Message-ID: <3847D030.2C936E24@lemburg.com> Guido van Rossum wrote: > > [Greg] > > > I'd rather see the builtin machinery move to Python, regardless of what > > > system is used and/or what features are added. > > [Marc] > > In the long run that's probably the right direction, but right now > > we are only talking a very small set of additional features, > > which can easily be added to the existing code without too much > > fuzz. > > I disagree. We should do the redesign right rather than tweaking the > existing code. Ok, then... > > IMHO, we should retreat to a more centralized interface, > > one which more resembles a manager rather than the agent > > interface implemented in imputil.py. Add-ons can then > > register themselves to say "hey, I can handle pyz-archives" > > or "I know how to import .so modules" or "I provide a > > search function which you can call to have me scan > > my module container (directory, web-site, archive)". > > This makes sense.
> > > The manager would take care of what to call and in which > > order, plus delegate requests to add-ons which implement > > the needed logic, e.g. add-ons for signature checking, unzipping > > archives, file system lookup tables, etc. > > > > It could also trace its actions and then keep an on-disk > > knowledge base for what it did in the past to find certain > > modules under certain conditions. > > > > Anyway, all this is extra magic for some future version of > > Python. > > I would say the manager API design and a basic set of specific > handlers should go into 1.6. BTW, is there a timeline for the 1.6 release ? I mean which things will have to be in 1.6 ? Some recent topics as hints: 1. Unicode 2. Import Manager API + default handlers 3. Python style coercion at C type level 4. Rich comparisons 5. __doc__ string extraction tool -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 28 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal@lemburg.com Fri Dec 3 14:24:04 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 03 Dec 1999 15:24:04 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> Message-ID: <3847D284.8CBF2A9C@lemburg.com> Fredrik Lundh wrote: > > MAL wrote: > > > IMHO, we should retreat to a more centralized interface, > > > one which more resembles a manager rather than the agent > > > interface implemented in imputil.py. Add-ons can then > > > register themselves to say "hey, I can handle pyz-archives" > > > or "I know how to import .so modules" or "I provide a > > > search function which you can call to have me scan > > > my module container (directory, web-site, archive)". > > but why? 
in my small-minded view of how python > works, an importer carries out a very simple task: > > given a name, check if you have a > module with that name, and install > it. if you cannot, fail (in which case > python asks the next importer along > the path). > > why do you have to complicate things beyond that? > why not just let Python provide a few base classes > and mixins for people who want to create custom > importers, and be done with it? Because importing in Python has become *much* more complicated over time. There are requests for new features which touch subjects such as storage mechanisms, lookups, signatures (for trusted code), lazy imports, etc. A chain of simple minded importers won't work together too well, duplicate work and downgrade performance considerably due to the many recursive function calls. Also, centralized caching strategies are hard to implement across import handlers. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 28 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From jeremy@cnri.reston.va.us Fri Dec 3 16:47:54 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Fri, 3 Dec 1999 11:47:54 -0500 (EST) Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: <14406.58137.359127.921135@weyr.cnri.reston.va.us> References: <199912022043.PAA15108@eric.cnri.reston.va.us> <14406.58137.359127.921135@weyr.cnri.reston.va.us> Message-ID: <14407.62522.360386.757519@goon.cnri.reston.va.us> >>>>> "FLD" == Fred L Drake, writes: >> (Unrelated remark: I should really try to release the set of >> modules we've written here at CNRI to deal with zip files. >> Unfortunately zip files are hairy and so is our code.) FLD> It doesn't help that that code just plain stinks. I maintain FLD> that no one here understands the whole of it. I'm all for improving the code and getting it out. 
The real problem is that interfaces have been glommed on for every new use of a Zip file. (You want to read one off a socket and extract files before you've got the whole thing? No problem! Add a new class.) We need to figure out the common patterns for using the archives and write a new set of interfaces to support that. Jeremy From guido@CNRI.Reston.VA.US Fri Dec 3 17:12:07 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 03 Dec 1999 12:12:07 -0500 Subject: [Python-Dev] What to do with our Zip code? In-Reply-To: Your message of "Fri, 03 Dec 1999 11:47:54 EST." <14407.62522.360386.757519@goon.cnri.reston.va.us> References: <199912022043.PAA15108@eric.cnri.reston.va.us> <14406.58137.359127.921135@weyr.cnri.reston.va.us> <14407.62522.360386.757519@goon.cnri.reston.va.us> Message-ID: <199912031712.MAA17061@eric.cnri.reston.va.us> [Jeremy, on our Zip code] > I'm all for improving the code and getting it out. The real problem > is that interfaces have been glommed on for every new use of a Zip > file. (You want to read one off a socket and extract files before > you've got the whole thing? No problem! Add a new class.) We need to > figure out the common patterns for using the archives and write a new > set of interfaces to support that. If we gave you the code we currently have, would someone else in this forum be willing to redesign it? Eventually it would become part of the Python distribution. --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik@pythonware.com Sat Dec 4 09:54:30 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Sat, 4 Dec 1999 10:54:30 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> Message-ID: <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> M.-A. 
Lemburg wrote: > > given a name, check if you have a > > module with that name, and install > > it. if you cannot, fail (in which case > > python asks the next importer along > > the path). > > > > why do you have to complicate things beyond that? > > why not just let Python provide a few base classes > > and mixins for people who want to create custom > > importers, and be done with it? > > Because importing in Python has become *much* more > complicated over time. There are requests for new > features which touch subjects such as storage mechanisms, > lookups, signatures (for trusted code), lazy imports, etc. sorry, I still don't understand it. our applications already use different storage mechanisms, databases, signatures, lazy importing, version handling, etc, etc. now, if *we* have managed to build all that on top of an old version of imputil.py, how come it's not sufficient for the rest of you? > A chain of simple minded importers won't work together > too well why? it sure works for us... > duplicate work avoiding duplicate work is what object oriented design is all about. and last time I checked, Python had excellent support for that. > and downgrade performance considerably due to the > many recursive function calls now that's what I call premature optimization. and this scares the hell out of me: if the rest of the python-dev crowd don't seriously believe that Python is (or can be made) fast enough to implement things like this, why the heck are you using Python at all? am I the only one here who doesn't believe in osterhout's talk about "the great system vs. scripting language divide"? 
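The importer contract quoted in this message ("given a name, check if you have a module with that name, and install it. if you cannot, fail (in which case python asks the next importer along the path)") can be sketched roughly as follows. This is only an illustration in present-day Python, not the imputil.py code under discussion: `DictImporter`, `get_module` and `import_from_chain` are hypothetical names, and the sketch does not register anything in `sys.modules`.

```python
import types


class DictImporter:
    """Hypothetical importer that serves modules from a {name: source} dict."""

    def __init__(self, sources):
        self.sources = sources

    def get_module(self, name):
        source = self.sources.get(name)
        if source is None:
            return None  # "if you cannot, fail"
        module = types.ModuleType(name)
        exec(source, module.__dict__)  # run the source in the new module's namespace
        return module


def import_from_chain(name, importers):
    """Ask each importer in order; the first one that has the module wins."""
    for importer in importers:
        module = importer.get_module(name)
        if module is not None:
            return module
    raise ImportError("no importer could provide %r" % name)


first = DictImporter({"spam": "VALUE = 1"})
second = DictImporter({"spam": "VALUE = 2", "eggs": "VALUE = 3"})
chain = [first, second]

spam = import_from_chain("spam", chain)  # found by the first importer
eggs = import_from_chain("eggs", chain)  # first importer fails, second succeeds
print(spam.VALUE, eggs.VALUE)  # 1 3
```

Each importer here is independent of the others, which is the property the "simple" side of the thread is defending; the order of the list plays the role of the path.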
From fredrik@pythonware.com Sat Dec 4 09:54:42 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Sat, 4 Dec 1999 10:54:42 +0100 Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT] References: <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com> <3847D766.1E5FFAF3@interet.com> Message-ID: <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com> James C. Ahlstrom wrote: > IMHO putting shared libs in an archive is a bad idea because the OS > can not use them there. They must be extracted as you say. But then > storage is wasted by using space in the archive and the external file. > Deleting them after use wastes time. Better to leave them out of the > archive and provide for them in the installer. IMHO the > archive is a basic simple feature, and people make installers on top > of that. Archives shouldn't try to do it all. have you tried it? if not, why do you think you should be allowed to forbid others from doing it? in "the inmates are running the asylum", alan cooper points out that the *major* reason people all over the world love web applications are that there are no bloody installers. and here you are advocating that we all should be forced to use installers, when python makes it trivial to write self-installing apps. double-argh! (on the other hand, why do I complain? all pythonworks customers is going to be able to do all this anyway...). frankly, this "design by committee" (or is it "design by people who've never even been close to implementing something because they thought it was too hard, and thus think they're qualified to argue against those of us who didn't even realize that it was a hard problem"?) trend I've been seeing in all kinds of python forums makes me sooooo sad. the more of this I see (dist- utils-sig, doc-sig, here, c.l.python), the sadder I get, and the more I sympathise with John Skaller who's defining his own python-like universe... 
if someone needs me, I'll be down in the pub having a beer with the mad scientist, the shiny eff-bot, and mr. nitpicker. if we're not there, you'll find us in the lab, working on new string matching facilities for 1.6, SOAP [1], tkinter replacements for the masses, and whatever else we can come up with... see you! 1) http://www.newsalert.com/bin/story?StoryId=Coenz0bWbu0znmdKXqq From gstein@lyra.org Sat Dec 4 10:42:27 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 4 Dec 1999 02:42:27 -0800 (PST) Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order) In-Reply-To: <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> Message-ID: On Sat, 4 Dec 1999, Fredrik Lundh wrote: > M.-A. Lemburg wrote: >... > > Because importing in Python has become *much* more > > complicated over time. There are requests for new > > features which touch subjects such as storage mechanisms, > > lookups, signatures (for trusted code), lazy imports, etc. > > sorry, I still don't understand it. our applications already > use different storage mechanisms, databases, signatures, > lazy importing, version handling, etc, etc. now, if *we* > have managed to build all that on top of an old version > of imputil.py, how come it's not sufficient for the rest > of you? I agree. The imputil mechanism has been proven in combat to work for many scenarios. I have not (yet) heard of a case where the model has proven insufficient. > > A chain of simple minded importers won't work together > > too well > > why? it sure works for us... Exactly. "Why?" Please provide an example. >... > > and downgrade performance considerably due to the > > many recursive function calls > > now that's what I call premature optimization. and this > scares the hell out of me: if the rest of the python-dev > crowd don't seriously believe that Python is (or can be > made) fast enough to implement things like this, why > the heck are you using Python at all? 
am I the only > one here who doesn't believe in osterhout's talk about > "the great system vs. scripting language divide"? Don't worry Fredrik... I'm with you on this one. I do not believe there is a problem with the speed. Nobody has yet profiled imputil to find out where/how the time is being spent. Nobody has tried to speed it up. Therefore, any claims about its performance are simply FUD. I claim that its interface is correct, and you (Fredrik) stated it well: "given a name, please give me a module if you can (otherwise None)." Underneath that semantic, there are a lot of things that can be done to alter the performance and organization. Claims about speed are entirely premature. Yes, I'm biased. But, in truth, I haven't seen a better mechanism yet. I've tossed out a few ideas on how imputil could be improved (which are solely based on guess, rather than empirical evidence of profiling output). When those changes are completed and there is still an issue, then I'll admit defeat and wait for somebody else to provide a new design. Cheers, -g -- Greg Stein, http://www.lyra.org/ From Vladimir.Marangozov@inrialpes.fr Sat Dec 4 11:15:53 1999 From: Vladimir.Marangozov@inrialpes.fr (Vladimir Marangozov) Date: Sat, 4 Dec 1999 12:15:53 +0100 (CET) Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT] In-Reply-To: <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com> from "Fredrik Lundh" at Dec 04, 1999 10:54:42 AM Message-ID: <199912041115.MAA00539@python.inrialpes.fr> Fredrik Lundh wrote: > [snip] > > > > frankly, this "design by committee"... [snip] > ... see you! > > > C'mon /F, it's a battle of ideas and that's the way it works before filtering the good ones from the bad ones, then focusing on the appropriate implementation. I'm in sync with the discussion, although I haven't posted my partial notes on it due to lack of time. But let me say that overall, this discussion is a good thing and the more opinions we get, the better. 
BTW, you just _can't_ leave like this and start playing solitaire at the bar, first, because we need beer too and it's unlikely that you'll find a bar we don't know already, and second, because it was you who revived this discussion with 1 word, repeated 3 times: > Subject: Re: [Python-Dev] Python 1.6 status > Date: Wed, 17 Nov 1999 12:46:01 +0100 > > Guido van Rossum wrote: > > - suggestions for new issues that maybe ought to be settled in 1.6 > > three things: imputil, imputil, imputil > > Thus, with no visible argumentation (so don't shoot on others when they argue instead of you), and with this one word, you pushed Guido to the extreme of suggesting a complete redesign of the import machinery from scratch, based on a "Grand Architecture" :-). Right? -- Right! This is a fact and a fairly amount of the credits go entirely to you! Since then, however, I haven't really seen your arguments, and I believe that nobody here got exactly your point. I, for one, may well argue against imputil as being just another brick on top of the grand mess. But because I haven't made the time to write properly my notes, I don't dare to express a partial opinion, not blame those who argue good or bad in the meantime, when I'm silent. So, why are you showing us your back when you have clearly something to say, but like me, you haven't made the time to say it? Please don't waste my time with emotional rants ;-). Everybody here tries to contribute according to its knowledge, experience and availability. Later, -- Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From mal@lemburg.com Sat Dec 4 10:45:52 1999 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Sat, 04 Dec 1999 11:45:52 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> Message-ID: <3848F0E0.B8132AD2@lemburg.com> Fredrik Lundh wrote: > > M.-A. Lemburg wrote: > > > given a name, check if you have a > > > module with that name, and install > > > it. if you cannot, fail (in which case > > > python asks the next importer along > > > the path). > > > > > > why do you have to complicate things beyond that? > > > why not just let Python provide a few base classes > > > and mixins for people who want to create custom > > > importers, and be done with it? > > > > Because importing in Python has become *much* more > > complicated over time. There are requests for new > > features which touch subjects such as storage mechanisms, > > lookups, signatures (for trusted code), lazy imports, etc. > > sorry, I still don't understand it. our applications already > use different storage mechanisms, databases, signatures, > lazy importing, version handling, etc, etc. now, if *we* > have managed to build all that on top of an old version > of imputil.py, how come it's not sufficient for the rest > of you? I've tried to get (an older) imputil.py version up and running too. It did work, but only after some considerable tweaking and even with integrated cache mechanisms did not reach the performance of the builtin importer (which doesn't use the kinds of caching strategies I had built into imputil.py). Getting the whole setup to work wasn't easy at all, because of the way imputil importers delegate work and things get even more confusing when it starts to "take over" certain parts of packages by installing temselves as importers for a particular package. 
> > A chain of simple minded importers won't work together > > too well > > why? it sure works for us... An example: A path importer knows how to scan directories and how to use a path to tell the correct order. It can maybe also import .py/.pyc/.pyo files. Now what happens if it finds a shared lib as module... the usual imputil way would be to delegate the request to some other importer which can handle shared libs... but wait: how does the shared lib importer know where to look ? It will have to rescan the directories, etc... > > duplicate work > > avoiding duplicate work is what object oriented design > is all about. and last time I checked, Python had excellent > support for that. See my example above. The agent approach used by imputil does not support OO design too well: even though you can avoid duplicate programming work on the importers by using a few base classes which implement dir scans, shared lib imports, etc. the imputil design does not provide means to avoid duplicate actions taken by the importers. > > and downgrade performance considerably due to the > > many recursive function calls > > now that's what I call premature optimization. and this > scares the hell out of me: if the rest of the python-dev > crowd don't seriously believe that Python is (or can be > made) fast enough to implement things like this, why > the heck are you using Python at all? am I the only > one here who doesn't believe in osterhout's talk about > "the great system vs. scripting language divide"? Looks like you are in ranting mode here ;-) Seriously, I've checked my imputil.py version (with caches enabled) against the builtin importer and noticed a performance downgrade by factor >2. This was enough to convince me of looking for other techniques to handle the problems I had at the time... you know, relative imports and things. 
--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 27 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From mal@lemburg.com Sat Dec 4 11:04:15 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 04 Dec 1999 12:04:15 +0100
Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT]
References: <199912022043.PAA15108@eric.cnri.reston.va.us>
 <3846E79A.446EAFD5@equi4.com> <3847D766.1E5FFAF3@interet.com>
 <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com>
Message-ID: <3848F52F.5F5B748F@lemburg.com>

Fredrik Lundh wrote:
>
> frankly, this "design by committee" (or is it "design by
> people who've never even been close to implementing
> something because they thought it was too hard, and
> thus think they're qualified to argue against those of
> us who didn't even realize that it was a hard problem"?)

Huh ? Two points:

1. How can you be sure that people haven't tried implementing
   their ideas and for various reasons have come to some
   conclusion about those ideas ?

2. Would you seriously disqualify people from joining a
   discussion by the simple argument that they have not
   implemented anything yet ?

Just take the Unicode discussion as an example: it was very lively
and resulted in a decent proposal which is now subject to
further investigation by the implementors ;-) Many people
have joined in even though they did not and/or will not
implement anything. Still, their arguments were very useful
to show up weaknesses in the proposal.

Now, let's rather have a beer in the pub around the corner than
go on ranting about :-).

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 27 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From mal@lemburg.com Sat Dec 4 11:53:33 1999
From: mal@lemburg.com (M.-A.
Lemburg)
Date: Sat, 04 Dec 1999 12:53:33 +0100
Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order)
References:
Message-ID: <384900BD.D16E72BC@lemburg.com>

Greg Stein wrote:
> > > [me:]
> > > A chain of simple minded importers won't work together
> > > too well
> >
> > why? it sure works for us...
>
> Exactly. "Why?" Please provide an example.

See my reply to Fredrik.

> >...
> > > and downgrade performance considerably due to the
> > > many recursive function calls
> >
> > now that's what I call premature optimization. and this
> > scares the hell out of me: if the rest of the python-dev
> > crowd don't seriously believe that Python is (or can be
> > made) fast enough to implement things like this, why
> > the heck are you using Python at all? am I the only
> > one here who doesn't believe in osterhout's talk about
> > "the great system vs. scripting language divide"?
>
> Don't worry Fredrik... I'm with you on this one. I do not believe there is
> a problem with the speed. Nobody has yet profiled imputil to find out
> where/how the time is being spent. Nobody has tried to speed it up.

Sorry, Greg, but that is simply not true. I've spent a few
days trying to get more performance out of it and have
succeeded, but in the end it wasn't enough to convince me
of the approach.

> Therefore, any claims about its performance are simply FUD.

BTW, did anybody mention that an import manager wouldn't
be able to provide an API which is usable for imputil
style importers ? I'm not arguing against the possibility
to use imputil style importers, just against making it the
sole method of adding wisdom to Python imports.

The imputil importers could well benefit from a manager
providing logic to do basic things like importing
shared libs, checking signatures, downloading modules
from the web, etc.
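To make the "manager" idea concrete, here is a minimal sketch in present-day Python of a central object that scans each directory once and hands the located file to whichever registered add-on handles its extension, so that the add-on never has to rescan the path itself. It is an assumption-laden illustration, not a proposed API: `ImportManager`, `register` and `find` are hypothetical names, and the handlers below return descriptive tuples instead of loading anything.

```python
import os
import tempfile


class ImportManager:
    """Hypothetical manager: owns the path scan, delegates loading to add-ons."""

    def __init__(self, search_path):
        self.search_path = list(search_path)
        self.handlers = []  # (extension, handler) pairs, in priority order

    def register(self, extension, handler):
        """An add-on announces that it can handle files with this extension."""
        self.handlers.append((extension, handler))

    def find(self, name):
        for directory in self.search_path:
            try:
                present = set(os.listdir(directory))  # one scan per directory
            except OSError:
                continue
            for extension, handler in self.handlers:
                filename = name + extension
                if filename in present:
                    # The handler receives the full path; it does not search.
                    return handler(os.path.join(directory, filename))
        return None


with tempfile.TemporaryDirectory() as tmp:
    for filename in ("foo.py", "bar.so"):
        open(os.path.join(tmp, filename), "w").close()
    manager = ImportManager([tmp])
    manager.register(".so", lambda p: ("shared-lib", os.path.basename(p)))
    manager.register(".py", lambda p: ("source", os.path.basename(p)))
    found_foo = manager.find("foo")
    found_bar = manager.find("bar")
    found_none = manager.find("missing")

print(found_foo, found_bar, found_none)
# ('source', 'foo.py') ('shared-lib', 'bar.so') None
```

The trade-off debated in the thread shows up directly: the handlers are simpler because the manager does the locating, but they are also coupled to the manager's notion of a directory-based search path.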
-- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 27 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From gstein@lyra.org Sat Dec 4 12:15:13 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 4 Dec 1999 04:15:13 -0800 (PST) Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order) In-Reply-To: <384900BD.D16E72BC@lemburg.com> Message-ID: On Sat, 4 Dec 1999, M.-A. Lemburg wrote: >... > > Don't worry Fredrik... I'm with you on this one. I do not believe there is > > a problem with the speed. Nobody has yet profiled imputil to find out > > where/how the time is being spent. Nobody has tried to speed it up. > > Sorry, Greg, but that is simply not true. I've spend a few > days on trying to get more performance out of it and have > succeeded, but in the end it wasn't enough to convince me > of the approach. You sent me your changes... I don't believe that you were aggressive enough. As I've mentioned before, I think it is quite possible to retain the general Importer style and get_code() interface, but to shift some functionality out (to be computed once) to a higher-level mechanism. The patches that you sent me did not do this, so I'm not surprised that you hit a wall. Ack. See? Now I'm getting into discussions about performance and implementation without truly knowing where the timing is spent. Eyeballing it, I have an idea, but it would be best too see a profile output. My mantra is always "90% of the time you're wrong about where 90% of the time is being spent." I am unconcerned about performance, but will work on it so that I don't need to continue this conversation. That burden is on me. > > Therefore, any claims about its performance are simply FUD. > > BTW, did anybody mention that an import manager wouldn't > be able to provide an API which is useable for imputil > style importers ? 
I'm not arguing against the possibility
> to use imputil style importers, just against making it the
> sole method of adding wisdom to Python imports.

Since the core will delegate out to Python (note: current working theory),
then it certainly is not the "sole method" (since you can just replace the
Python code).

But there must be a default mechanism. The ihooks stuff was too
complicated. imputil seems to be much easier. I'd love to see a third
mechanism.... so I can steal ideas :-)

> The imputil importers could well benefit from a manager
> providing logic to do basic things like importing
> shared libs, checking signatures, downloading modules
> from the web, etc.

For shared libs, yes. For the others: geez... I don't want to see that in
the core infrastructure. Shift that out to specialized Importers. The
infrastructure ought to be teeny and agnostic about how to map a module
name to a module.

Side note to python-dev people: I apologize... I realize that I'm
beginning to get a bit defensive here. I'm going to be at XML '99 until
Friday, so that should give me a breather. When I get back, I'll skip the
talk and do some code.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From gstein@lyra.org Sat Dec 4 12:32:04 1999
From: gstein@lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 04:32:04 -0800 (PST)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <3848F0E0.B8132AD2@lemburg.com>
Message-ID:

On Sat, 4 Dec 1999, M.-A. Lemburg wrote:
> Fredrik Lundh wrote:
>...
> > sorry, I still don't understand it. our applications already
> > use different storage mechanisms, databases, signatures,
> > lazy importing, version handling, etc, etc. now, if *we*
> > have managed to build all that on top of an old version
> > of imputil.py, how come it's not sufficient for the rest
> > of you?
>
> I've tried to get (an older) imputil.py version up and running
> too.
It did work, but only after some considerable tweaking > and even with integrated cache mechanisms did not reach > the performance of the builtin importer (which doesn't > use the kinds of caching strategies I had built into > imputil.py). 1) yes, it was an older version and did not have the PathImporter class. As a by product, the DirectoryImporters that it *did* have were much slower. It still did not support builtins, frozen modules, or dynamic loads. All of that is present now, so it works "out of the box" much better. 2) Performance: as I wrote in the other email, I don't believe that is an argument against the design. The imputil approach *will* be slower than the current Python mechanism, but there is some more coding to do to truly see how much. The side benefits (e.g. ZipImporter and caching) may outweigh the result. Time will tell. > Getting the whole setup to work wasn't easy > at all, because of the way imputil importers delegate work > and things get even more confusing when it starts to "take > over" certain parts of packages by installing temselves > as importers for a particular package. I don't understand this. If it is relevant, then please expand. Thx. > > > A chain of simple minded importers won't work together > > > too well > > > > why? it sure works for us... > > An example: > > A path importer knows how to scan directories and how to use > a path to tell the correct order. It can maybe also import > .py/.pyc/.pyo files. Now what happens if it finds a shared > lib as module... the usual imputil way would be to delegate > the request to some other importer which can handle shared > libs... but wait: how does the shared lib importer know > where to look ? It will have to rescan the directories, > etc... No, the "usual imputil way" is that the PathImporter understands searching a path and loading stuff from that path. An Importer is a combination of locating and loading (since they are, typically, tightly bound). 
The next rev will allow user-plugging of support for new file types. > > > duplicate work > > > > avoiding duplicate work is what object oriented design > > is all about. and last time I checked, Python had excellent > > support for that. > > See my example above. > > The agent approach used by imputil does not support > OO design too well: even though you can avoid duplicate > programming work on the importers by using a few > base classes which implement dir scans, shared lib > imports, etc. the imputil design does not provide > means to avoid duplicate actions taken by the importers. There is always a balance to be struck between independence and coupling. I chose to reduce coupling and increase independence. If you shift a bunch of stuff out of the Importers, then you will increase the coupling between the imputil framework and the Importers. That coupling will then close off future possibilities. Within the framework itself (e.g. between _import_hook and get_code), there is a lot of opportunity for change. Since that is behind the covers, it is no big deal to shift functionality around. I plan to do so. >... > Looks like you are in ranting mode here ;-) Seriously, > I've checked my imputil.py version (with caches enabled) > against the builtin importer and noticed a performance > downgrade by factor >2. This was enough to convince me > of looking for other techniques to handle the problems > I had at the time... you know, relative imports and things. I have run a long series of tests. Without doing any performance work on imputil, the ratio is 9 to 13. The 13 may have bumped up to about 15 or 16 when I added some dynamic loading code (I forget). Regardless, it is definitely less than a 2X increase. And that is with zero optimization. *shrug* I'm done. I'll do some code in a couple weeks. 
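The "less than a 2X increase" claim follows from simple arithmetic on the figures quoted above. The snippet below just spells that out; it assumes, since the message does not say so explicitly, that 9 is the builtin importer's time and 13 (or 15 to 16 with dynamic loading) is imputil's, in whatever unit was measured.

```python
def slowdown(builtin_time, imputil_time):
    """Return how many times slower imputil is than the builtin importer."""
    return imputil_time / builtin_time


BUILTIN = 9.0  # assumed to be the builtin importer's figure
ratios = {t: round(slowdown(BUILTIN, t), 2) for t in (13.0, 15.0, 16.0)}
print(ratios)  # {13.0: 1.44, 15.0: 1.67, 16.0: 1.78}
```

Even the highest figure mentioned (16) stays under a factor of two, which is the contrast being drawn with the ">2" slowdown reported for the older, tweaked imputil.py.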
Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From gstein@lyra.org Sat Dec 4 13:12:32 1999
From: gstein@lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 05:12:32 -0800 (PST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912031439.JAA16524@eric.cnri.reston.va.us>
Message-ID:

On Fri, 3 Dec 1999, Guido van Rossum wrote:
>...
> Great response. I think we know where we each stand. Please go ahead
> with a new design. (That's trust, not carte blanche.)

Accepted gratefully. Thx.

> Just one thought: the more I think about it, the less I like
> sys.importers: functionality which is implemented through
> sys.importers must necessarily be placed either in front of all of
> sys.path or after it. While this is helpful for "canned" apps that
> want *everything* to be imported from a fixed archive, I think that
> for regular Python installations sys.path should remain the point of
> attack. In particular, installing a new package (e.g. PIL) should
> affect sys.path, regardless of the way of delivery of the modules
> (shared libs, .py files, .pyc files, or a zip archive).

Okay. I'll design with respect to this model.

To be explicit/clear and to be sure I'm hearing you right: sys.path may
contain Importer instances. Given the name FOO, the system will step
through sys.path looking for the first occurrence of FOO (looking in a
directory or delegating). FOO may be found with any number of
(configurable) file extensions, which are ordered (e.g. ".so" before
".py" before ".isl").

> I'm not too worried about code that inspects sys.path and expects
> certain invariants; that code is most likely interfering with the
> import mechanism so should be revisited anyway.

The Benevolent Dictator has spoken. So be it. :-)

> On the lone .pyc issue: I'd like to see this disappear when using the
> filesystem, I see no use for it there if we support .pyc files in zip
> archives.

No problem. This actually creates a simplification in the system, as I'm
seeing it now.
I'm also seeing opportunities for a code reorg which may work towards MAL's issues with performance. I hope to have something in two or three weeks. I also hope people can be patient :-), but I certainly wouldn't mind seeing some alternative code! Cheers, -g -- Greg Stein, http://www.lyra.org/ From gmcm@hypernet.com Sat Dec 4 14:59:44 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Sat, 4 Dec 1999 09:59:44 -0500 Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order) In-Reply-To: <384900BD.D16E72BC@lemburg.com> Message-ID: <1267803104-11215142@hypernet.com> M.-A. Lemburg wrote: > Greg Stein wrote: > > Don't worry Fredrik... I'm with you on this one. I do not > > believe there is a problem with the speed. Nobody has yet > > profiled imputil to find out where/how the time is being spent. > > Nobody has tried to speed it up. > > Sorry, Greg, but that is simply not true. I've spend a few > days on trying to get more performance out of it and have > succeeded, but in the end it wasn't enough to convince me > of the approach. Remember those comparisons of Perl and Python, to which you added cgipython? I've added to the list a version that uses an old version of imputil (probably the one you optimized) and a compressed std lib. Note that my Linux python (1.5.2) is built in the RedHat style - even struct and strop are .so's; so that accounts for the majority of the open calls. This is a full Python (runs code.py if you don't pass it a script name). For lack of a better name, I've called it "pykit". First, the size of log files (in lines), i.e. 
number of system calls:

              Solaris   Linux   IRIX[1]
  Perl             88      85      70
  Python          425     316     257
  cgipython               182
  pykit                   136

Next, the number of "open" calls:

              Solaris   Linux   IRIX
  Perl             16      10       9
  Python          107      71      48
  cgipython                33
  pykit                     9

And the number of unsuccessful "open" calls:

              Solaris   Linux   IRIX
  Perl              6       1       3
  Python           77      49      32
  cgipython                28
  pykit                     2

Number of "mmap" calls:

              Solaris   Linux   IRIX
  Perl             25      25       1
  Python           36      24       1
  cgipython                13
  pykit                    21

This test would show off more if it went beyond startup. An
import of a standard lib module in my stock Python involves 2
failed stats and 6 failed opens, then 2 successful opens and 2
fstats before the module is loaded. None of these occur in pykit.

The downside (asking my Importer for a .so or a module not in
the importer) takes no system calls, and involves a dozen or so
lines of Python and a check of a dictionary.

- Gordon

From tismer@appliedbiometrics.com Sat Dec 4 15:29:03 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Sat, 04 Dec 1999 16:29:03 +0100
Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order)
References:
Message-ID: <3849333F.1DF2A201@appliedbiometrics.com>

Greg Stein wrote:
...
> My mantra is always "90% of the time you're wrong about where 90%
> of the time is being spent."

What a great sentence!
We all know it, but many of us (especially me) forget about it
during 90% of our coding time.
Much better to spend this on design (as you did).

thanks - chris

--
Christian Tismer             :^)
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home

From jim@interet.com Sat Dec 4 17:27:44 1999
From: jim@interet.com (James C.
Ahlstrom) Date: Sat, 04 Dec 1999 12:27:44 -0500 Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT] References: <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com> <3847D766.1E5FFAF3@interet.com> <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com> Message-ID: <38494F10.C644BA7@interet.com> Fredrik Lundh wrote: > > James C. Ahlstrom wrote: > > IMHO putting shared libs in an archive is a bad idea because the OS Dear Fredrik, I thought the point of Python-Dev was to propose designs and get feedback, right? Well, I got feedback :-). OK, I agree to alter my archive format so it provides the ability to store shared libs and not just *.pyd. I will add the string length and if needed a flag indicating the name is a shared lib. Now the details: > have you tried it? if not, why do you think you should > be allowed to forbid others from doing it? Yes I have tried it, and I am currently on my fourth version of an archive format which is based on formats by Greg Stein and Gordon McMillan. I hope it meets with the favor of the Grand Inquisition, and becomes the standard format. But maybe it won't. Oh well. > bloody installers. and here you are advocating that > we all should be forced to use installers, when python > makes it trivial to write self-installing apps. double-argh! I am not forcing anyone to do anything, only proposing that shared libs are best handled directly by imputil and not the class within imputil which handles archive files. It is just a geeky design issue, nothing more. JimA From jim@interet.com Sat Dec 4 18:31:48 1999 From: jim@interet.com (James C. 
Ahlstrom) Date: Sat, 04 Dec 1999 13:31:48 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> <3848F0E0.B8132AD2@lemburg.com> Message-ID: <38495E14.9C2FB107@interet.com> "M.-A. Lemburg" wrote: > An example: > > A path importer knows how to scan directories and how to use > a path to tell the correct order. It can maybe also import > .py/.pyc/.pyo files. Now what happens if it finds a shared > lib as module... the usual imputil way would be to delegate > the request to some other importer which can handle shared > libs... but wait: how does the shared lib importer know > where to look ? It will have to rescan the directories, > etc... The above refers to an earlier but still very recent version of imputil. On that basis it is perfectly accurate. Here is another example from my own experience almost identical to the above: One possible archive file format holds its list of archived *.pyc file names as keys in a dictionary. This is simple and efficient, but fails to correctly address the problem of shared libs (aka DLL's in Windows) with names identical to names of *.pyc files in the archive. For example, suppose foo.pyc is in the archive, and foo.dll is in a directory. Suppose sys.path is to be used to decide whether to load foo.pyc or foo.dll. Then an "archive importer" will fail to do this. Specifically you can't see if foo.pyc is in the archive and then check sys.path, nor can you do the reverse. You must call the "archive importer" repeatedly for each element of sys.path and search the directory at the same time. JimA From jim@interet.com Sat Dec 4 19:51:47 1999 From: jim@interet.com (James C.
Ahlstrom) Date: Sat, 04 Dec 1999 14:51:47 -0500 Subject: [Python-Dev] Import redesign [LONG] References: Message-ID: <384970D3.26A9ECDB@interet.com> Greg Stein wrote: > > On Fri, 3 Dec 1999, Guido van Rossum wrote: > > attack. In particular, installing a new package (e.g. PIL) should > > affect sys.path, regardless of the way of delivery of the modules > > (shared libs, .py files, .pyc files, or a zip archive). > To be explicit/clear and to be sure I'm hearing you right: sys.path may > contain Importer instances. Given the name FOO, the system will step > through sys.path looking for the first occurrence of FOO (looking in a > directory or delegating). FOO may be found with any number of > (configurable) file extensions, which are ordered (e.g. ".so" before > ".py" before ".isl"). This is basically a gripe about this design spec. So if the answer turns out to be "we need this functionality so shut up" then just say that and don't flame me. This spec is painful. Suppose sys.path has 10 elements, and there are six file extensions. Then the simple algorithm is slow:

    for path in sys.path:   # Yikes, may not be a string!
        for ext in file_extensions:
            name = "%s.%s" % (module_name, ext)
            full_path = os.path.join(path, name)
            if os.path.isfile(full_path):
                # Process file here

And sys.path can contain class instances which only makes things slower. You could do a readdir() and cache the results, but maybe that would be slower. A better algorithm might be faster, but a lot more complicated. In the context of archive files, it is also painful. It prevents you from saving a single dictionary of module names. Instead you must have len(sys.path) dictionaries. You could try to save in the archive information about whether (say) a foo.dll was present in the file system, but the list of extensions is extensible. The above problem only exists to support equally-named modules; that is, to support a run-time choice of whether to load foo.pyc, foo.dll, foo.isl, etc.
I claim (without having written it) that the fastest algorithm to solve the unique-name case is much faster than the fastest algorithm to solve the choose-among-equal-names case. Do we really need to support the equal-name case [Jim runs for cover...]? If so, how about inventing a new way to support it. Maybe if equal names exist, these must be pre-loaded from a known location? JimA From gstein@lyra.org Sat Dec 4 21:59:00 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 4 Dec 1999 13:59:00 -0800 (PST) Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: <384970D3.26A9ECDB@interet.com> Message-ID: On Sat, 4 Dec 1999, James C. Ahlstrom wrote: > Greg Stein wrote: >... > > To be explicit/clear and to be sure I'm hearing you right: sys.path may > > contain Importer instances. Given the name FOO, the system will step > > through sys.path looking for the first occurrence of FOO (looking in a > > directory or delegating). FOO may be found with any number of > > (configurable) file extensions, which are ordered (e.g. ".so" before > > ".py" before ".isl"). > > This is basically a gripe about this design spec. So if the answer > turns out to be "we need this functionality so shut up" then just > say that and don't flame me. > > This spec is painful. Suppose sys.path has 10 elements, and there > are six file extensions. Then the simple algorithm is slow:
>     for path in sys.path:   # Yikes, may not be a string!
>         for ext in file_extensions:
>             name = "%s.%s" % (module_name, ext)
>             full_path = os.path.join(path, name)
>             if os.path.isfile(full_path):
>                 # Process file here

This is the algorithm that Python uses today, and my standard Importers follow. > And sys.path can contain class instances > which only makes things slower. IMO, we don't know this, or whether it is significant. > You could do a readdir() and cache > the results, but maybe that would be slower. A better > algorithm might be faster, but a lot more complicated. Who knows.
BUT: the import process is now in Python -- it makes it *much* easier to run these experiments. We could not really do this when the import process is "hard-coded" in C code. > In the context of archive files, it is also painful. It prevents > you from saving a single dictionary of module names. Instead you > must have len(sys.path) dictionaries. You could try to > save in the archive information about whether (say) a foo.dll was > present in the file system, but the list of extensions is extensible. I am not following this. What/where is the "single dictionary of module names" ? Are you referring to a cache? Or is this about building an archive? An archive would look just like we have now: map a name to a module. It would not need multiple dictionaries. > The above problem only exists to support equally-named modules; that > is, to support a run-time choice of whether to load foo.pyc, foo.dll, > foo.isl, etc. I claim (without having written it) that the fastest > algorithm to solve the unique-name case is much faster than the fastest > algorithm to solve the choose-among-equal-names case. > > Do we really need to support the equal-name case [Jim runs for > cover...]? > If so, how about inventing a new way to support it. Maybe if equal > names exist, these must be pre-loaded from a known location? I don't understand what the problem is. I don't see one. We are still mapping a name to a module. sys.path defines a precedence. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sun Dec 5 01:17:57 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 4 Dec 1999 17:17:57 -0800 (PST) Subject: [Python-Dev] pyc archives (was: .DLL vs .PYD search order) In-Reply-To: <38495E14.9C2FB107@interet.com> Message-ID: On Sat, 4 Dec 1999, James C. Ahlstrom wrote: >... > One possible archive file format holds its list of archived > *.pyc file names as keys in a dictionary. 
This is simple and > efficient, but fails to correctly address the problem of shared > libs (aka DLL's in Windows) with names identical to names of > *.pyc files in the archive. For example, suppose foo.pyc is in the > archive, and foo.dll is in a directory. Suppose sys.path is to be > used to decide whether to load foo.pyc or foo.dll. Then an > "archive importer" will fail to do this. Specifically you can't > see if foo.pyc is in the archive and then check sys.path, nor can > you do the reverse. You must call the "archive importer" repeatedly > for each element of sys.path and search the directory at the same time. What? The archive is independent of each .pyc's original position in sys.path. There is no reason/need to carry that information into an archive. If the archive contains "foo", then you're done. If it doesn't, then move on to the next element of sys.path (directory or Importer instance) and look there. Basically: if you deploy an archive, then all of its files will take precedence over any file found later on sys.path. This is exactly what sys.path is about: establishing precedence. If I understand you correctly, then you're trying to say there is some sort of interleaving that must occur. If so, then I don't understand why. 
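The precedence rule described above can be illustrated with a minimal sketch. It is not code from imputil: the `DictImporter` class, its one-argument `get_code()` method, and the `find_first()` helper are hypothetical names chosen for this example, and the archive is faked with an in-memory dictionary. The only point is that whichever entry comes first on the search path wins, whether it is a directory string or an importer object.

```python
import os


class DictImporter:
    """Hypothetical stand-in for an archive importer (not imputil's API)."""

    def __init__(self, label, modules):
        self.label = label
        self.modules = dict(modules)

    def get_code(self, name):
        # Return the stored payload, or None when the archive lacks the name.
        return self.modules.get(name)


def find_first(name, search_path, extensions=(".py",)):
    """Walk search_path in order and return the first match for name."""
    for entry in search_path:
        if isinstance(entry, str):
            # Plain directory entry: try each extension in its given order.
            for ext in extensions:
                candidate = os.path.join(entry, name + ext)
                if os.path.isfile(candidate):
                    return ("file", candidate)
        else:
            # Importer entry: delegate, and stop here if it has the module.
            payload = entry.get_code(name)
            if payload is not None:
                return (entry.label, payload)
    return None
```

With an archive placed ahead of a directory, every module the archive holds shadows a same-named file in that directory; reversing the two entries reverses the outcome, with no interleaving needed.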
Cheers, -g -- Greg Stein, http://www.lyra.org/ From fredrik@pythonware.com Mon Dec 6 12:20:34 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 6 Dec 1999 13:20:34 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> <3848F0E0.B8132AD2@lemburg.com> <02eb01bf3f4d$c200d790$f29b12c2@secret.pythonware.com> <384B7E32.F7B81D82@lemburg.com> Message-ID: <004401bf3fe4$4cab6ea0$f29b12c2@secret.pythonware.com> > > you obviously attempted to use imputil to implement > > non-standard import behaviour on top of the standard > > storage system -- while we've used it to implement > > standard import behaviour on top of non-standard > > storage systems. > > No, I tried to make the imputil approach work as replacement > for the standard builtin importer. I'm confused. earlier, you said (or rather, I think you said) that you looked at imputil to see if it could "handle the problems you had at the time"... and now you say that you tried to use it as a drop-in replacement for the "standard path importer". I must be missing something here... > After I got that to work, I added some caching > to avoid duplicated stats. The resulting importer was > around twice as slow as the builtin one for the following > imports: > > # the default one Python does at startup, plus: > from mx import HTMLTools,DateTime,ODBC > > This is a pretty common setup for my scripts, so its > performance is relevant to me. did you try stuffing all your PYC's into an archive file, and running them from there?
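The caching mentioned in the quoted text, and the readdir() idea raised earlier in the thread, might look roughly like the sketch below. This is an illustration under stated assumptions, not the code anyone in the thread actually ran: `DirectoryCache` and its methods are hypothetical names, only string directory entries are handled, and nothing here measures whether the cache is faster than repeated stat calls.

```python
import os


class DirectoryCache:
    """Hypothetical cache: list each directory once, then answer from memory."""

    def __init__(self):
        self._listings = {}

    def listing(self, directory):
        # Read the directory only on first use; later calls reuse the set.
        if directory not in self._listings:
            try:
                names = set(os.listdir(directory))
            except OSError:
                # Missing or unreadable directories simply contain nothing.
                names = set()
            self._listings[directory] = names
        return self._listings[directory]

    def find(self, name, search_path, extensions):
        """Return the first path/extension match, honouring both orderings."""
        for directory in search_path:
            names = self.listing(directory)
            for ext in extensions:
                filename = name + ext
                if filename in names:
                    return os.path.join(directory, filename)
        return None
```

The obvious cost of this design is staleness: a file created after a directory has been listed stays invisible until the cache is discarded.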
From fredrik@pythonware.com Sun Dec 5 18:22:57 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Sun, 5 Dec 1999 19:22:57 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> <3848F0E0.B8132AD2@lemburg.com> Message-ID: <02eb01bf3f4d$c200d790$f29b12c2@secret.pythonware.com> > I've checked my imputil.py version (with caches enabled) > against the builtin importer and noticed a performance > downgrade by factor >2. This was enough to convince me > of looking for other techniques to handle the problems > I had at the time... you know, relative imports and things. hmm. I think I see the problem here... you obviously attempted to use imputil to implement non-standard import behaviour on top of the standard storage system -- while we've used it to implement standard import behaviour on top of non-standard storage systems. I don't know if imputil is good enough for the former, and I don't think I care... I've spent too many nights debugging code that relied on clever, non-standard hacks. PS. on the performance side of things, did you know that 're' can be up to ten times slower than 'regex'? but people don't complain -- probably because it allows them to do things they couldn't do before... From jim@interet.com Mon Dec 6 19:40:01 1999 From: jim@interet.com (James C. Ahlstrom) Date: Mon, 06 Dec 1999 14:40:01 -0500 Subject: [Python-Dev] Re: pyc archives (was: .DLL vs .PYD search order) References: Message-ID: <384C1111.92984B5A@interet.com> Greg Stein wrote: > > On Sat, 4 Dec 1999, James C. Ahlstrom wrote: > >... > > One possible archive file format holds its list of archived > > *.pyc file names as keys in a dictionary. This is simple and > > efficient, but fails to correctly address the problem of shared > What? 
The archive is independent of each .pyc's original position in > sys.path. There is no reason/need to carry that information into an > archive. > > If the archive contains "foo", then you're done. If it doesn't, then move > on to the next element of sys.path (directory or Importer instance) and > look there. > > Basically: if you deploy an archive, then all of its files will take > precedence over any file found later on sys.path. This is exactly what > sys.path is about: establishing precedence. Sorry, I am a little slow today. My daughter got me up at 6 am to work on her computer video editor. No disk space, fragmentation, 2 gig limit on AVI files, ........ Are you saying this? If foo is imported, the archive importer is consulted first to see if it can provide foo. If not, sys.path is searched for foo.pyc, foo.pyl etc., and if foo.pyl is found, then its contents are added to the single archive importer dictionary. The order of addition to the archive dictionary is determined by sys.path, and duplicate names are not entered because they lie later on sys.path. But once a file is recognized as in an archive, it effectively precedes all of sys.path. Or this? If foo is imported, sys.path is searched for foo.pyc, foo.pyl, etc., and also all archive files found at each element of sys.path are searched for foo. If "bar" is imported, it may be found in foo.pyl. That is, there is an instance of an archive importer for each element of sys.path. What if the user names an archive file not on sys.path? What order does it have? JimA From jim@interet.com Mon Dec 6 18:34:41 1999 From: jim@interet.com (James C. Ahlstrom) Date: Mon, 06 Dec 1999 13:34:41 -0500 Subject: [Python-Dev] Import redesign [LONG] References: Message-ID: <384C01C1.8D1AFFFF@interet.com> Greg Stein wrote: > > On Sat, 4 Dec 1999, James C. Ahlstrom wrote: > > # Process file here > > This is the algorithm that Python uses today, and my standard Importers > follow. Agreed. 
> > And sys.path can contain class instances > > which only makes things slower. > > IMO, we don't know this, or whether it is significant. Agreed. > > You could do a readdir() and cache > > the results, but maybe that would be slower. A better > > algorithm might be faster, but a lot more complicated. > > Who knows. BUT: the import process is now in Python -- it makes it *much* > easier to run these experiments. We could not really do this when the > import process is "hard-coded" in C code. Agreed. > > In the context of archive files, it is also painful. It prevents > > you from saving a single dictionary of module names. Instead you > > must have len(sys.path) dictionaries. You could try to > > save in the archive information about whether (say) a foo.dll was > > present in the file system, but the list of extensions is extensible. > > I am not following this. What/where is the "single dictionary of module > names" ? Are you referring to a cache? Or is this about building an > archive? > > An archive would look just like we have now: map a name to a module. It > would not need multiple dictionaries. The "single dictionary of names" is in the single archive importer instance and has nothing to do with creating the archive. It is currently programmed this way. Suppose the user specifies by name 12 archive files to be searched. That is, the user hacks site.py to add archive names to the importer. The "single dictionary" means that the archive importer takes the 12 dictionaries in the 12 files and merges them together into one dictionary in order to speed up the search for a name. The good news is you can always just call the archive importer to get a module. The bad news is you can't do that for each entry on sys.path because there is no necessary identity between archive files and sys.path. 
The user specified the archive files by name, and they may or may not be on sys.path, and the user may or may not have specified them in the same order as sys.path even if they are. Suppose archive files must lie on sys.path and are processed in order. Then to find them you must know their name. But IMHO you want to avoid doing a readdir() on each element of sys.path and looking for files *.pyl. Suppose archive file names in general are the known name "lib.pyl" for the Python library, plus the names "package.pyl" where "package" can be the name of a Python package as a single archive file. Then if the user tries to import foo, imputil will search along sys.path looking for foo.pyc, foo.pyl, etc. If it finds foo.pyl, the archive importer will add it to its list of known archive files. But it must not add it to its single dictionary, because that would destroy the information about its position along sys.path. Instead, it must keep a separate dictionary for each element of sys.path and search the separate dictionaries under control of imputil. That is, get_code() needs a new argument for the element of sys.path being searched. Alternatively, you could create a new importer instance for each archive file found, but then you still have multiple dictionaries. They are in the multiple instances. All this is needed only to support import of identically named modules. If there are none, there is no problem because sys.path is being used only to find modules, not to disambiguate them. See also my separate reply to your other post which discusses this same issue. JimA From gstein@lyra.org Tue Dec 7 00:43:21 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 6 Dec 1999 16:43:21 -0800 (PST) Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: <384C01C1.8D1AFFFF@interet.com> Message-ID: On Mon, 6 Dec 1999, James C. Ahlstrom wrote: > Greg Stein wrote: >... > > I am not following this. What/where is the "single dictionary of module > > names" ? Are you referring to a cache? 
Or is this about building an > > archive? > > > > An archive would look just like we have now: map a name to a module. It > > would not need multiple dictionaries. > > The "single dictionary of names" is in the single archive importer > instance and has nothing to do with creating the archive. It > is currently programmed this way. Ah. There is the problem. In Guido's suggestion for the "next path of inquiry" :-), there is no "single dictionary of names". Instead, you have Importer instances as items in sys.path. Each instance maintains its dictionary, and they are not (necessarily) combined. If we were to combine them, then we would need to maintain the ordering requirements implied by sys.path. However, this would be problematic if sys.path changed -- we would have to detect the situation and rebuild a merged dict. > Suppose the user specifies by name 12 archive files to be searched. > That is, the user hacks site.py to add archive names to the importer. > The "single dictionary" means that the archive importer takes the 12 > dictionaries in the 12 files and merges them together into one > dictionary > in order to speed up the search for a name. The good news is you can > always just call the archive importer to get a module. The bad news is > you can't do that for each entry on sys.path because there is no > necessary identity between archive files and sys.path. The user > specified the archive files by name, and they may or may not be on > sys.path, and the user may or may not have specified them in the > same order as sys.path even if they are. The importer must be inserted into sys.path to establish a precedence. If the user wants to add 12 libraries... fine. But *all* of those modules will fall under a precedence defined by the Importer's position on sys.path. > Suppose archive files must lie on sys.path and are processed in order. > Then to find them you must know their name. 
But IMHO you want to > avoid doing a readdir() on each element of sys.path and looking for > files *.pyl. I do not believe that we will arbitrarily locate and open library files. They must be specified explicitly. > Suppose archive file names in general are the known name "lib.pyl" > for the Python library, plus the names "package.pyl" where "package" > can be the name of a Python package as a single archive file. Then > if the user tries to import foo, imputil will search along sys.path > looking for foo.pyc, foo.pyl, etc. If it finds foo.pyl, the archive > importer will add it to its list of known archive files. But it must > not add it to its single dictionary, because that would destroy the > information about its position along sys.path. Instead, it must keep > a separate dictionary for each element of sys.path and search the > separate dictionaries under control of imputil. That is, get_code() > needs a new argument for the element of sys.path being searched. > Alternatively, you could create a new importer instance for each > archive file found, but then you still have multiple dictionaries. > They are in the multiple instances. If the user installs ".pyl" as a recognized extension (i.e. installs into the PathImporter), then the above scenario is possible. In my in-head-design, I had not imagined any state being retained for extension-recognizer hooks. Of course, state can be retained simply by using a bound-method for the hook function. get_code() would not need to change. The foo.pyl would be consulted at the appropriate time based on where it is found in sys.path. Note that file- extension hooks would definitely have a complete path to the target file. Those are not Importers, however (although they will closely follow the get_code() hook since the extension is called from get_code). 
From a pure theoretical standpoint, you can also see that get_code() should not have a pathname passed -- that would introduce filesystem semantics into what is otherwise an independent semantic (map name to module). More detail: the extension recognizer could certainly retain a cache about each of the archives that are located. However, the recognizer would be consulted (by the PathImporter) once for each archive found, in an ordering defined by sys.path. > All this is needed only to support import of identically named > modules. If there are none, there is no problem because sys.path > is being used only to find modules, not to disambiguate them. But the current (and future) semantics of Python state that you may have identically named modules, and that sys.path *does* disambiguate them. In fact, I use this feature all the time -- I use my new httplib.py rather than the standard library version. I do this by placing the specific directory "first" in my sys.path. Cheers, -g -- Greg Stein, http://www.lyra.org/ From tim_one@email.msn.com Tue Dec 7 05:11:25 1999 From: tim_one@email.msn.com (Tim Peters) Date: Tue, 7 Dec 1999 00:11:25 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: <02eb01bf3f4d$c200d790$f29b12c2@secret.pythonware.com> Message-ID: <001601bf4071$8278cc20$88a0143f@tim> [/F] > PS. on the performance side of things, did you know > that 're' can be up to ten times slower than 'regex'? > but people don't complain -- probably because it > allows them to do things they couldn't do before... Bad example: people do complain about this. Those who care a lot continue to use regex, temporarily pacified by the promise that re.py will get recoded in C and thus regain a good chunk of regex's speed. Those who care a whale of a lot continue to use Perl <0.9 wink>.
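The trade-off Greg describes earlier in this thread, between one dictionary per importer and a single merged dictionary, can be sketched as follows. Everything here is hypothetical and for illustration only: `ArchiveImporter`, `lookup_unmerged()` and `build_merged()` are made-up names, and the archives are plain in-memory tables rather than real archive files.

```python
class ArchiveImporter:
    """Hypothetical archive importer holding its own name -> module table."""

    def __init__(self, label, table):
        self.label = label
        self.table = dict(table)

    def get_code(self, name):
        # Return the stored payload, or None if this archive lacks the name.
        return self.table.get(name)


def lookup_unmerged(name, importers):
    """Ask each importer in path order; the first hit wins."""
    for importer in importers:
        found = importer.get_code(name)
        if found is not None:
            return (importer.label, found)
    return None


def build_merged(importers):
    """Flatten all tables into one dict, earlier importers taking precedence."""
    merged = {}
    for importer in importers:
        for name, payload in importer.table.items():
            # setdefault keeps the first (highest-precedence) entry.
            merged.setdefault(name, (importer.label, payload))
    return merged


local = ArchiveImporter("local", {"httplib": "patched httplib"})
stdlib = ArchiveImporter("stdlib", {"httplib": "stock httplib", "os": "stock os"})
merged = build_merged([local, stdlib])
```

The unmerged lookup costs one failed dictionary probe per importer that lacks the name, but it always reflects the current order of the path. The merged table is a single probe, yet it silently keeps the old precedence if the path is reordered after `build_merged()` was called, so it would have to be rebuilt on every change.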
From guido@CNRI.Reston.VA.US Tue Dec 7 12:45:25 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 07 Dec 1999 07:45:25 -0500 Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: Your message of "Mon, 06 Dec 1999 16:43:21 PST." References: Message-ID: <199912071245.HAA21596@eric.cnri.reston.va.us> > If we were to combine them, then we would need to maintain the ordering > requirements implied by sys.path. However, this would be problematic if > sys.path changed -- we would have to detect the situation and rebuild a > merged dict. No need to worry about this: just don't merge the caches. Compared to the hundreds of failed open() calls that are done now, it's no big deal to do 12 failed Python dictionary lookups instead of one. --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik@pythonware.com Tue Dec 7 13:25:54 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Tue, 7 Dec 1999 14:25:54 +0100 Subject: [Python-Dev] Import redesign [LONG] References: Message-ID: <020a01bf40b6$97dafd00$f29b12c2@secret.pythonware.com> Greg Stein wrote: > > The "single dictionary of names" is in the single archive importer > > instance and has nothing to do with creating the archive. It > > is currently programmed this way. > > Ah. There is the problem. In Guido's suggestion for the "next path of > inquiry" :-), there is no "single dictionary of names". Instead, you have > Importer instances as items in sys.path. Each instance maintains its > dictionary, and they are not (necessarily) combined. so the "sys.path contains importers (or strings)" strategy is now officially sanctioned? cool!!! (a quick look in our code base says that this will cause some trouble, unless os.path.isdir() is modified to reject non-strings... after all, if it's not a string, it cannot be a valid directory path, so this does make some sense ;-) another aside: can we have a standard mechanism for listing the contents of a given archive, please? 
we have a lot of "path scanning" stuff (PIL and PST, among others), and it would be great if things didn't break down if you stuff it all in an archive. something like:

    for path in sys.path:
        if os.path.isdir(path):
            files = os.listdir(path)
        else:
            try:
                files = path.listdir()
            except AttributeError:
                files = None
        if files is None:
            # no idea what's in here
        else:
            # path provides (at least) these modules

would be really useful. and yes, it shouldn't have to be mentioned, since squeeze have done it since early 1997, but archive importers should provide a standard way to include non-module resources in the archive, and a standard way to access such resources as ordinary python streams. e.g:

    file = path.open(name, "rb")

or something... From jim@interet.com Tue Dec 7 15:20:15 1999 From: jim@interet.com (James C. Ahlstrom) Date: Tue, 07 Dec 1999 10:20:15 -0500 Subject: [Python-Dev] Import redesign [LONG] References: <199912071245.HAA21596@eric.cnri.reston.va.us> Message-ID: <384D25AF.4C4F5107@interet.com> Guido van Rossum wrote: > No need to worry about this: just don't merge the caches. Compared to > the hundreds of failed open() calls that are done now, it's no big > deal to do 12 failed Python dictionary lookups instead of one. Agreed. JimA From jim@interet.com Tue Dec 7 15:31:30 1999 From: jim@interet.com (James C. Ahlstrom) Date: Tue, 07 Dec 1999 10:31:30 -0500 Subject: [Python-Dev] Import redesign [LONG] References: Message-ID: <384D2852.3C36C216@interet.com> Greg Stein wrote: > Ah. There is the problem. In Guido's suggestion for the "next path of > inquiry" :-), there is no "single dictionary of names". Instead, you have > Importer instances as items in sys.path. Each instance maintains its > dictionary, and they are not (necessarily) combined. > [A large number of other design issues] OK, all design issues agreed. I will make needed changes. JimA From jim@interet.com Tue Dec 7 15:37:36 1999 From: jim@interet.com (James C.
Ahlstrom) Date: Tue, 07 Dec 1999 10:37:36 -0500 Subject: [Python-Dev] Import redesign [LONG] References: <020a01bf40b6$97dafd00$f29b12c2@secret.pythonware.com> Message-ID: <384D29C0.3D3A2194@interet.com> Fredrik Lundh wrote: > another aside: can we have a standard mechanism for > listing the contents of a given archive, please? I will add this. > and yes, it shouldn't have to be mentioned, since squeeze > have done it since early 1997, but archive importers should > provide a standard way to include non-module resources in > the archive, and a standard way to access such resources > as ordinary python streams. I will add this. JimA From gstein@lyra.org Tue Dec 7 16:53:49 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 7 Dec 1999 08:53:49 -0800 (PST) Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: <199912071245.HAA21596@eric.cnri.reston.va.us> Message-ID: On Tue, 7 Dec 1999, Guido van Rossum wrote: > > If we were to combine them, then we would need to maintain the ordering > > requirements implied by sys.path. However, this would be problematic if > > sys.path changed -- we would have to detect the situation and rebuild a > > merged dict. > > No need to worry about this: just don't merge the caches. Compared to > the hundreds of failed open() calls that are done now, it's no big > deal to do 12 failed Python dictionary lookups instead of one. Have no fear... I wasn't planning on this... complicates too much stuff for too little gain. Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido@CNRI.Reston.VA.US Wed Dec 8 12:07:31 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 08 Dec 1999 07:07:31 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Wed, 08 Dec 1999 02:46:02 EST." <000201bf4150$46749da0$5aa2143f@tim> References: <000201bf4150$46749da0$5aa2143f@tim> Message-ID: <199912081207.HAA00040@eric.cnri.reston.va.us> [Great analysis, Tim!] 
> 4) The audience is Python end-users "in general", and the product is pure > Python. I think this is the most important one for Distutils to address, > and compilation isn't a part of it. So far, though, what Gordon is doing > seems more appropriate than what Distutils has been up to. I hope his work > gets folded into this. I'm not sure what stuff by which Gordon you're referring to. I am only familiar with his installer, which I thought is win32 only (but I may be mistaken) and is an installer for a whole application, not just a bunch of modules. Please correct me if I'm wrong. But this reminds me of a different issue, which Jim Ahlstrom has been hammering about before: there's a completely separate set of cases where what you are distributing is a stand-alone application, and the target consists of end users who are entirely uninterested in whether it's written in Python, C or Elvish. (And then there's still the distinction between Win32, Unix or both.) The current distutils tools don't deal with this at all. I think it should though, and I think its framework is powerful enough to be able to add this, e.g. as a new "appdist" command. --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm@hypernet.com Wed Dec 8 14:16:07 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Wed, 8 Dec 1999 09:16:07 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <199912081207.HAA00040@eric.cnri.reston.va.us> References: Your message of "Wed, 08 Dec 1999 02:46:02 EST." <000201bf4150$46749da0$5aa2143f@tim> Message-ID: <1267460464-31845181@hypernet.com> Guido wrote: > [Great analysis, Tim!] > > > 4) The audience is Python end-users "in general", and the > > product is pure Python. I think this is the most important one > > for Distutils to address, and compilation isn't a part of it. > > So far, though, what Gordon is doing seems more appropriate > > than what Distutils has been up to.
I hope his work gets > > folded into this. > > I'm not sure what stuff by which Gordon you're referring to. I > am only familiar with his installer, which I thought is win32 > only (but I may be mistaken) and is an installer for a whole > application, not just a bunch of modules. Please correct me if > I'm wrong. It needed a name. I hate the word "Installer", but it expresses in one word the most common use of my stuff. I'll be releasing a beta for Linux real soon. Only some of the tricks are Windows only (such as self-extracting executables, which is only culturally appropriate on Windows, anyway). But more importantly it's not just for installing. The Python I use (interactively) on my wife's machine is 1 directory with about 6 files in it. On my Linux box I've been using the std lib in a .pyz for about a month now. Someone distributing a pure Python package could instead ship 3 files (imputil.py, archive.py and .pyz) with the "install" consisting of adding one line to site.py in the user's perfectly normal Python installation. And yeah, I solved the "manifest" problem, too. Mine predates Distutils, so don't accuse me of duplicate effort, (I pointed them to it a couple times). It uses ConfigParser and a config file, so it allows finer control. While .pyz's are completely cross-platform, I have yet to work out endianness issues in the other archive I use (which should probably be zip format - it can hold anything). And at the "Installer" end, I have yet to work out how things should work on non-ELF/COFF platforms (where I can't append the archive to the executable). But there aren't any technical issues involved; just lack of time. So no, it's not just for Windows; and no, it's not just for creating standalones (though that's what almost everyone uses it for). 
- Gordon From guido@CNRI.Reston.VA.US Wed Dec 8 14:56:42 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 08 Dec 1999 09:56:42 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Wed, 08 Dec 1999 09:16:07 EST." <1267460464-31845181@hypernet.com> References: Your message of "Wed, 08 Dec 1999 02:46:02 EST." <000201bf4150$46749da0$5aa2143f@tim> <1267460464-31845181@hypernet.com> Message-ID: <199912081456.JAA00200@eric.cnri.reston.va.us> > It needed a name. I hate the word "Installer", but it expresses > in one word the most common use of my stuff. > > I'll be releasing a beta for Linux real soon. Only some of the > tricks are Windows only (such as self-extracting executables, > which is only culturally appropriate on Windows, anyway). > > But more importantly it's not just for installing. The Python I > use (interactively) on my wife's machine is 1 directory with > about 6 files in it. On my Linux box I've been using the std lib > in a .pyz for about a month now. Someone distributing a pure > Python package could instead ship 3 files (imputil.py, > archive.py and .pyz) with the "install" consisting of > adding one line to site.py in the user's perfectly normal Python > installation. > > And yeah, I solved the "manifest" problem, too. Mine predates > Distutils, so don't accuse me of duplicate effort, (I pointed > them to it a couple times). It uses ConfigParser and a config > file, so it allows finer control. > > While .pyz's are completely cross-platform, I have yet to work > out endianness issues in the other archive I use (which should > probably be zip format - it can hold anything). And at the > "Installer" end, I have yet to work out how things should work > on non-ELF/COFF platforms (where I can't append the archive > to the executable). But there aren't any technical issues > involved; just lack of time. 
> > So no, it's not just for Windows; and no, it's not just for > creating standalones (though that's what almost everyone > uses it for). Gordon, I'm sorry, but from this description I still have no idea what your stuff is (and I forgot the URL so I can't look it up). For example, if it's not (just) for installing, what *is* it for? What is the ``"manifest" problem'' and how did you solve it? Also, note that editing site.py is a no-no! You can create/edit sitecustomize.py, but you should leave site.py alone! --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm@hypernet.com Wed Dec 8 16:17:03 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Wed, 8 Dec 1999 11:17:03 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <199912081456.JAA00200@eric.cnri.reston.va.us> References: Your message of "Wed, 08 Dec 1999 09:16:07 EST." <1267460464-31845181@hypernet.com> Message-ID: <1267453215-32281635@hypernet.com> Guido, > Gordon, I'm sorry, but from this description I still have no idea > what your stuff is (and I forgot the URL so I can't look it up). http://starship.python.org/crew/gmcm/installer.html The Linux stuff has a couple alpha testers and will probably get announced in a week or two. > For example, if it's not (just) for installing, what *is* it for? At the bottom level, it's a bunch of tools using freeze's modulefinder, imputil.py and 2 kinds of archives. There's at least 2 layers above that, with "Installer" being the top. There's a clean separation between the layers, so you can break in wherever you like. > What is the ``"manifest" problem'' and how did you solve it? The problem is specifying a set of resources, hopefully without having to list them explicitly. I solve this with a config file that lets you specify packages, directories, directory trees.. with filters that can work from paths, names, extensions, regular expressions... > Also, note that editing site.py is a no-no! 
You can create/edit > sitecustomize.py, but you should leave site.py alone! That would work fine. One of the standalone configurations will write a site.py, but that's for a completely self-contained installation (ie, one which will have no conflicts with another Python installation). I'd also note that, for Windows at least, the path-expanding mechanism created by site.py has not caught on. I've got lots installed, and no site-python, site-packages or sitecustomize. - Gordon From guido@CNRI.Reston.VA.US Wed Dec 8 16:23:34 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 08 Dec 1999 11:23:34 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Wed, 08 Dec 1999 11:17:03 EST." <1267453215-32281635@hypernet.com> References: Your message of "Wed, 08 Dec 1999 09:16:07 EST." <1267460464-31845181@hypernet.com> <1267453215-32281635@hypernet.com> Message-ID: <199912081623.LAA04119@eric.cnri.reston.va.us> [me] > > Also, note that editing site.py is a no-no! You can create/edit > > sitecustomize.py, but you should leave site.py alone! [Gordon] > That would work fine. One of the standalone configurations will > write a site.py, but that's for a completely self-contained > installation (ie, one which will have no conflicts with another > Python installation). > > I'd also note that, for Windows at least, the path-expanding > mechanism created by site.py has not caught on. I've got lots > installed, and no site-python, site-packages or sitecustomize. You shouldn't see site-python or site-packages, they only exist on Unix. On Windows, everything is installed in the top Python directory. However you should see .pth files there, which is what site.py looks for. I believe NumPy and PIL use those. 
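For readers unfamiliar with the .pth mechanism mentioned above, here is a minimal sketch, assuming a modern Python 3 rather than the 1.5-era interpreter under discussion. The directory layout and the file name mypkg.pth are hypothetical; site.addsitedir() is the standard-library call that scans a directory for .pth files and appends each existing directory they list to sys.path.

```python
import os
import site
import sys
import tempfile

# Hypothetical layout: a package unzipped somewhere outside the Python directory.
base = tempfile.mkdtemp()
pkg_dir = os.path.join(base, "extras", "mypkg-1.0")
os.makedirs(pkg_dir)

# A .pth file is plain text with one directory per line.
site_dir = os.path.join(base, "site")
os.makedirs(site_dir)
with open(os.path.join(site_dir, "mypkg.pth"), "w") as f:
    f.write(pkg_dir + "\n")

# addsitedir() adds site_dir itself to sys.path, then processes every
# .pth file in it; listed directories that do not exist are skipped.
site.addsitedir(site_dir)

added = any(os.path.abspath(p) == os.path.abspath(pkg_dir) for p in sys.path)
print("package directory on sys.path:", added)
```

At interpreter startup site.py applies the same scan to its own site directories, so a .pth file dropped there has the same effect without any code, and it survives an upgrade that replaces the standard library.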
--Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm@hypernet.com Wed Dec 8 16:55:51 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Wed, 8 Dec 1999 11:55:51 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <199912081623.LAA04119@eric.cnri.reston.va.us> References: Your message of "Wed, 08 Dec 1999 11:17:03 EST." <1267453215-32281635@hypernet.com> Message-ID: <1267450887-32421651@hypernet.com> > [Gordon] > > That would work fine. One of the standalone configurations will > > write a site.py, but that's for a completely self-contained > > installation (ie, one which will have no conflicts with another > > Python installation). > > > > I'd also note that, for Windows at least, the path-expanding > > mechanism created by site.py has not caught on. I've got lots > > installed, and no site-python, site-packages or sitecustomize. [Guido] > You shouldn't see site-python or site-packages, they only exist > on Unix. You mean "they only exist _for_ Unix", (site.py looks for them on Windows). I don't like that. For one thing, modulo a few platform differences, the same mechanism should work for multi-user Unix and Windows LAN installations. And single- user Windows (I know, redundant, even on NT) should be a degenerate case of the above. > On Windows, everything is installed in the top Python > directory. However you should see .pth files there, which is > what site.py looks for. I believe NumPy and PIL use those. No NumPy, no PIL, no .pth files. 99% of everything out there just says "unzip this somewhere on your Python path". In this case, Jim Ahlstrom may be right - there are too many options, or at least an insufficiently emphasized "proper" method. Until I worked out my own way of installing stuff, I used to lose a large number of packages whenever I upgraded my Windows Python. 
Much as I love Mark's stuff (and hesitate to criticize crazy Aussies), I wish there weren't so much special casing here for Windows. And no, I don't have any solutions to this, I'm just griping... - Gordon From guido@CNRI.Reston.VA.US Wed Dec 8 17:07:30 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 08 Dec 1999 12:07:30 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Wed, 08 Dec 1999 11:55:51 EST." <1267450887-32421651@hypernet.com> References: Your message of "Wed, 08 Dec 1999 11:17:03 EST." <1267453215-32281635@hypernet.com> <1267450887-32421651@hypernet.com> Message-ID: <199912081707.MAA04242@eric.cnri.reston.va.us> > [Guido] > > You shouldn't see site-python or site-packages, they only exist > > on Unix. [Gordon] > You mean "they only exist _for_ Unix", (site.py looks for them > on Windows). No it doesn't. The code in site.py only adds site-packages and site-python when os.sep is '/'. RTSL. > I don't like that. For one thing, modulo a few > platform differences, the same mechanism should work for > multi-user Unix and Windows LAN installations. And single- > user Windows (I know, redundant, even on NT) should be a > degenerate case of the above. What do you mean by "the same mechanism should work"? The same mechanism for what? Are you talking about sharing the installed files somehow? > > On Windows, everything is installed in the top Python > > directory. However you should see .pth files there, which is > > what site.py looks for. I believe NumPy and PIL use those. > > No NumPy, no PIL, no .pth files. 99% of everything out there > just says "unzip this somewhere on your Python path". Fair enough. Of course I know about .pth files so I unzipped them elsewhere and added a .pth file pointing there... > In this case, Jim Ahlstrom may be right - there are too many > options, or at least an insufficiently emphasized "proper" > method. 
Until I worked out my own way of installing stuff, I > used to lose a large number of packages whenever I upgraded > my Windows Python. The .pth files are designed for this. Maybe they haven't been explained as well as they should. > Much as I love Mark's stuff (and hesitate to criticize crazy > Aussies), I wish there weren't so much special casing here for > Windows. It's not Mark's fault, it's Microsoft's fault. If you don't do things the way MS wants you to, experienced Windows users will gripe, misunderstand what you do, etc. > And no, I don't have any solutions to this, I'm just griping... Ditto. Understanding the problems is half of the solution though. The problems seem pretty complex! --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm@hypernet.com Wed Dec 8 18:25:50 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Wed, 8 Dec 1999 13:25:50 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <199912081707.MAA04242@eric.cnri.reston.va.us> References: Your message of "Wed, 08 Dec 1999 11:55:51 EST." <1267450887-32421651@hypernet.com> Message-ID: <1267445488-32746429@hypernet.com> [Guido] > No it doesn't. The code in site.py only adds site-packages and > site-python when os.sep is '/'. RTSL. Oops. Missed that. > > I don't like that. For one thing, modulo a few > > platform differences, the same mechanism should work for > > multi-user Unix and Windows LAN installations. And single- user > > Windows (I know, redundant, even on NT) should be a degenerate > > case of the above. > > What do you mean by "the same mechanism should work"? The same > mechanism for what? Are you talking about sharing the installed > files somehow? In the above, "mechanism" basically meant that which creates sys.path. Basically, this came up for me because in standalone configurations (my Installer again), I have to take complete control of sys.path. 
After doing so differently on Windows and Linux, I finally realized that I can do it the same way on both. Which makes me question why they are so different. > The .pth files are designed for this. Maybe they haven't been > explained as well as they should. I'd say "badgered" or "browbeaten" instead of "explained" ;-). > > Much as I love Mark's stuff (and hesitate to criticize crazy > > Aussies), I wish there weren't so much special casing here for > > Windows. > > It's not Mark's fault, it's Microsoft's fault. If you don't do > things the way MS wants you to, experienced Windows users will > gripe, misunderstand what you do, etc. Even MS doesn't do things the way MS says they want you to. I find MS users equally divided between those who scream bloody murder if you touch the registry, and those who scream if you don't. It's not like *nixen suffer from an excessive degree of conformity in preferred installation procedures, but somehow Python survives there... > > And no, I don't have any solutions to this, I'm just griping... > > Ditto. Understanding the problems is half of the solution > though. The problems seem pretty complex! Grumpily agreed ;-). - Gordon From jim@interet.com Wed Dec 8 18:33:51 1999 From: jim@interet.com (James C. Ahlstrom) Date: Wed, 08 Dec 1999 13:33:51 -0500 Subject: [Python-Dev] Linux Journal confirms evil rumor References: Your message of "Wed, 08 Dec 1999 11:17:03 EST." <1267453215-32281635@hypernet.com> <1267450887-32421651@hypernet.com> <199912081707.MAA04242@eric.cnri.reston.va.us> Message-ID: <384EA48F.F5190180@interet.com> I finally got around to reading the current Linux Journal (which just keeps getting better and better) and lo! there was a picture of a familiar face I just couldn't quite.... Oh no! Could it be true? I heard rumors but I refused to believe them until now. The glasses are gone! Guido now looks like an investment banker! The sky is falling! Next will probably be a Python 1.6 as a 27 Meg DLL, and a Python IPO. 
Well, maybe not. Now that I look more closely, he is wearing a black and white and mustard (??MUSTARD) T-shirt which says "You Need Python". At least we ought to make him wear a name tag at IPC8. JimA From fdrake@acm.org Wed Dec 8 18:37:44 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 8 Dec 1999 13:37:44 -0500 (EST) Subject: [Python-Dev] Linux Journal confirms evil rumor In-Reply-To: <384EA48F.F5190180@interet.com> References: <1267453215-32281635@hypernet.com> <1267450887-32421651@hypernet.com> <199912081707.MAA04242@eric.cnri.reston.va.us> <384EA48F.F5190180@interet.com> Message-ID: <14414.42360.309237.967766@weyr.cnri.reston.va.us> James C. Ahlstrom writes: > Oh no! Could it be true? I heard rumors but I refused to > believe them until now. The glasses are gone! Guido now > looks like an investment banker! The sky is falling! I'm afraid this non-distinctive look was introduced at IPC7... it's too bad we can't tell people Python was invented by the guy with the glasses anymore. > Next will probably be a Python 1.6 as a 27 Meg DLL, and > a Python IPO. Well, maybe not. Now that I look more > closely, he is wearing a black and white and mustard > (??MUSTARD) T-shirt which says "You Need Python". It's really the blue & white & orange IPC7 shirt. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Wed Dec 8 18:41:51 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Wed, 8 Dec 1999 13:41:51 -0500 (EST) Subject: [Python-Dev] Linux Journal confirms evil rumor References: <1267453215-32281635@hypernet.com> <1267450887-32421651@hypernet.com> <199912081707.MAA04242@eric.cnri.reston.va.us> <384EA48F.F5190180@interet.com> Message-ID: <14414.42607.701538.783684@anthem.cnri.reston.va.us> >>>>> "JCA" == James C Ahlstrom writes: JCA> Oh no! Could it be true? I heard rumors but I refused to JCA> believe them until now. The glasses are gone! 
Guido now JCA> looks like an investment banker! The sky is falling! He's not the only one who's, like, "gone corporate", but I won't mention any names, so as to protect the guilty. From jim@digicool.com Wed Dec 8 19:03:42 1999 From: jim@digicool.com (Jim Fulton) Date: Wed, 08 Dec 1999 14:03:42 -0500 Subject: [Python-Dev] Linux Journal confirms evil rumor References: <1267453215-32281635@hypernet.com> <1267450887-32421651@hypernet.com> <199912081707.MAA04242@eric.cnri.reston.va.us> <384EA48F.F5190180@interet.com> <14414.42607.701538.783684@anthem.cnri.reston.va.us> Message-ID: <384EAB8E.EBA595B5@digicool.com> "Barry A. Warsaw" wrote: > > He's not the only one who's, like, "gone corporate", but I won't > mention any names, so as to protect the guilty. OK, Buzz. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From tim_one@email.msn.com Thu Dec 9 05:31:52 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 9 Dec 1999 00:31:52 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <199912081207.HAA00040@eric.cnri.reston.va.us> Message-ID: <000301bf4206$b39e5b80$36a2143f@tim> [Guido] > [Great analysis, Tim!] I beg to differ: it's internally inconsistent and should have identified at least 3 axes and hence at least 8 cases. Still, you got more than you paid for . >> 4) The audience is Python end-users "in general", and the >> product is pure Python. I think this is the most important one >> for Distutils to address, and compilation isn't a part of it. 
>> So far, though, what Gordon is doing seems more appropriate >> than what Distutils has been up to. I hope his work gets folded >> into this. > I'm not sure what stuff by which Gordon you're referring to. You guessed right! > I am only familiar with his installer, which I thought is win32 > only (but I may be mistaken) and is an installer for a whole > application, not just a bunch of modules. Please correct me if > I'm wrong. If it can install a whole app, what makes you suspect it couldn't install just a bunch of modules <0.5 wink>? It started life as Windows-only, and I believe it's been virtually ignored by non-Windows folk because of that. Bad blind spot. It supplies already-working approaches to many of the issues that are still being *talked* about on Distutils (at least archive formats, code to manipulate same, manifest files (how do you tell the tool which files to package?), and transparently bundling a Python interpreter when needed). > But this reminds me of a different issue, which Jim Ahlstrom has > been hammering about before: there's a completely separate set of > cases where what you are distributing is a stand-alone application, > and the target consists of end users who are entirely uninterested > in whether it's written in Python, C or Elvish. I include part of that in my case #4 above, where the app happens to be written in Pure Python -- but the user doesn't have to know that. Gordon is addressing at least that part of it. AFAIK he can't deal with transparently compiling C or exorcising Elvish on the target platform, but if you're just distributing the binaries I expect his work is directly usable already. > (And then there's still the distinction between Win32, Unix or > both.) I vote "both". The world really doesn't need another Win32-only (or Unix-only) installer, archive format, compression format, or distribution model. 
Jim seems mostly interested in Win32-only to me, and his concerns haven't been about the mechanics of distribution but about how-- regardless of tool --to create a bulletproof Python installation by hook or by crook. Last time we went thru this, it was concluded that one couldn't without patching the Python Windows binary with a resource editor (to point to its own infernal <0.5 wink> registry entries). Distutils hasn't talked about that at all (that I've seen, anyway); if there were a less radical approach to that, I suspect Jim would be delighted to use one of the commercial Win32 installation pkgs (and if that's what his customers expect, delighted or not that's what he'll do). > The current distutils tools don't deal with this at all. That's why I said I thought what Gordon is doing seems more appropriate to case #4 than what Distutils has been doing. > I think it should though, Ditto. > and I think its framework is powerful enough to be able to > add this, e.g. as a new "appdist" command. I cordially invite (since Gordon will uncordially browbeat) people to look seriously at what he's done. Best I can tell, for apps that don't need compilation "on the other end", it's mostly "there" already! give-the-man-a-hand-ly y'rs - tim From tim_one@email.msn.com Thu Dec 9 05:52:23 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 9 Dec 1999 00:52:23 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <1267453215-32281635@hypernet.com> Message-ID: <000601bf4209$90a90c80$36a2143f@tim> > http://starship.python.org/crew/gmcm/installer.html Eh? Doesn't work for me.
This does: http://starship.python.net/crew/gmcm/distribute.html From tim_one@email.msn.com Thu Dec 9 06:38:54 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 9 Dec 1999 01:38:54 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <199912081707.MAA04242@eric.cnri.reston.va.us> Message-ID: <000701bf4210$10925a40$36a2143f@tim> [Gordon] >> Much as I love Mark's stuff (and hesitate to criticize crazy >> Aussies), I wish there weren't so much special casing here for >> Windows. [Guido] > It's not Mark's fault, it's Microsoft's fault. If you don't do > things the way MS wants you to, experienced Windows users will > gripe, misunderstand what you do, etc. Something just occurred to me: MS's guidelines aren't arbitrary, they actually have very good reasons. In the case of putting all an app's crucial info in the Registry, it's the only way to allow a site administrator to set policy and site options remotely (an admin can fiddle other machines' registries remotely). This works very well indeed when there's only "one copy" of an app on a machine (or at most one copy "per user"). What just occurred to me is that JimA is concerned with *not* letting any info from a previously-installed Python affect the app he's installing. Similarly, Gordon's Win32 "standalone installer" modifies python.exe and pythonw.exe to use a PYTHONPATH he forces, leaving the registry out of it. Similarly, the woes I've had in trying to sell Python as a general Win32 scripting tool at work mostly boil down to that there's no effortless way to do it that doesn't risk picking up info from-- or forcing info onto --pre-existing or future distinct Python installations (in contrast, Perl "just works" in this respect). IOW, the three of us find getting path info out of the registry intolerable because we are in fact trying to do the opposite of what the registry mechanism was *designed* for: we want perfect isolation, not perfect sharing. 
This has come up on Python-Help a few times too, in the guise of someone installing a product that in turn installs an older version of Python, which in turn confuses another product that relies on features in a newer version of Python. So while the traditional Windows .ini file (like Unix this-or-that.rc file) model was replaced by the registry for excellent reasons, those reasons don't apply to the way we're using Python! The .ini file model was exactly right for what most of us seem to want to do, and the registry model is exactly wrong. just-thought-i'd-cheer-you-up-ly y'rs - tim From skip@mojam.com (Skip Montanaro) Thu Dec 9 07:38:36 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 9 Dec 1999 01:38:36 -0600 (CST) Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <000701bf4210$10925a40$36a2143f@tim> References: <199912081707.MAA04242@eric.cnri.reston.va.us> <000701bf4210$10925a40$36a2143f@tim> Message-ID: <14415.23676.775163.786028@dolphin.mojam.com> Tim> So while the traditional Windows .ini file (like Unix Tim> this-or-that.rc file) model was replaced by the registry for Tim> excellent reasons, those reasons don't apply to the way we're using Tim> Python! The .ini file model was exactly right for what most of us Tim> seem to want to do, and the registry model is exactly wrong. Alright! Now I understand what all the hubbub is about! My eyes have mostly been glazing over trying to follow all this Windows registry/path/ini stuff. MS believes that Python is the application. Those of us writing Python programs view those programs as the applications, not the Python interpreter per se. Is there some way that people writing applications in Python can set up registry entries that are specific to their application (e.g. tabnanny.py) instead of only specific to the Python interpreter? 
Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From gmcm@hypernet.com Thu Dec 9 14:17:27 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Thu, 9 Dec 1999 09:17:27 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <000701bf4210$10925a40$36a2143f@tim> References: <199912081707.MAA04242@eric.cnri.reston.va.us> Message-ID: <1267374045-37047016@hypernet.com> [Guido] > > It's not Mark's fault, it's Microsoft's fault. If you don't do > > things the way MS wants you to, experienced Windows users will > > gripe, misunderstand what you do, etc. [Tim] > Something just occurred to me: MS's guidelines aren't arbitrary, > they actually have very good reasons. In the case of putting all > an app's crucial info in the Registry, it's the only way to allow > a site administrator to set policy and site options remotely (an > admin can fiddle other machines' registries remotely). This > works very well indeed when there's only "one copy" of an app on > a machine (or at most one copy "per user"). And actually, the business about separate subtrees for the machine's configuration and the user's configuration is pretty clever. MS doesn't explain it well, and it gets misused, but when done right, it's a lot simpler than the maze of .xxxrc files you sometimes find in other OSes. > What just occurred to me is that JimA is concerned with *not* > letting any info from a previously-installed Python affect the > app he's installing. Similarly, Gordon's Win32 "standalone > installer" modifies python.exe and pythonw.exe to use a > PYTHONPATH he forces, leaving the registry out of it. 
Similarly, > the woes I've had in trying to sell Python as a general Win32 > scripting tool at work mostly boil down to that there's no > effortless way to do it that doesn't risk picking up info from-- > or forcing info onto --pre-existing or future distinct Python > installations (in contrast, Perl "just works" in this respect). In my Linux version, I went to the heart of the matter - getpath.c. It occurs to me that getpath.c might do better to follow a normal bootstrap process - ie, create the absolute minimal sys.path required to go to the next step. Then the rest of what goes on in getpath.c could be written in Python. Maybe that Python code needs to get frozen in (to prevent bozos from destroying an installation by stepping on getpath.py), but it would make it a lot easier to create independent installations, and also reduce the variations between platforms at the C level. (Then again, I've never heard of anyone stepping on exceptions.py.) If some registry manipulation primitives were exposed (say, through ntpath) that would mean that Windows developers could (if they wanted) play by the MS rules with at least the option of not stepping on each other. - Gordon From jim@interet.com Thu Dec 9 15:02:18 1999 From: jim@interet.com (James C. Ahlstrom) Date: Thu, 09 Dec 1999 10:02:18 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000301bf4206$b39e5b80$36a2143f@tim> Message-ID: <384FC47A.BB4DA517@interet.com> Tim Peters wrote: > Jim seems mostly interested in Win32-only to me, and his concerns haven't > been about the mechanics of distribution but about how-- regardless of > tool --to create a bulletproof Python installation by hook or by crook. Not exactly. I am interested in how to create a bullet-proof installation. But I am equally interested in Unix (especially Linux) and dislike the current dichotomy in the code base. Lately I have been more active in distribution via archive files. 
Part of the solution is an archive file format which is identical on Unix and Windows, and which can hold the Python library and packages as single files. For my own efforts on this see: ftp://ftp.interet.com/pub/pylib.html This is an archive file format similar to Gordon's format, although Gordon's work goes well beyond just file formats. I currently have fifth generation code for this format, and am adding features as suggested by Fredrik Lundh. I hope it gets considered as a candidate for a Python standard format. > Distutils hasn't talked about that at all (that I've seen, anyway); Gordon, Greg Stein and I have discussed file formats before. I think it was on distutils. Anyway that was months ago. JimA From guido@CNRI.Reston.VA.US Thu Dec 9 16:17:18 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 11:17:18 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Thu, 09 Dec 1999 09:17:27 EST." <1267374045-37047016@hypernet.com> References: <199912081707.MAA04242@eric.cnri.reston.va.us> <1267374045-37047016@hypernet.com> Message-ID: <199912091617.LAA05742@eric.cnri.reston.va.us> > [Guido] > > > It's not Mark's fault, it's Microsoft's fault. If you don't do > > > things the way MS wants you to, experienced Windows users will > > > gripe, misunderstand what you do, etc. > [Tim] > > Something just occurred to me: MS's guidelines aren't arbitrary, > > they actually have very good reasons. In the case of putting all > > an app's crucial info in the Registry, it's the only way to allow > > a site administrator to set policy and site options remotely (an > > admin can fiddle other machines' registries remotely). This > > works very well indeed when there's only "one copy" of an app on > > a machine (or at most one copy "per user"). [Gordon] > And actually, the business about separate subtrees for the > machine's configuration and the user's configuration is pretty > clever.
MS doesn't explain it well, and it gets misused, but > when done right, it's a lot simpler than the maze of .xxxrc files > you sometimes find in other OSes. I agree. And I am guilty of not even trying to find MS' explanation -- I just looked in the registry at what other apps did and tried to mimic that (plus what Mark had already done), without really knowing what I was doing. I now know a little better -- see the end of this message. > In my Linux version, I went to the heart of the matter - > getpath.c. It occurs to me that getpath.c might do better to > follow a normal bootstrap process - ie, create the absolute > minimal sys.path required to go to the next step. Then the > rest of what goes on in getpath.c could be written in Python. > Maybe that Python code needs to get frozen in (to prevent > bozos from destroying an installation by stepping on > getpath.py), but it would make it a lot easier to create > independent installations, and also reduce the variations > between platforms at the C level. (Then again, I've never heard > of anyone stepping on exceptions.py.) Yes, this is exactly what was proposed in the thread on the Big Import Rewrite. > If some registry manipulation primitives were exposed (say, > through ntpath) that would mean that Windows developers > could (if they wanted) play by the MS rules with at least the > option of not stepping on each other. That's a good idea. These functions are already available through Mark's win32api extension -- much of which will eventually (I hope before 1.6 is out!) become part of the core distribution. In the meantime, I've been thinking a bit more about how Python should be using the Windows registry. (It's clear to me that Python should use the registry -- those who disagree can go build their own Python distribution.) The basic ideas of Python's current registry usage are sound: there's a resource built into the DLL which is part of the key into the registry used for all information.
The problem lies in which key is used. All versions of Python 1.5.x (1.5, 1.5.1, 1.5.2) use the same key! This is a main cause of trouble, because it means that different versions cannot peacefully live together even if the user installs them into different directories -- they will all use the registry keys of the last version installed. This, in turn, means that someone who writes a Python application that has a dependency on a particular Python version (and which application worth distributing doesn't :-) cannot trust that if a Python installation is present, it is the right one. But they also cannot simply bundle the standard installer for the correct Python version with their program, because its installation would overwrite an existing Python application, thus breaking some *other* Python apps that the user might already have installed. (There's a solution for app builders who are willing to do a lot of work -- you can change the registry key resource in the DLL. For example, Alice comes with its own version of Python 1.5.1 and it uses "1.5.1-alice" as its registry key. The Alice installer installs Python in a subdirectory of the Alice installation directory and points the 1.5.1-alice registry entries there. The problem is that this is a lot of work for the average app builder.) I thought a bit about how VB solves this. I think that when you wrap up a VB app, all the support code (mostly a big DLL) is wrapped with it. When the user runs the installer, the DLL is installed (probably in the WINDOWS directory). If a user installs several VB apps built with the same VB version, they all attempt to install the exact same DLL; of course the installers notice this and optimize it away, keeping a reference count. (Ignoring for now the fact that those reference counts don't always work!) If an app built with a different VB version is installed, it has a DLL with a different name, and that is installed separately.
Other support files, I presume, are dealt with in much the same way. Voila, there's the theory. How can we do something similar for Python? An app written in Python should need to install only three or four files: - a driver EXE to start the app - a copy of the Python DLL - the Python library in an archive - the app code in an archive The latter two could be combined into a single archive, but I propose that we use two archives so that the DLL and the Python library archive can be shared between installations of independent Python apps as long as they use the exact same Python version and don't need additional 3rd party packages. (I believe that Jim A's proposal combines the archives with the EXE and the DLL, reducing the number of files to two. That's fine too.) Is there a use for the registry here at all? Maybe not. (I notice that VB seems to have a single registry entry, pointing to a DLL; all other VB files also seem to live there.) Complications: - Some apps may need a custom extension module, which has to be installed as a PYD file. So it seems that there needs to be a directory per app, and perhaps per version of the app (if the app distributor cares). - Some apps need other, non-pyc files (e.g. data tables or help files); it would be handy if these could be stored in the archives as well. - Some standard extension modules are in their own PYD files; these also need to be installed. They aren't typically marked with a version, so perhaps a path directory per version of Python (if not per installed app) is wise. - How to distribute an app that needs 3rd party stuff, e.g. Tcl/Tk, or PIL, or NumPy? Their Python code can easily be wrapped up in another archive with a standard name incorporating a version number; but the required PYD and DLL files are a separate story. (E.g. for Tkinter, you need _tkinter.pyd which links against tcl80.dll.)
Basically the same solution as for standard PYD files can work; the needed DLL files can be installed either systemwide (if they have a reliable version number in their name, like tcl80.dll) or in the per-app or per-package directory (like NumPy). - Presumably, the archives will contain PYC files only. This means that tracebacks will not show source code, only line numbers. For Jim A, this is probably exactly what he wants (if the user gets a traceback, his "robust app" has miserably failed, and he takes pride in the fact that this doesn't happen). But for some others, access to the sources could be essential. For example, I might want to distribute IDLE using this mechanism; users of IDLE who are curious about the standard library (or about IDLE itself) should be able to open the source for an arbitrary module (and maybe even edit it, although that's not a priority and perhaps should even be discouraged). Library source access is an important feature of the IDLE debugger as well. A way out for IDLE is to install a classic distribution of the Python library sources, into the filesystem at an IDLE-specific location. Other apps, with only the need for source code in tracebacks, might choose to have the PY files in the archives sitting next to the PYC files, and somehow the traceback mechanism should be accessing the archive to get a hold of the source. And yes, I realize that Jim A's latest offering solves most of these problems to a large extent -- well done. (Jim, would you care to comment on the issues that you don't address? Will you address them in a future version?) Final notes: There are two different problems here. One is how to distribute Python apps robustly to end users who don't particularly care about Python. This is Jim A's problem (and he has a solution that works for him). In general the solutions here try to isolate the installed app from other Python installations.
I'm proposing that at least the DLL and the Python library archive can probably be shared between apps without reducing robustness if we keep track more carefully of version numbers. The other problem is how to distribute packages of Python and extension modules for use by Python users. These typically need to drop into some existing Python installation. This is Paul Dubois' problem with NumPy (amongst others) and is the current focus of the distutils SIG. However I believe that there could be a lot of common infrastructure that would help us create better solutions for both problems. For package distribution, common infrastructure (a.k.a. standards) is essential. For app distribution, common infrastructure isn't so important (since the solutions strive for total isolation, there's no problem if different apps use different solutions). However, this changes when app creators want to distribute robust self-sufficient apps that use 3rd party packages -- then the 3rd party packages must allow being packaged up using the app distribution creator of choice. Solving this compound problem (creating package distributions that can be redistributed easily as part of robust Python app distributions) should be an important goal for the infrastructure we're building here. The Big Import Rewrite ought to add this to its list of objectives if it isn't already on it. My guess is that the solution for this compound problem will increase the dependency of app distribution tools on the package distribution infrastructure; which to me seems like a Good Thing because it would lead to more code sharing. --Guido van Rossum (home page: http://www.python.org/~guido/) From jim@interet.com Thu Dec 9 16:24:40 1999 From: jim@interet.com (James C.
Ahlstrom) Date: Thu, 09 Dec 1999 11:24:40 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000701bf4210$10925a40$36a2143f@tim> Message-ID: <384FD7C8.12832BF1@interet.com> Tim Peters wrote: > Something just occurred to me: MS's guidelines aren't arbitrary, they > actually have very good reasons. In the case of putting all an app's > crucial info in the Registry, it's the only way to allow a site > administrator to set policy and site options remotely (an admin can fiddle > other machines' registries remotely). This works very well indeed when > there's only "one copy" of an app on a machine (or at most one copy "per > user"). The registry is still a bad idea because it lumps critical and app data into single files and brings up the ugly problem of protecting individual registry entries instead of just files. Microsoft should have put all app config into the app directory and provided for remote admin of that. But that is not really your point (just ranting about the registry again). > IOW, the three of us find getting path info out of the registry intolerable > because we are in fact trying to do the opposite of what the registry > mechanism was *designed* for: we want perfect isolation, not perfect > sharing. > > This has come up on Python-Help a few times too, in the guise of someone > installing a product that in turn installs an older version of Python, which > in turn confuses another product that relies on features in a newer version > of Python. Or, in other words, no isolation is possible if critical info depends on global data like PYTHONPATH or a _common_ registry entry. We could have different registry entries, but this is confusing and not documented. I think we can solve this with archive files in a way compatible with Unix without going off on a Windows-only wavelength. If the archive file contains everything, and it is in the dir of the app, and the app looks there and finds it, then it Just Works. 
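[Editorial sketch, not part of the original message.] The "archive in the app's own directory" idea can be illustrated with a present-day Python, where a zip archive placed on sys.path is importable. This is an assumption about modern interpreters, not the pylib format described in the thread; the names app.zip and greeting are hypothetical.

```python
import os
import sys
import tempfile
import zipfile

# Hypothetical layout: the archive "app.zip" lives in the app's own directory.
app_dir = tempfile.mkdtemp()
archive = os.path.join(app_dir, "app.zip")

# Pack a single module into the archive.
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("greeting.py", "def hello():\n    return 'hello from the archive'\n")

# The app looks in its own directory, finds the archive, and imports from it.
# No PYTHONPATH or registry entry is consulted for this module.
sys.path.insert(0, archive)
import greeting

message = greeting.hello()
print(message)
```

Nothing global is touched here: the only search path entry the module depends on is the archive sitting next to the application.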
See also my reply to Skip. JimA From akuchlin@mems-exchange.org Thu Dec 9 16:32:08 1999 From: akuchlin@mems-exchange.org (Andrew M. Kuchling) Date: Thu, 9 Dec 1999 11:32:08 -0500 (EST) Subject: [Python-Dev] Missing POSIX functions: the list Message-ID: <199912091632.LAA09236@amarok.cnri.reston.va.us> After poking around in the O'Reilly POSIX book, here's a list of POSIX functions that don't seem to be available in Python. Not all of them seem worth supporting. Ironically, Greg Ward's daemonize() Perl subroutine, which started me on this, doesn't actually seem to need anything that Python doesn't have. I'm looking for corrections to the list; are there other POSIX functions I've missed, or are some of them actually in Python? I think implementing most of these functions is straightforward, with the exception of opendir/readdir/closedir. Worth adding? ============= opendir(), readdir(), closedir() -- most of their functionality is available through os.listdir(), but it might be useful to have a direct interface. Downside is that this would require a new extension type for the C DIR struct. My (lazy) inclination is to not bother. 
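[Editorial sketch, not part of the original message.] For comparison, this is roughly what os.listdir() already gives you without a DIR-like object. The helper name list_entries and the temporary directory are made up for the illustration.

```python
import os
import tempfile

def list_entries(path):
    """Return the entry names in `path`, sorted."""
    # os.listdir() returns every name in one list and leaves out the
    # special entries "." and "..", so no open/read/close handle is needed.
    return sorted(os.listdir(path))

# Build a small throwaway directory to enumerate.
demo_dir = tempfile.mkdtemp()
for name in ("b.txt", "a.txt"):
    with open(os.path.join(demo_dir, name), "w") as f:
        f.write("x")

entries = list_entries(demo_dir)
print(entries)
```

What a direct opendir()/readdir() interface would add is incremental reading of very large directories, which is the trade-off against the extra extension type mentioned above.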
Worth adding: ============= abort() -- used in Py_FatalError(), but not accessible to Python code ctermid(), ctermid_r() -- returns the terminal pathname -- probably just add ctermid(), but use ctermid_r() for thread-safety fpathconf(fd, name) -- Get configuration limit for a file -- would need constants from unistd.h getlogin() -- returns user's login name -- could do something similar with pwd.getpwuid( os.getuid() )[0], but getlogin() apparently looks in utmp getgroups(gidsetsize, grouplist) -- Gets supplementary group IDs pathconf(path, name) -- Gets config variables for a path -- would need constants from unistd.h sysconf(int name) -- Gets system configuration information -- would need constants from unistd.h Not worth adding: ================= clearerr() -- looks like fileobjects call clearerr() before raising errors cuserid() -- returns user's login name -- ORA book says "Do not use this function" -- removed in 1990 POSIX difftime -- seems only required in C "because no addition properties are defined for time_t" (Solaris man page) tmpfile(), tmpnam() -- Create temp file, generate temp filename -- Similar functionality available in tempfile.py mblen(), mbstowcs(), mbtowc(), wcstombs(), wctomb() -- Multi-byte character functions: -- Don't bother; wait for the Unicode type. -- A.M. Kuchling http://starship.python.net/crew/amk/ I'm sorry I became abusive just now ... calling you worms... I was just speaking relatively, you understand. -- Dekko, in ZOT! #3 From jcw@equi4.com Thu Dec 9 16:38:13 1999 From: jcw@equi4.com (Jean-Claude Wippler) Date: Thu, 09 Dec 1999 17:38:13 +0100 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> Message-ID: <384FDAF5.C25C447C@equi4.com> "James C. Ahlstrom" wrote: [...] > ftp://ftp.interet.com/pub/pylib.html Ouch - what's wrong with zip archives? 
There are utilities to convert to/from zip, to re-pack, to mount zip transparently so its entries look like regular files, FTP servers, etc. Both Java (jar) and Tcl (Jan Nijtman's "Wrap") have adopted this format. Zips would seem natural with JPython. And suppose that scripting ever starts to consolidate to a common scripting kernel (yah, well), do you really want a system which is closing all doors to cross-fertilization? Zip has an advantage over .tar.gz in that its table of contents is available without having to decompress the whole kaboodle. Your format has no checksum, which for deployment and long-term storage can be important. If you want a marshalled TOC, then why not add a manifest entry for it, sort of like what ranlib does with ar? You designed the format so archives can be concatenated without any tool (other than "cat"), but this works just as well with zip files, as the Tcl Wrap approach demonstrates. Allow me to very, very loosely paraphrase Guido here: sure, everyone can design an archive format, but they are likely to make the same mistakes all over again - so why not adopt a format which is tried and tested? With all due respect - I sincerely hope you will reconsider and alter your code to work with zip files. It's probably a small adjustment? Unless your *intent* is to create a diverging standard, of course... -- Jean-Claude From jim@interet.com Thu Dec 9 16:46:35 1999 From: jim@interet.com (James C. Ahlstrom) Date: Thu, 09 Dec 1999 11:46:35 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <199912081707.MAA04242@eric.cnri.reston.va.us> <000701bf4210$10925a40$36a2143f@tim> <14415.23676.775163.786028@dolphin.mojam.com> Message-ID: <384FDCEB.2226C1C1@interet.com> Skip Montanaro wrote: > MS believes that Python is the application. Those of us writing > Python programs view those programs as the applications, not the Python > interpreter per se. I think this is a good point.
Windows app programmers (mostly) view Python as part of their app and try to install it in their app directory. Unix installs Python as a system app in multiple versions and users use PATH to pick a version. Unix users view the Python interpreter as a system service which is needed for running their app. I think this is because a Windows app is a visual program, and the Python release compiles to a console app (not really a visual program). So all (?most) Windows Python apps are custom mains with Python as a component, but the stock python.exe is not the main. This makes it difficult to document a way to install Python in the Unix fashion, since all apps need their own binary main and python15.dll is the only thing in common. IMHO archive files can solve this a lot more simply. JimA From guido@CNRI.Reston.VA.US Thu Dec 9 16:55:40 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 11:55:40 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Thu, 09 Dec 1999 17:38:13 +0100." <384FDAF5.C25C447C@equi4.com> References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> Message-ID: <199912091655.LAA05928@eric.cnri.reston.va.us> > "James C. Ahlstrom" wrote: > > [...] > > ftp://ftp.interet.com/pub/pylib.html Jean-Claude Wippler replied: > Ouch - what's wrong with zip archives? > > There are utilities to convert to/from zip, to re-pack, to mount zip > transparently so its entries look like regular files, FTP servers, etc. > > Both Java (jar) and Tcl (Jan Nijtman's "Wrap") have adopted this format. > > Zips would seem natural with JPython. And suppose that scripting ever > starts to consolidate to a common scripting kernel (yah, well), do you > really want a system which is closing all doors to cross-fertilization? > > Zip has an advantage over .tar.gz in that its table of contents is > available without having to decompress the whole kaboodle.
> > Your format has no checksum, which for deployment and long-term storage > can be important. > > If you want a marshalled TOC, then why not add a manifest entry for it, > sort of like what ranlib does with ar? > > You designed the format so archives can be concatenated without any tool > (other than "cat"), but this works just as well with zip files, as the > Tcl Wrap approach demonstrates. > > Allow me to very, very loosely paraphrase Guido here: sure, everyone can > design an archive format, but they are likely to make the same mistakes > all over again - so why not adopt a format which is tried and tested? > > With all due respect - I sincerely hope you will reconsider and alter > your code to work with zip files. It's probably a small adjustment? > > Unless your *intent* is to create a diverging standard, of course... Exactly my sentiments. We have rough Python code to deal with zip files; it's very rough because we got kind of carried away adding features and ended up with spaghetti code :-( But it's working code nevertheless and we're offering it up for anyone in this group to clean up (we could do that ourselves but it's not high on our current priority list). I don't know anything about Tcl Wrap. I do know a great deal about the ZIP format, but apparently I missed the concatenation feature. How does this work? Does that work for all zip tools, or just for the ZIP reader in Wrap? (I looked up how Jim A does it -- his central directory at the end of the file contains the total size of the data covered by that directory, so he seeks back to the beginning of it and sees if another magic number precedes it; and so on. Very simple.) I quickly looked at the Wrap page; it shows how to access data files stored in the archive. Question: does the wrap::open code go out to the regular filesystem if it finds there's no wrap archive? That would be handy so you can test the code in its unwrapped form without change. Python needs this too. 
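[Editorial sketch, not part of the original message.] The fallback asked about here, reading from the archive when there is one and from the regular filesystem otherwise, might look like the following. The function open_resource and the file names are hypothetical; this is not how Wrap or any tool in this thread is implemented.

```python
import io
import os
import tempfile
import zipfile

def open_resource(name, archive=None, root="."):
    """Open `name` for reading as bytes.

    If `archive` names a zip file that contains `name`, read it from there;
    otherwise fall back to the plain file `root/name`.
    """
    if archive is not None and os.path.isfile(archive) and zipfile.is_zipfile(archive):
        with zipfile.ZipFile(archive) as zf:
            try:
                return io.BytesIO(zf.read(name))
            except KeyError:
                pass  # not in the archive; fall through to the filesystem
    return open(os.path.join(root, name), "rb")

# Demo: the same call works for the wrapped and the unwrapped form of an app.
work = tempfile.mkdtemp()
with open(os.path.join(work, "table.dat"), "wb") as f:
    f.write(b"from disk")
archive = os.path.join(work, "app.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("table.dat", b"from archive")

with open_resource("table.dat", archive, root=work) as f:
    wrapped = f.read()
with open_resource("table.dat", None, root=work) as f:
    unwrapped = f.read()
print(wrapped, unwrapped)
```

The calling code is identical in both cases, which is the property that lets an app be tested unwrapped and shipped wrapped.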
--Guido van Rossum (home page: http://www.python.org/~guido/) From gward@cnri.reston.va.us Thu Dec 9 17:12:00 1999 From: gward@cnri.reston.va.us (Greg Ward) Date: Thu, 9 Dec 1999 12:12:00 -0500 Subject: [Python-Dev] Missing POSIX functions: the list In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Thu, Dec 09, 1999 at 11:32:08AM -0500 References: <199912091632.LAA09236@amarok.cnri.reston.va.us> Message-ID: <19991209121159.B20179@cnri.reston.va.us> On 09 December 1999, Andrew M. Kuchling said: > After poking around in the O'Reilly POSIX book, here's a list of POSIX > functions that don't seem to be available in Python. Not all of them > seem worth supporting. Ironically, Greg Ward's daemonize() Perl > subroutine, which started me on this, doesn't actually seem to need > anything that Python doesn't have. I think I already pointed this your way, but don't forget the man page for Perl's POSIX module: "perldoc POSIX". I suspect POSIX functions that don't make sense in Perl also don't make sense in Python. I agree with all your assessments about what's worth adding and what's not, and that {close,read,open}dir() are questionable and probably not worth the bother. Random thoughts: > abort() -- used in Py_FatalError(), but not accessible to Python code Would this do the same as in C, ie. terminate the process and dump core? > getlogin() -- returns user's login name > -- could do something similar with pwd.getpwuid( os.getuid() )[0], but > getlogin() apparently looks in utmp With a documentation proviso that utmp is very old-fashioned, and you really should do the getuid() thing unless you definitely want to get the login ID from utmp. Perhaps an alternate "getlogin" (different name?) that does the getuid() thing could be provided. 
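[Editorial sketch, not part of the original message.] The two ways of finding the login name can be compared side by side. This assumes a Unix system and a present-day Python in which os.getlogin() exists; the helper names are made up.

```python
import os
import pwd  # Unix only

def name_from_uid():
    """Login name for the process's real user ID, looked up in the
    password database; None if that database has no entry for the uid."""
    try:
        return pwd.getpwuid(os.getuid())[0]
    except KeyError:
        return None

def name_from_login():
    """Name reported by getlogin(), which depends on the login session;
    None when there is no controlling terminal (e.g. in a daemon)."""
    try:
        return os.getlogin()
    except OSError:
        return None

print("from uid:  ", name_from_uid())
print("from login:", name_from_login())
```

The two can legitimately differ, for example after switching users with su, which is why the uid-based lookup is the safer default.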
Greg From guido@CNRI.Reston.VA.US Thu Dec 9 17:16:03 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 12:16:03 -0500 Subject: [Python-Dev] Missing POSIX functions: the list In-Reply-To: Your message of "Thu, 09 Dec 1999 12:12:00 EST." <19991209121159.B20179@cnri.reston.va.us> References: <199912091632.LAA09236@amarok.cnri.reston.va.us> <19991209121159.B20179@cnri.reston.va.us> Message-ID: <199912091716.MAA06063@eric.cnri.reston.va.us> > > getlogin() -- returns user's login name > > -- could do something similar with pwd.getpwuid( os.getuid() )[0], but > > getlogin() apparently looks in utmp > > With a documentation proviso that utmp is very old-fashioned, and you > really should do the getuid() thing unless you definitely want to get > the login ID from utmp. Perhaps an alternate "getlogin" (different > name?) that does the getuid() thing could be provided. There's the getpass module which has a getuser() function that looks in various env vars and if all else fails uses getuid() and pwd. If the goal is to get the user ID without being fooled, using os.getuid() or os.geteuid() directly seems to be the right thing to do; I don't see the need for a shorthand for pwd.getpwuid(os.getuid())[0] (which is what getuser() uses). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Thu Dec 9 17:18:10 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 12:18:10 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Thu, 09 Dec 1999 10:02:18 EST." <384FC47A.BB4DA517@interet.com> References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> Message-ID: <199912091718.MAA06087@eric.cnri.reston.va.us> [Jim A] > Lately I have been more active in distribution via archive files. 
> Part of the solution is an archive file format which is identical on > Unix and Windows, and which can hold the Python library and packages > as single files. For my own efforts on this see: > > ftp://ftp.interet.com/pub/pylib.html Apart from agreeing with Jean-Claude's rant about inventing a new archive format, I think this is a good proposal because it is very clear about the problem it tries to solve and doesn't get distracted by other issues. I also commend Jim for building upon Greg Stein's imputil (like Gordon did). I wish I could present a solution this simple as The Standard Way, but (as explained in my long post earlier today) there just are so many wrinkles that I'd rather hold out for the Right Solution... But I've taken good notice of Jim's solution. --Guido van Rossum (home page: http://www.python.org/~guido/) From beazley@cs.uchicago.edu Thu Dec 9 17:16:57 1999 From: beazley@cs.uchicago.edu (David Beazley) Date: Thu, 9 Dec 1999 11:16:57 -0600 (CST) Subject: [Python-Dev] Missing POSIX functions: the list References: <199912091632.LAA09236@amarok.cnri.reston.va.us> <19991209121159.B20179@cnri.reston.va.us> Message-ID: <199912091716.LAA15624@gargoyle.cs.uchicago.edu> Greg Ward writes: > > I think I already pointed this your way, but don't forget the man page > for Perl's POSIX module: "perldoc POSIX". I suspect POSIX functions > that don't make sense in Perl also don't make sense in Python. > > I agree with all your assessments about what's worth adding and what's > not, and that {close,read,open}dir() are questionable and probably not > worth the bother. Random thoughts: > I disagree. I think that the POSIX module should strive to be as complete as possible--even if certain functions are closely related to other functionality in the library (tmpfile for instance). I suspect that this sort of thing is probably the cause of the missing functionality in the current library (as in, "why would anyone want to do that?"
when in fact there may be a perfectly good reason in certain situations). > > abort() -- used in Py_FatalError(), but not accessible to Python code > > Would this do the same as in C, ie. terminate the process and dump core? > Sure, why not? This might be a useful thing to do every so often---when trying to figure out what's wrong with a C extension module for instance. Cheers, Dave From jim@interet.com Thu Dec 9 17:43:57 1999 From: jim@interet.com (James C. Ahlstrom) Date: Thu, 09 Dec 1999 12:43:57 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> Message-ID: <384FEA5D.A07F23EC@interet.com> Jean-Claude Wippler wrote: > Ouch - what's wrong with zip archives? Thanks very much for looking over the format. In general Zip archives store whole branches of a file system. A Python ./Lib zip archive would contain: N:/python/Python-1.5.2/Lib/string.pyc N:/python/Python-1.5.2/Lib/os.pyc N:/python/Python-1.5.2/Lib/copy.pyc N:/python/Python-1.5.2/Lib/test/testall.pyc Zip archives are isomorphic to branches of a file system. That means there must be a sys.path for each zip archive file. How would this be specified? The archive format stores modules as dotted names, just as they appear in the import statement. The search path is "." in every archive file by definition. The import statement "import foo" just results in a dictionary lookup for key "foo", not a search through a zip directory along a local search path for "foo.something" where "something" can be pyc, pyo, py, etc. The intent was to link the archives to the import statement, not re-create a directory tree. It borrowed this feature from the archive formats of Greg and Gordon. > There are utilities to convert to/from zip, to re-pack, to mount zip > transparently so it's entries look like regular files, FTP servers, etc. Basic operations (to, from, repack) are easy in Python. 
> Both Java (jar) and Tcl (Jan Nijtman's "Wrap") have adopted this format. Hmmm.... > Your format has no checksum, which for deployment and long-term storage > can be important. Actually the pylib.py "dir()" method reads all *.pyc with marshal, and I am depending on marshal to object to bad data and also out-of-date magic numbers. But this is a good point. > If you want a marshalled TOC, then why not add a manifest entry for it, > sort of like what ranlib does with ar? Sorry, I don't understand. Please explain. > You designed the format so archives can be concatenated without any tool > (other than "cat"), but this works just as well with zip files, as the > Tcl Wrap approach demonstrates. Are you saying that cat zip1.zip zip2.zip > myzip.zip works? An important feature is the ability to concatenate to a binary: cat python.exe zip1.zip > myapp.exe Searching for this isn't fast unless magic numbers are at the end. Are zip files recognizable from the end (I don't know)? > Allow me to very, very loosely paraphrase Guido here: sure, everyone can > design an archive format, but they are likely to make the same mistakes > all over again - so why not adopt a format which is tried and tested? > > With all due respect - I sincerely hope you will reconsider and alter > your code to work with zip files. It's probably a small adjustment? > > Unless your *intent* is to create a diverging standard, of course... The intent is to create a standard but not a diverging standard. Are there any zip experts out there? Can zip files satisfy all the design requirements I listed in pylib.html? Is there zip code available? All my code is in Python. 
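[Editorial sketch, not part of the original message.] On the question of whether zip files are recognizable from the end: the ZIP format puts its end-of-central-directory record last, starting with the signature PK\x05\x06, so a reader can locate the archive from the tail of the file. The sketch below checks that signature and then reads an archive that has been appended to a stand-in "executable". It uses the zipfile module of a present-day Python; the stub bytes and the entry name foo.pyc are invented for the illustration.

```python
import io
import zipfile

# Build a small archive in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("foo.pyc", b"stand-in for compiled code")
zip_bytes = buf.getvalue()

# With no archive comment, the end-of-central-directory record is the
# final 22 bytes of the file and begins with the signature PK\x05\x06.
ends_with_eocd = zip_bytes[-22:-18] == b"PK\x05\x06"

# Equivalent of "cat python.exe zip1.zip > myapp.exe", with a fake stub
# standing in for the real binary.
stub = b"MZ" + b"\x00" * 64
combined = stub + zip_bytes

# The reader finds the directory from the end and compensates for the
# bytes that were placed in front of the archive.
with zipfile.ZipFile(io.BytesIO(combined)) as zf:
    names = zf.namelist()
    payload = zf.read("foo.pyc")

print(ends_with_eocd, names, payload)
```

If an archive comment is present the record is no longer exactly 22 bytes from the end, and a reader has to scan backwards a bounded distance for the signature instead.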
JimA From jcw@equi4.com Thu Dec 9 17:57:33 1999 From: jcw@equi4.com (Jean-Claude Wippler) Date: Thu, 09 Dec 1999 18:57:33 +0100 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us> Message-ID: <384FED8D.3C535D38@equi4.com> Guido van Rossum wrote: > > [... my not-really-meant-as-rant about adopting zip as format ...] > [zip concatenation feature] > How does this work? Does that work for all zip tools, or just for the > ZIP reader in Wrap? (I looked up how Jim A does it -- his central > directory at the end of the file contains the total size of the data > covered by that directory, so he seeks back to the beginning of it and > sees if another magic number precedes it; and so on. Very simple.) Same for Wrap. Standard tools would not see the preceding ZIP groups. In terms of maintenance, I'd avoid this trick. I merely wanted to point out that zip archives can be stacked, if the reader is set up to it. > Question: does the wrap::open code go out to the regular filesystem > if it finds there's no wrap archive? That would be handy so you can > test the code in its unwrapped form without change. IIRC, Wrap overrides "open" for embedded entries as "file.zip/abc.py". There's more being developed in this area: a "virtual file system" which lets you mount archives and such (VFS by Matt Newman, mentioned with his permission), so that the file-system model can be extended to navigate into a lot more things than real file systems. Andrew Kuchling's post hints at another tangent: opendir/readdir is of course simply an enumeration. There's a lot of "genericity" lurking in scanning across file systems, trees, networks, and resources in general. The filesystem <-> OO dichotomy needs a review. > Python needs this too. 
Concepts like these have a lot to offer - and would make even more sense if they were done in a way which benefits multiple scripting languages. Feel free to reply by email if you ever want to further discuss this. -- Jean-Claude From fdrake@acm.org Thu Dec 9 18:10:44 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 9 Dec 1999 13:10:44 -0500 (EST) Subject: [Python-Dev] Missing POSIX functions: the list In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us> References: <199912091632.LAA09236@amarok.cnri.reston.va.us> Message-ID: <14415.61604.415084.520092@weyr.cnri.reston.va.us> Andrew M. Kuchling writes: > After poking around in the O'Reilly POSIX book, here's a list of POSIX > functions that don't seem to be available in Python. Not all of them > seem worth supporting. Ironically, Greg Ward's daemonize() Perl I think your assessment is reasonable. I looked at posixmodule.c and note also that the functions use PyArg_Parse() and PyArg_NoArgs() instead of using PyArg_ParseTuple(). The advantage of PyArg_ParseTuple() is that the name of the function can be specified for inclusion in TypeError messages when the arguments are not of the right type. I'm doing some work to correct this now. I've also added ctermid(), and will try to add at least a few more before I check in the changes. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Thu Dec 9 18:17:35 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Thu, 9 Dec 1999 13:17:35 -0500 (EST) Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us> <384FED8D.3C535D38@equi4.com> Message-ID: <14415.62015.856931.750279@anthem.cnri.reston.va.us> >>>>> "JW" == Jean-Claude Wippler writes: JW> Same for Wrap. 
Standard tools would not see the preceding ZIP JW> groups. JW> In terms of maintenance, I'd avoid this trick. I merely JW> wanted to point out that zip archives can be stacked, if the JW> reader is set up to it. I agree. I can't recall the details now, but I had a lot of problems with zip concatenation in JPython. I think at least some of the older Java tools for grokking zips don't work with concatenation. -Barry From guido@CNRI.Reston.VA.US Thu Dec 9 18:21:42 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 13:21:42 -0500 Subject: [Python-Dev] Virtual filesystem APIs In-Reply-To: Your message of "Thu, 09 Dec 1999 18:57:33 +0100." <384FED8D.3C535D38@equi4.com> References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us> <384FED8D.3C535D38@equi4.com> Message-ID: <199912091821.NAA06209@eric.cnri.reston.va.us> Jean-Claude Wippler: > There's more being developed in this area: a "virtual file system" which > lets you mount archives and such (VFS by Matt Newman, mentioned with his > permission), so that the file-system model can be extended to navigate > into a lot more things than real file systems. I agree. We have experimented with this a bunch in the Knowbot software, where we have some code that wants to look at a "filesystem" but could be talking to some kind of filesystem emulation across an RPC connection or alternatively could be accessing a zip file. Our conclusion is that a convenient interface is modeled after (a subset of) the os and os.path functionality. In fact, the only thing you would need to add to the os module would be a function to open a file object; I've proposed to add os.fopen() as an alias for the built-in open(). The idea that you could mount one VFS inside another is nice, although I'm not sure how practical it is. For one thing, in our fs code, os.path.sep and friends (e.g.
os.path.normcase behavior) were set per filesystem; what would happen if you mounted a Unix filesystem in an NT tree? Doing the translations is hard too; e.g. on a Mac fs, the separator is ':' and a '/' can be part of a filename -- do you simply swap them? What if a Mac file has both '/' and '\' and you mount it on a Windows FS? I'd rather stay away from this. On the other hand the VFS concept could be used as a totally different solution to the sys.importers vs. sys.path > Andrew Kuchling's post hints at another tangent: opendir/readdir is of > course simply an enumeration. There's a lot of "genericity" lurking in > scanning across file systems, trees, networks, and resources in general. I'd still rather see listdir() (which our sample virtual FS API supported). I don't think it necessarily makes sense to do this on a more generic basis -- other trees and graphs have sufficiently different semantics that using a FS like API doesn't necessarily cut it. Take for example the Windows registry -- looks a lot like a filesystem, doesn't it? Yet it has one fundamental property that a typical FS doesn't: directory nodes can have data *and* children... I've written a tree widget and found that it's remarkably hard to come up with a workable API to talk to trees *in general*. Trees are a universal concept, but code sharing is still elusive... Perhaps because the concept is so simple? > The filesystem <-> OO dichotomy needs a review. I think that my proposal above should cover this. (We looked briefly at doing a similar thing for Java, and found that it's actually harder there -- they have all these nice objects representing paths, but it's not easily subclassable to represent paths in some virtual filesystem.) > Concepts like these have a lot to offer - and would make even more sense > if they were done in a way which benefits multiple scripting languages. > Feel free to reply by email if you ever want to further discuss this. 
I see only very little hope for this point of view, but I will refrain from commenting further. --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm@hypernet.com Thu Dec 9 18:23:14 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Thu, 9 Dec 1999 13:23:14 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <384FEA5D.A07F23EC@interet.com> Message-ID: <1267359311-37934097@hypernet.com> James C. Ahlstrom wrote: > Jean-Claude Wippler wrote: > > > Ouch - what's wrong with zip archives? > In general Zip archives store whole branches of a file > system. > The archive format stores modules as dotted names, just as they > appear in the import statement. The search path is "." in every > archive file by definition. The import statement "import foo" > just results in a dictionary lookup for key "foo", not a search > through a zip directory along a local search path for > "foo.something" where "something" can be pyc, pyo, py, etc. > > The intent was to link the archives to the import statement, not > re-create a directory tree. It borrowed this feature from the > archive formats of Greg and Gordon. As I've stated before, I have 2 archive formats. This may seem a needless complication, but my suspicion is that sooner or later, people will want 2 different kinds. One is a .pyz format, which corresponds closely to Jim's .pyl format (with a number of minor differences: it's compressed, the archive as a whole has the Python magic number, instead of each entry, and it's not designed for concatenation). The other is like a zip, and probably should be zip format. It's designed to hold _anything_, and can be manipulated from C and from Python. It can be concatenated and / or embedded (and the inner one opened without extraction). Its table of contents is more file-system like. Importing from one is slower, but that's not really what it's for. It's for packaging up arbitrary resources.
Like .pyz's, or Tcl/Tk for Tkinter apps, or configuration files. Jim is correct that a good importer (which can say "No, it's not mine" as quickly as possible) is better satisfied by a simple dictionary lookup than fooling with file extensions and directories (virtual or real). > > If you want a marshalled TOC, then why not add a manifest entry > > for it, sort of like what ranlib does with ar? > > Sorry, I don't understand. Please explain. The table of contents is just another entry. > An important feature is the ability to concatenate to a binary: > cat python.exe zip1.zip > myapp.exe > Searching for this isn't fast unless magic numbers are at the > end. Are zip files recognizable from the end (I don't know)? Where do you think we got this idea? > Are there any zip experts out there? Can zip files satisfy all > the design requirements I listed in pylib.html? Is there zip > code available? All my code is in Python. Hmm. My bookmark appears to be dead (I was there not long ago): http://www.cubic.org/source/archive/fileform/packers/appnote.txt There have been several references on this list to Guido et al having some Python / zip code. - Gordon From guido@CNRI.Reston.VA.US Thu Dec 9 18:23:27 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 13:23:27 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Thu, 09 Dec 1999 13:17:35 EST." <14415.62015.856931.750279@anthem.cnri.reston.va.us> References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us> <384FED8D.3C535D38@equi4.com> <14415.62015.856931.750279@anthem.cnri.reston.va.us> Message-ID: <199912091823.NAA06243@eric.cnri.reston.va.us> > I agree. I can't recall the details now, but I had a lot of problems > with zip concatenation in JPython. I think at least some of the older > Java tools for grokking zips don't work with concatenation.
The Java "jar" tool mostly ignores the central directory -- it seems to read the archive from the front, using the local header records, and ignoring the central directory (of course it writes one when it creates an archive). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Thu Dec 9 18:32:15 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 13:32:15 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Thu, 09 Dec 1999 12:43:57 EST." <384FEA5D.A07F23EC@interet.com> References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <384FEA5D.A07F23EC@interet.com> Message-ID: <199912091832.NAA06287@eric.cnri.reston.va.us> > In general Zip archives store whole branches of a file > system. A Python ./Lib zip archive would contain: > > N:/python/Python-1.5.2/Lib/string.pyc > N:/python/Python-1.5.2/Lib/os.pyc > N:/python/Python-1.5.2/Lib/copy.pyc > N:/python/Python-1.5.2/Lib/test/testall.pyc > > Zip archives are isomorphic to branches of a file system. > That means there must be a sys.path for each zip archive file. > How would this be specified? Not true. It's easy (using the proper Zip tools) to create an archive containing this instead: string.pyc os.pyc copy.pyc testall.pyc Thus the entire archive is considered the directory. The Java "jar" tool uses this approach. It's also easy to have packages in there (again this is what Java does): test/ test/__init__.pyc test/pystone.pyc test_support.pyc (etc.) > The archive format stores modules as dotted names, just as they > appear in the import statement. The search path is "." in every > archive file by definition. The import statement "import foo" > just results in a dictionary lookup for key "foo", not a search > through a zip directory along a local search path for "foo.something" > where "something" can be pyc, pyo, py, etc.
> > The intent was to link the archives to the import statement, not > re-create a directory tree. It borrowed this feature from > the archive formats of Greg and Gordon. Maybe you've gone overboard. The time it takes to translate the dots into slashes really isn't the big deal. > Are there any zip experts out there? Can zip files satisfy all the > design requirements I listed in pylib.html? Is there zip code > available? All my code is in Python. Yes (all of us here at CNRI), yes, yes (we have the spaghetti code). While zip files support compression, they support uncompressed files as well and we could go either way. Their most popular compression format is gzip compatible and can be read and written with the zlib module, which is in the standard Python distribution (even on Windows) -- though to build it you need the zlib C library which is of course external (but solid open source). --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Thu Dec 9 18:41:22 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 9 Dec 1999 13:41:22 -0500 (EST) Subject: [Python-Dev] Virtual filesystem APIs In-Reply-To: <199912091821.NAA06209@eric.cnri.reston.va.us> References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us> <384FED8D.3C535D38@equi4.com> <199912091821.NAA06209@eric.cnri.reston.va.us> Message-ID: <14415.63442.92911.748132@weyr.cnri.reston.va.us> Guido van Rossum writes: > os.path.sep and friends (e.g. os.path.normcase behavior) were set per Hah! Caught you in public! "sep" & friends are defined in the os module; this is where the separation breaks down. I think these should be located in os.path, and os can just pick them up from there to be backward compatible. os.pathsep is a problem, somewhat; it is related to os.sep, but is very different in many ways. I don't think there's a good way to deal with it. 
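The difference between the two separators can be seen in a few lines. This is a minimal sketch, assuming a current Python, in which the posixpath and ntpath modules each carry their own values and os picks its names up from os.path; it says nothing about the 1.5.2 layout under discussion. The separators() helper is hypothetical, written only for this illustration.

```python
import os
import ntpath
import posixpath

# sep splits the components of one path; pathsep splits a list of paths
# (as in the PATH environment variable). They are related but distinct.
def separators(pathmod):
    """Return (sep, pathsep) for a path-flavour module such as posixpath."""
    return pathmod.sep, pathmod.pathsep

unix_seps = separators(posixpath)
windows_seps = separators(ntpath)

# In a current Python, os re-exports the values that os.path defines.
same_module_values = (os.sep == os.path.sep and os.pathsep == os.path.pathsep)

print(unix_seps, windows_seps, same_module_values)
```

Because each path module carries its own pair, code written against a "filesystem" object could ask that object for its separators rather than assuming the host's.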
> filesystem; what would happen if you mounted a Unix filesystem in an > NT tree? Doing the translations is hard too; e.g. on a Mac fs, the > separator is ':' and a '/' can be part of a filename -- do you simply > swap them? What if a Mac file has both '/' and '\' and you mount it > on a Windows FS? I'd rather stay away from this. And this is tightly related to the sep/pathsep problem as well. I agree, we should stay away from it. > I think that my proposal above should cover this. (We looked briefly > at doing a similar thing for Java, and found that it's actually harder > there -- they have all these nice objects representing paths, but it's > not easily subclassable to represent paths in some virtual But it was easy to create a set of interfaces with a reasonable API; getting back to the "typical" Java classes was what really changed the most. For those of us not working on the KOE: I set up Filesystem and FSFile interfaces; the Filesystem represented the entire filesystem and the FSFile was very similar to the java.io.File class, but had additional methods to get input and output stream objects (of the standard Java flavor); all the buffering and such could be wrapped on top of that just like any other Java I/O. The specific application was to provide access to an isolated directory structure which untrusted code "owned", but ensured that parent directories were unreachable. Additional security checks can be worked into such a structure as applicable. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From fdrake@acm.org Thu Dec 9 19:06:32 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 9 Dec 1999 14:06:32 -0500 (EST) Subject: [Python-Dev] posix module test suite Message-ID: <14415.64952.780974.8124@weyr.cnri.reston.va.us> There's not a test for the posix or os modules; if anyone would like to contribute one, this would be a good time! ;-) -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives From jcw@equi4.com Thu Dec 9 20:51:11 1999 From: jcw@equi4.com (Jean-Claude Wippler) Date: Thu, 09 Dec 1999 21:51:11 +0100 Subject: [Python-Dev] Virtual filesystem APIs References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us> <384FED8D.3C535D38@equi4.com> <199912091821.NAA06209@eric.cnri.reston.va.us> Message-ID: <3850163F.80BDCB75@equi4.com> Guido van Rossum wrote: > [... horrors of cross-OS mounts and ":\/" separators ...] I agree, this has some very hairy sides to it. But VFS is really more about mounting non-FS things in a "root" FS (presumably the real one). > On the other hand the VFS concept could be used as a totally different > solution to the sys.importers vs. sys.path Heck, I'll be the "enfant terrible" once more: yes, and this stuff could well be implemented generically across scripting languages. Of course the act of "importing" is a very Pythonic issue - but FS/VFS traversal and the actual shared library load need not be. Anyway, enough of that. > Take for example the Windows registry -- looks a lot like a > filesystem, doesn't it? Yet it has one fundamental property that a > typical FS doesn't: directory nodes can have data *and* children... What you're saying is that dir = set-of-subdirs + set-of-files, and that this is a more general requirement than plain FS's. Doesn't that simply mean that the more general model is needed as basis to handle both? > Trees are a universal concept, but code sharing is still elusive... Ah, but think of the implications: archives, networks, XML, the world! -- Jean-Claude From fdrake@acm.org Thu Dec 9 21:16:00 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 9 Dec 1999 16:16:00 -0500 (EST) Subject: [Python-Dev] forwarded message from Fred L. 
Drake Message-ID: <14416.7184.255000.342231@weyr.cnri.reston.va.us> --KHBYcjBZ+r Content-Type: text/plain; charset=us-ascii Content-Description: message body text Content-Transfer-Encoding: 7bit OK, I've checked in some changes to the posix module to add support for a few of the POSIX interfaces Andrew expressed interest in seeing (and some he said weren't such a good idea, or at least not necessary, but about which I decided I disagreed after all). For those of you who aren't on the checkins list (??), I've attached the message so you'll know what functions were added. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives --KHBYcjBZ+r Content-Type: message/rfc822 Content-Description: forwarded message Received: from cnri.reston.va.us (ns.CNRI.Reston.VA.US [132.151.1.1]) by weyr.cnri.reston.va.us (8.9.1b+Sun/8.9.1) with SMTP id QAA22917 for ; Thu, 9 Dec 1999 16:13:16 -0500 (EST) Received: from dinsdale.python.org (dinsdale [132.151.1.21]) by cnri.reston.va.us (8.9.1a/8.9.1) with ESMTP id QAA01352; Thu, 9 Dec 1999 16:12:41 -0500 (EST) Received: from dinsdale.python.org (dinsdale.python.org [132.151.1.21]) by dinsdale.python.org (Postfix) with ESMTP id 710BB1CE73; Thu, 9 Dec 1999 16:12:39 -0500 (EST) Delivered-To: python-checkins@dinsdale.python.org Received: from python.org (parrot.python.org [132.151.1.90]) by dinsdale.python.org (Postfix) with ESMTP id EA9681CE71 for ; Thu, 9 Dec 1999 16:12:37 -0500 (EST) Received: from cnri.reston.va.us (ns.CNRI.Reston.VA.US [132.151.1.1] (may be forged)) by python.org (8.9.1a/8.9.1) with ESMTP id QAA14229 for ; Thu, 9 Dec 1999 16:12:38 -0500 (EST) Received: from weyr.cnri.reston.va.us (weyr [132.151.1.174]) by cnri.reston.va.us (8.9.1a/8.9.1) with ESMTP id QAA01348 for ; Thu, 9 Dec 1999 16:12:37 -0500 (EST) Received: (from fdrake@localhost) by weyr.cnri.reston.va.us (8.9.1b+Sun/8.9.1) id QAA22913 for python-checkins@python.org; Thu, 9 Dec 1999 16:13:10 -0500 (EST) Message-Id: 
<199912092113.QAA22913@weyr.cnri.reston.va.us> Errors-To: python-checkins-admin@python.org X-BeenThere: python-checkins@python.org X-Mailman-Version: 1.2 (experimental) Precedence: bulk List-Id: Check-in messages from the Python maintainers Content-Length: 1821 From: "Fred L. Drake" Sender: python-checkins-admin@python.org To: python-checkins@python.org Subject: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.115,2.116 Date: Thu, 9 Dec 1999 16:13:10 -0500 (EST) MIME-Version: 1.0 Update of /projects/cvsroot/python/dist/src/Modules In directory weyr:/home/fdrake/projects/python/Modules Modified Files: posixmodule.c Log Message: Added support for abort(), ctermid(), tmpfile(), tempnam(), tmpnam(), and TMP_MAX. Converted all functions that used PyArg_Parse() or PyArg_NoArgs() to use PyArg_ParseTuple() and specified all function names using the :name syntax in the format strings, to allow better error messages when TypeError is raised for parameter type mismatches. Index: posixmodule.c =================================================================== RCS file: /projects/cvsroot/python/dist/src/Modules/posixmodule.c,v retrieving revision 2.115 retrieving revision 2.116 diff -u -C2 -r2.115 -r2.116 *** posixmodule.c 1999/10/19 13:29:23 2.115 --- posixmodule.c 1999/12/09 21:13:07 2.116 *************** *** 432,442 **** static PyObject * ! posix_int(args, func) PyObject *args; int (*func) Py_FPROTO((int)); { int fd; int res; ! if (!PyArg_Parse(args, "i", &fd)) return NULL; [...1720 lines suppressed...] 
#endif + #ifdef HAVE_TEMPNAM + {"tempnam", posix_tempnam, METH_VARARGS, posix_tempnam__doc__}, + #endif + #ifdef HAVE_TMPNAM + {"tmpnam", posix_tmpnam, METH_VARARGS, posix_tmpnam__doc__}, + #endif + {"abort", posix_abort, METH_VARARGS, posix_abort__doc__}, {NULL, NULL} /* Sentinel */ }; *************** *** 3426,3429 **** --- 3586,3592 ---- if (ins(d, "X_OK", (long)X_OK)) return -1; #endif + #ifdef TMP_MAX + if (ins(d, "TMP_MAX", (long)TMP_MAX)) return -1; + #endif #ifdef WNOHANG if (ins(d, "WNOHANG", (long)WNOHANG)) return -1; _______________________________________________ Python-checkins mailing list Python-checkins@python.org http://www.python.org/mailman/listinfo/python-checkins --KHBYcjBZ+r-- From guido@CNRI.Reston.VA.US Thu Dec 9 21:19:57 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 16:19:57 -0500 Subject: [Python-Dev] forwarded message from Fred L. Drake In-Reply-To: Your message of "Thu, 09 Dec 1999 16:16:00 EST." <14416.7184.255000.342231@weyr.cnri.reston.va.us> References: <14416.7184.255000.342231@weyr.cnri.reston.va.us> Message-ID: <199912092119.QAA06731@eric.cnri.reston.va.us> > OK, I've checked in some changes to the posix module to add support > for a few of the POSIX interfaces Andrew expressed interest in seeing > (and some he said weren't such a good idea, or at least not necessary, > but about which I decided I disagreed after all). I wish you'd made your disagreement public before checking it in... But it's not too late... --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin@mems-exchange.org Thu Dec 9 21:32:26 1999 From: akuchlin@mems-exchange.org (Andrew M. Kuchling) Date: Thu, 9 Dec 1999 16:32:26 -0500 (EST) Subject: [Python-Dev] forwarded message from Fred L. Drake In-Reply-To: <14416.7184.255000.342231@weyr.cnri.reston.va.us> References: <14416.7184.255000.342231@weyr.cnri.reston.va.us> Message-ID: <14416.8170.18298.33796@amarok.cnri.reston.va.us> Fred L. Drake, Jr. 
writes (in a CVS checkin): >Added support for abort(), ctermid(), tmpfile(), tempnam(), tmpnam(), >and TMP_MAX. For those of you following along, the tmpfile(), tempnam(), tmpnam() functions were ones I listed as probably not worth adding. On the other hand, David Beazley wrote: > I think that the POSIX module should strive to be as >complete as possible--even if certain functions are closely related >other functionality in the library (tmpfile for instance). I suspect ... and that's a good point, too. The POSIX functions may provide adaptability that a Python analog doesn't; for example, you could read /etc/passwd in pure Python, but that wouldn't handle NIS or shadow passwords. So I guess I'll vote for completeness over lack of overlap; leave tmpfile() & friends in. -- A.M. Kuchling http://starship.python.net/crew/amk/ This supports reflection, which is the 90s way of writing self-modifying code. -- John Aycock at IPC7, during his parsing talk From guido@CNRI.Reston.VA.US Thu Dec 9 21:38:42 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 16:38:42 -0500 Subject: [Python-Dev] forwarded message from Fred L. Drake In-Reply-To: Your message of "Thu, 09 Dec 1999 16:32:26 EST." <14416.8170.18298.33796@amarok.cnri.reston.va.us> References: <14416.7184.255000.342231@weyr.cnri.reston.va.us> <14416.8170.18298.33796@amarok.cnri.reston.va.us> Message-ID: <199912092138.QAA06790@eric.cnri.reston.va.us> > ... and that's a good point, too. The POSIX functions may provide > adaptability that a Python analog doesn't; for example, you could read > /etc/passwd in pure Python, but that wouldn't handle NIS or shadow > passwords. So I guess I'll vote for completeness over lack of > overlap; leave tmpfile() & friends in. OK, I agree now. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Thu Dec 9 22:30:52 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Thu, 9 Dec 1999 17:30:52 -0500 (EST) Subject: [Python-Dev] Missing POSIX functions: the list In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us> References: <199912091632.LAA09236@amarok.cnri.reston.va.us> Message-ID: <14416.11676.888918.511932@weyr.cnri.reston.va.us> Andrew M. Kuchling writes: > After poking around in the O'Reilly POSIX book, here's a list of POSIX Ok, here's my comments on the remainder of these. > Worth adding? > ============= > opendir(), readdir(), closedir() -- > most of their functionality is available through > os.listdir(), but it might be useful to have a direct > interface. Downside is that this would require a new > extension type for the C DIR struct. My (lazy) inclination > is to not bother. [rewinddir() and seekdir() should be considered as well, where supported.] There's more tedium than anything in implementing a new C type. I'm a little concerned that there might not be any real value here, but it's hard to be sure about that. Is there any real reason not to use os.listdir(). > Worth adding: > ============= ... > fpathconf(fd, name) -- Get configuration limit for a file > -- would need constants from unistd.h This is mostly a matter of setting up the constants; not hard, just more distracting than I want to deal with right now. > getlogin() -- returns user's login name > -- could do something similar with pwd.getpwuid( os.getuid() )[0], but > getlogin() apparently looks in utmp Per Guido's comments, I'm not sure how valuable it is. It may make sense strictly for completeness, but I've never heard of utmp being considered reliable in any way. Maybe I'm too new at all this. > getgroups(gidsetsize, grouplist) -- Gets supplementary group IDs This should be easy enough. > pathconf(path, name) -- Gets config variables for a path > -- would need constants from unistd.h (Same as for fpathconf().) 
> sysconf(int name) -- Gets system configuration information > -- would need constants from unistd.h > > Not worth adding: > ================= Aside from the ones I've already added, I agree. ;) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From jim@digicool.com Thu Dec 9 23:31:40 1999 From: jim@digicool.com (Jim Fulton) Date: Thu, 09 Dec 1999 18:31:40 -0500 Subject: [Python-Dev] Thankyou for fsync :) Message-ID: <38503BDC.CB91FB29@digicool.com> I found recently that I needed fsync and was pleasantly surprized to find that it is provided in the posix module, where available. Can I count on it staying in the posix module, when available, for the forseeable future? Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From gstein@lyra.org Fri Dec 10 00:32:33 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 9 Dec 1999 16:32:33 -0800 (PST) Subject: [Python-Dev] Missing POSIX functions: the list In-Reply-To: <14416.11676.888918.511932@weyr.cnri.reston.va.us> Message-ID: On Thu, 9 Dec 1999, Fred L. Drake, Jr. wrote: > Andrew M. Kuchling writes: >... > > opendir(), readdir(), closedir() -- > > most of their functionality is available through > > os.listdir(), but it might be useful to have a direct > > interface. Downside is that this would require a new > > extension type for the C DIR struct. My (lazy) inclination > > is to not bother. > > [rewinddir() and seekdir() should be considered as well, where > supported.] > > There's more tedium than anything in implementing a new C type. 
I'm > a little concerned that there might not be any real value here, but > it's hard to be sure about that. Is there any real reason not to use > os.listdir(). No need to do a new type. Just wrap the DIR* into a PyCObject. Add a magic number if you're worried about mixing CObjects. Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido@CNRI.Reston.VA.US Fri Dec 10 02:03:04 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 21:03:04 -0500 Subject: [Python-Dev] Thankyou for fsync :) In-Reply-To: Your message of "Thu, 09 Dec 1999 18:31:40 EST." <38503BDC.CB91FB29@digicool.com> References: <38503BDC.CB91FB29@digicool.com> Message-ID: <199912100203.VAA07410@eric.cnri.reston.va.us> > I found recently that I needed fsync and was pleasantly surprized > to find that it is provided in the posix module, where available. > > Can I count on it staying in the posix module, when available, > for the forseeable future? Since we seem to be on an adding spree, I don't see why not -- as long as POSIX keeps it available :) --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@mojam.com (Skip Montanaro) Fri Dec 10 06:28:56 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 10 Dec 1999 00:28:56 -0600 (CST) Subject: [Python-Dev] posix module test suite In-Reply-To: <14415.64952.780974.8124@weyr.cnri.reston.va.us> References: <14415.64952.780974.8124@weyr.cnri.reston.va.us> Message-ID: <14416.40360.611743.143624@dolphin.mojam.com> Fred> There's not a test for the posix or os modules; if anyone would Fred> like to contribute one, this would be a good time! ;-) Not having ever written any tests for the core Python modules, it seems natural to ask if there are any guidelines for the construction of such tests or the test equivalent of the Modules/xxmodule.c file. Are there standard behaviors expected for passing and failing a test? 
Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From tim_one@email.msn.com Fri Dec 10 08:48:59 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 10 Dec 1999 03:48:59 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <14415.23676.775163.786028@dolphin.mojam.com> Message-ID: <000501bf42eb$66529860$412d153f@tim> [Skip Montanaro] > Alright! Now I understand what all the hubbub is about! My eyes have > mostly been glazing over trying to follow all this Windows > registry/path/ini stuff. MS believes that Python is the application. > Those of us writing Python programs view those programs as the > applications, not the Python interpreter per se. Eww -- that's a helpful and insightful way to put it, Skip! Now maybe *I* can understand what the hubbub is about . > Is there some way that people writing applications in Python can set > up registry entries that are specific to their application (e.g. > tabnanny.py) instead of only specific to the Python interpreter? Yes, but they can't get Python to look at those before it's too late. I spent a whole evening a month or two ago just trying to figure out where all the cruft in my Windows sys.path *came* from. This is out-of-the-box; I haven't added anything myself: ['', 'D:\\Python\\win32', 'D:\\Python\\win32\\lib', 'D:\\Python', 'D:\\Python\\Pythonwin', 'D:\\Python\\Lib\\plat-win', 'D:\\Python\\Lib', 'D:\\Python\\DLLs', 'D:\\Python\\Lib\\lib-tk', 'D:\\PYTHON\\DLLs', 'D:\\PYTHON\\lib', 'D:\\PYTHON\\lib\\plat-win', 'D:\\PYTHON\\lib\\lib-tk', 'D:\\PYTHON'] That's bizarre on the face of it, and tracking it all down was draining. I've forgotten the details. I do remember concluding that it was impossible to do what I wanted to do without changing the implementation, though, and nobody on Python-Dev disputed that at the time. 
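The redundancy in that list is easier to see once case differences are folded away. A minimal sketch, assuming a current Python and using ntpath.normcase as a stand-in for Windows' case-insensitive comparison; the split_duplicates() helper is hypothetical, written only for this illustration.

```python
import ntpath

# The sys.path reported above, copied from the message.
REPORTED_PATH = [
    '', 'D:\\Python\\win32', 'D:\\Python\\win32\\lib', 'D:\\Python',
    'D:\\Python\\Pythonwin', 'D:\\Python\\Lib\\plat-win', 'D:\\Python\\Lib',
    'D:\\Python\\DLLs', 'D:\\Python\\Lib\\lib-tk', 'D:\\PYTHON\\DLLs',
    'D:\\PYTHON\\lib', 'D:\\PYTHON\\lib\\plat-win', 'D:\\PYTHON\\lib\\lib-tk',
    'D:\\PYTHON',
]

def split_duplicates(entries):
    """Return (unique, repeats), treating paths that differ only in case as equal.

    ntpath.normcase lower-cases a path and normalizes slashes, which is
    close enough to how Windows compares directory names for this purpose.
    """
    seen = set()
    unique, repeats = [], []
    for entry in entries:
        key = ntpath.normcase(entry)
        if key in seen:
            repeats.append(entry)
        else:
            seen.add(key)
            unique.append(entry)
    return unique, repeats

unique, repeats = split_duplicates(REPORTED_PATH)
print(len(unique), "distinct entries;", len(repeats), "repeats:")
for entry in repeats:
    print("  ", entry)
```

Run against the list above, the last five entries turn out to be case variants of directories that already appear earlier.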
In a pragmatic crunch, I wrote the little app I needed to distribute at the time in Perl instead, meaning to come back to this. I haven't had time. IIRC, the ultimate problem wasn't really that Python looked at the registry to get *some* path info, it was a combination of A) It looked at the registry so early that it was impossible to stop it from executing whatever site.py the registry pointed at (well, I could with the -S option -- but then there was no way to get it to do the site.py that was *wanted* instead). B) No way to override what was in the registry; e.g., I was greatly surprised to discover that setting a PYTHONPATH envar didn't override anything, it simply plunked the PYTHONPATH entries into sys.path along with everything else -- and too late to stop anything anyway. In a long msg I haven't yet read all the way thru, Guido at least suggested associating different registry path info with different Python versions. That would address a number of otherwise currently intractable problems. I suspect it still wouldn't help with the problem I was facing, though. That is, I wanted to be able to tell people to run \\dragres01\mrec\reduce\python \\dragres01\mrec\reduce\reduce.py which is just a Windows way of saying "run a Python executable from a shared network location". When they tried that, though, the network Python looked in *their* individual registries for its Python path info, and some of the hackers with mondo customized Python setups on their own machines watched things go down in flames. This certainly can't be a common problem, but it speaks to an unforgiving rigidity in the current approach. There seemed to be nothing I could do to guarantee this would work, short of telling users to edit their registries before running this tool (that's a non-starter on Windows -- editing the registry is dangerous) or putting a customized Python on the network pointing to a bogus registry key (it was faster to write the app in Perl! 
Perl doesn't *try* to be so infernally helpful , so doesn't get in the way either). I'm left wondering what purpose putting Python library path info into the Windows registry serves. Is there anyone on Windows who *doesn't* have their Python Lib/ etc as direct subdirectories of the directory containing python.exe? Not that I've seen. Python puts *those* in sys.path too -- but only after it (in the normal case; see my sys.path above) pulls identically redundant paths out of the registry first, or (in the cases we're griping about) pulls irrelevant or downright harmful paths out of the registry first (paths appropriate to the last Python you *installed*, not to the Python that's *running*!). Perhaps all this cruft is needed to support embedded Python, though (something I've never done). Regardless, I expect it would have been enough for me if PYTHONPATH simply worked the way I mistakenly assumed it would (that is, this is sys.path, and that's *it*; feel free to prepend the current directory when initialization is complete, but before then looking at any file not reached from PYTHONPATH is verboten). the-cleverer-the-code-the-more-vital-that-there-be-a-way-to- short-circuit-it-ly y'rs - tim From jim@interet.com Fri Dec 10 12:16:31 1999 From: jim@interet.com (James C. Ahlstrom) Date: Fri, 10 Dec 1999 07:16:31 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000501bf42eb$66529860$412d153f@tim> Message-ID: <3850EF1F.158445B6@interet.com> Tim Peters wrote: > > [Skip Montanaro] > > Is there some way that people writing applications in Python can set > > Yes, but they can't get Python to look at those before it's too late. I > spent a whole evening a month or two ago just trying to figure out where all > the cruft in my Windows sys.path *came* from. This is out-of-the-box; I > ..... Excellent discussion Tim! > I suspect it still wouldn't help with the problem I was facing, though. 
> That is, I wanted to be able to tell people to run > > \\dragres01\mrec\reduce\python \\dragres01\mrec\reduce\reduce.py > > which is just a Windows way of saying "run a Python executable from a shared > network location". When they tried that, though, the network Python looked > in *their* individual registries for its Python path info, and some of the > hackers with mondo customized Python setups on their own machines watched > things go down in flames. I think a sensible way to run little apps is to put everything in an archive file including the main.py. On Windows you concatenate that to python.exe, and it Just Works. > Windows registry serves. Is there anyone on Windows who *doesn't* have > their Python Lib/ etc as direct subdirectories of the directory containing > python.exe? Not that I've seen. Point on the curve. We don't. We freeze everything except the main.py. JimA From jim@interet.com Fri Dec 10 13:38:28 1999 From: jim@interet.com (James C. Ahlstrom) Date: Fri, 10 Dec 1999 08:38:28 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> Message-ID: <38510254.ED15D32B@interet.com> Jean-Claude Wippler wrote: > Ouch - what's wrong with zip archives? > With all due respect - I sincerely hope you will reconsider and alter > your code to work with zip files. It's probably a small adjustment? OK, you talked me into it. Ya, small adjustment, no problem ;-) JimA From jack@oratrix.nl Fri Dec 10 13:51:10 1999 From: jack@oratrix.nl (Jack Jansen) Date: Fri, 10 Dec 1999 14:51:10 +0100 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: Message by "James C. Ahlstrom" , Thu, 09 Dec 1999 12:43:57 -0500 , <384FEA5D.A07F23EC@interet.com> Message-ID: <19991210135111.2F83C370CF2@snelboot.oratrix.nl> Is it possible nowadays to have two files with the same name but different paths (i.e.
foo/bar.py and foo/spam/bar.py) in the same archive? That's the one thing that always struck me as very very silly about zipfiles. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From gmcm@hypernet.com Fri Dec 10 14:28:51 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Fri, 10 Dec 1999 09:28:51 -0500 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: <19991210135111.2F83C370CF2@snelboot.oratrix.nl> References: Message by "James C. Ahlstrom" , Thu, 09 Dec 1999 12:43:57 -0500 , <384FEA5D.A07F23EC@interet.com> Message-ID: <1267287023-386248@hypernet.com> Jack Jansen asks: > Is it possible nowadays to have two files with the same name but > different paths (i.e. foo/bar.py and foo/spam/bar.py) in the same > archive? Depends on how you do it. If the user imports foo.spam.bar, an importer will be asked for: foo (return foo.__init__) foo.spam (return foo.spam.__init__) foo.spam.bar (return foo.spam.bar) But the API allows lots of variations. This is another possible interaction: foo (return None) foo.__init__ (return foo.__init__) foo.spam (return None) foo.spam.__init__ (return foo.spam.__init__) foo.spam.bar (return foo.spam.bar) Or, by looking at different args to get_code, you could look at the requests as: foo in context of None spam in context of foo bar in context of foo.spam With another variation where the request for __init__ becomes explicit. The first way seems the natural way for archives, and makes it easy to keep foo.spam.bar distinct from foo.bar. > That's the one thing that always struck me as very very silly > about zipfiles. Huh?
- Gordon From guido@CNRI.Reston.VA.US Fri Dec 10 14:51:39 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 10 Dec 1999 09:51:39 -0500 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: Your message of "Fri, 10 Dec 1999 14:51:10 +0100." <19991210135111.2F83C370CF2@snelboot.oratrix.nl> References: <19991210135111.2F83C370CF2@snelboot.oratrix.nl> Message-ID: <199912101451.JAA07786@eric.cnri.reston.va.us> > Is it possible nowadays to have two files with the same name but different > paths (i.e. foo/bar.py and foo/spam/bar.py) in the same archive? > > That's the one thing that always struck me as very very silly about zipfiles. Zip files contain the full path, there's no problem with that. Was there ever? --Guido van Rossum (home page: http://www.python.org/~guido/) From jack@oratrix.nl Fri Dec 10 14:52:26 1999 From: jack@oratrix.nl (Jack Jansen) Date: Fri, 10 Dec 1999 15:52:26 +0100 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: Message by "Gordon McMillan" , Fri, 10 Dec 1999 09:28:51 -0500 , <1267287023-386248@hypernet.com> Message-ID: <19991210145227.01F99370CF2@snelboot.oratrix.nl> > Jack Jansen asks: > > > Is it possible nowadays to have two files with the same name but > > different paths (i.e. foo/bar.py and foo/spam/bar.py) in the same > > archive? > > Depends on how you do it. Apparently I mis-phrased my question, I'll try again. When people suggested to use zip format as the standard Python archive format I was a bit worried, becuase I've had it happen to me various times that I was unable to create a ZIP archive with two files with the same name but different paths (i.e. create an archive of a directory that contains both a foo/bar.py and a foo/spam/bar.py). So, my question was: has this happened to me because the winzip I used was braindead, or is there possibly a problem with the ZIP file format that disallows two files with the same name in one archive? 
Most zip programs I've seen also seem to present filenames as the primary metaphore, with full pathnames somewhat "tacked on". If the latter is the case I wonder whether zip is the right format to use... -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From guido@CNRI.Reston.VA.US Fri Dec 10 15:00:51 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 10 Dec 1999 10:00:51 -0500 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: Your message of "Fri, 10 Dec 1999 15:52:26 +0100." <19991210145227.01F99370CF2@snelboot.oratrix.nl> References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> Message-ID: <199912101500.KAA07863@eric.cnri.reston.va.us> Again, the zip format does not have this problem. Some zip tools may -- then we don't use those. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Fri Dec 10 15:40:21 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 10 Dec 1999 10:40:21 -0500 (EST) Subject: [Python-Dev] Missing POSIX functions: the list In-Reply-To: References: <14416.11676.888918.511932@weyr.cnri.reston.va.us> Message-ID: <14417.7909.511437.230915@weyr.cnri.reston.va.us> Greg Stein writes: > No need to do a new type. Just wrap the DIR* into a PyCObject. Add a magic > number if you're worried about mixing CObjects. That's certainly one option, but I would have made readdir(), seekdir(), rewinddir() and closedir() into the methods read(), seek(), rewind() and close(). So it's a question of what interface you prefer; functions with magically interpreted token parameters (kind of like file descriptors, hey!), or something that is more recognizably object-oriented. I know my preference. ;-) -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives From mal@lemburg.com Fri Dec 10 15:55:02 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 10 Dec 1999 16:55:02 +0100 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> Message-ID: <38512256.F9287E24@lemburg.com> Jack Jansen wrote: > > > Jack Jansen asks: > > > > > Is it possible nowadays to have two files with the same name but > > > different paths (i.e. foo/bar.py and foo/spam/bar.py) in the same > > > archive? > > > > Depends on how you do it. > > Apparently I mis-phrased my question, I'll try again. > > When people suggested to use zip format as the standard Python archive format > I was a bit worried, becuase I've had it happen to me various times that I was > unable to create a ZIP archive with two files with the same name but different > paths (i.e. create an archive of a directory that contains both a foo/bar.py > and a foo/spam/bar.py). > > So, my question was: has this happened to me because the winzip I used was > braindead, or is there possibly a problem with the ZIP file format that > disallows two files with the same name in one archive? Most zip programs I've > seen also seem to present filenames as the primary metaphore, with full > pathnames somewhat "tacked on". > > If the latter is the case I wonder whether zip is the right format to use... Hmm, I've been doing the above for years now... never had a problem with it (I use Info-ZIPs tools, BTW), e.g. /home/lemburg> unzip -l projects/distribution/mxODBC-1.1.1.zip Archive: projects/distribution/mxODBC-1.1.1.zip Length Date Time Name -------- ---- ---- ---- 131316 06-09-99 14:10 ODBC/EasySoft/mxODBC.c 131316 06-09-99 14:10 ODBC/Informix/mxODBC.c ... Would be cool if I could use my packages as ZIP files :-) So here's another vote for using the ZIP format. BTW, wouldn't it make sense to include the zlib code in the core distribution much like the pcre stuff is now ? 
AFAIK, it is public domain and including it would remedy many of the compatibility issues with the different zlib versions around. Ok, that was my wish for Xmas :-) The rest is up to Mr. Clause... -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 21 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido@CNRI.Reston.VA.US Fri Dec 10 16:04:24 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 10 Dec 1999 11:04:24 -0500 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: Your message of "Fri, 10 Dec 1999 16:55:02 +0100." <38512256.F9287E24@lemburg.com> References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> Message-ID: <199912101604.LAA14100@eric.cnri.reston.va.us> > BTW, wouldn't it make sense to include the zlib code > in the core distribution much like the pcre stuff is now ? > AFAIK, it is public domain and including it would remedy many of the > compatibility issues with the different zlib versions around. What compatibility issues? Note that the Win32 distri already comes with zlib statically linked into zlib.pyd. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Fri Dec 10 16:15:48 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 10 Dec 1999 17:15:48 +0100 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> Message-ID: <38512734.CF6E4489@lemburg.com> Guido van Rossum wrote: > > > BTW, wouldn't it make sense to include the zlib code > > in the core distribution much like the pcre stuff is now ? > > AFAIK, it is public domain and including it would remedy many of the > > compatibility issues with the different zlib versions around. > > What compatibility issues? 
Note that the Win32 distri already comes > with zlib statically linked into zlib.pyd. There were issues with zlib 1.0.4 and later ones. Also, many Linux distributions don't have the zlib header files installed. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 21 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido@CNRI.Reston.VA.US Fri Dec 10 16:19:47 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 10 Dec 1999 11:19:47 -0500 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: Your message of "Fri, 10 Dec 1999 17:15:48 +0100." <38512734.CF6E4489@lemburg.com> References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> Message-ID: <199912101619.LAA14174@eric.cnri.reston.va.us> > There were issues with zlib 1.0.4 and later ones. Also, many > Linux distributions don't have the zlib header files installed. Hm. I don't recall having any problems reported to me. I'd rather not include the entire zlib distri in the Python distri -- zlib is rather big. Adding only the Unix source would be cheating. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Fri Dec 10 16:25:23 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 10 Dec 1999 11:25:23 -0500 Subject: [Python-Dev] dbm clone with serious specs wanted Message-ID: <199912101625.LAA14216@eric.cnri.reston.va.us> Someone has asked me for a dbm clone that can store 16M keys of 350 bytes each, and runs on Linux, HPUX, and NT. That's 5.6 Gigabyte in keys alone! I presume most classic approaches won't cut it since total file size is typicall limited by the seek system call, internal data structures and/or file index format to 2Gb (signed longs) or 4Gb (unsigned longs). Does anyone have an idea where to start looking? 
Would a Python extension already exist? --Guido van Rossum (home page: http://www.python.org/~guido/) From petrilli@amber.org Fri Dec 10 16:29:27 1999 From: petrilli@amber.org (Christopher Petrilli) Date: Fri, 10 Dec 1999 11:29:27 -0500 Subject: [Python-Dev] dbm clone with serious specs wanted In-Reply-To: <199912101625.LAA14216@eric.cnri.reston.va.us>; from guido@CNRI.Reston.VA.US on Fri, Dec 10, 1999 at 11:25:23AM -0500 References: <199912101625.LAA14216@eric.cnri.reston.va.us> Message-ID: <19991210112927.A14102@trump.amber.org> Guido van Rossum [guido@CNRI.Reston.VA.US] wrote: > Someone has asked me for a dbm clone that can store 16M keys of 350 > bytes each, and runs on Linux, HPUX, and NT. That's 5.6 Gigabyte in > keys alone! I presume most classic approaches won't cut it since > total file size is typicall limited by the seek system call, internal > data structures and/or file index format to 2Gb (signed longs) or 4Gb > (unsigned longs). > > Does anyone have an idea where to start looking? Would a Python > extension already exist? Assuming you mean an interface to a ddbm-style situation, you could easily use berkeley DB, I belive it is limited in the 4TB range... Chris -- | Christopher Petrilli | petrilli@amber.org From mal@lemburg.com Fri Dec 10 16:26:10 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 10 Dec 1999 17:26:10 +0100 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> Message-ID: <385129A2.6FAF4E81@lemburg.com> Guido van Rossum wrote: > > > There were issues with zlib 1.0.4 and later ones. Also, many > > Linux distributions don't have the zlib header files installed. > > Hm. I don't recall having any problems reported to me. 
I'd rather > not include the entire zlib distri in the Python distri -- zlib > is rather big. Adding only the Unix source would be cheating. How about only adding those parts which would be needed to at least deflate the ZIP archive contents ? If the ZIP archive format becomes the standard for Python, we'd have to ensure that all Python users can read them. Well, at least that's what I would expect from a standard format :-) -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 21 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido@CNRI.Reston.VA.US Fri Dec 10 16:29:36 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 10 Dec 1999 11:29:36 -0500 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: Your message of "Fri, 10 Dec 1999 17:26:10 +0100." <385129A2.6FAF4E81@lemburg.com> References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <385129A2.6FAF4E81@lemburg.com> Message-ID: <199912101629.LAA14274@eric.cnri.reston.va.us> > How about only adding those parts which would be needed to > at least deflate the ZIP archive contents ? Ditto -- still lots of portability issues I bet. > If the ZIP archive format becomes the standard for Python, we'd > have to ensure that all Python users can read them. Well, at > least that's what I would expect from a standard format :-) There's a simple solution: don't use compression. With current disk prices it's really not worth it. Let the installer do the decompression (installers travel across networks where compression *is* worth it). --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin@mems-exchange.org Fri Dec 10 16:34:09 1999 From: akuchlin@mems-exchange.org (Andrew M. 
Kuchling) Date: Fri, 10 Dec 1999 11:34:09 -0500 (EST) Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: <38512734.CF6E4489@lemburg.com> References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> Message-ID: <14417.11137.562474.99270@amarok.cnri.reston.va.us> M.-A. Lemburg writes: >There were issues with zlib 1.0.4 and later ones. Also, many >Linux distributions don't have the zlib header files installed. For example, on RH6.0, zlib.h and zlib.a are in zlib-devel.XXX.rpm, and zlib.XXX.rpm only contains libz.so. On the other hand, anyone who's compiling Python should really have the various -devel RPMs installed. I'd argue against including it, because it might cause odd versioning problems. For example, what if I have PIL compiled against zlib1.1.2 (zlib is used for writing PNGs) and the Python binary includes zlib1.1.3? There might be hard-to-debug problems caused by calling the wrong symbol. PCRE is a special case, because we've actually hacked the code a lot; it's not the PCRE code as Philip Hazel distributes it. Just received Guido's email suggesting skipping compression in archives; not a bad idea. You'd use less CPU, but might do more I/O because you're reading more sectors off disk. There probably isn't much need for compression when the archive is on-disk; Java needed it because of applets. -- A.M. Kuchling http://starship.python.net/crew/amk/ The NSA response was, "Well, that was interesting, but there aren't any ciphers like that." 
-- Gus Simmons, "The History of Subliminal Channels" From petrilli@amber.org Fri Dec 10 16:39:44 1999 From: petrilli@amber.org (Christopher Petrilli) Date: Fri, 10 Dec 1999 11:39:44 -0500 Subject: [Python-Dev] dbm clone with serious specs wanted In-Reply-To: <19991210112927.A14102@trump.amber.org>; from petrilli@amber.org on Fri, Dec 10, 1999 at 11:29:27AM -0500 References: <199912101625.LAA14216@eric.cnri.reston.va.us> <19991210112927.A14102@trump.amber.org> Message-ID: <19991210113944.B14102@trump.amber.org> Christopher Petrilli [petrilli@amber.org] wrote: > Guido van Rossum [guido@CNRI.Reston.VA.US] wrote: > > Does anyone have an idea where to start looking? Would a Python > > extension already exist? > > Assuming you mean an interface to a ddbm-style situation, you could easily > use berkeley DB, I belive it is limited in the 4TB range... I just did some checking... first Robin Dunn has an interface, but it's not currently compatible with BerkeleyDB 3.x, which just came out... it shouldn't be hard to retrofit. Anyway, the limits are based on page size... 512b page: 2TB 64K page: 256TB It uses 32bit numbers for pages, so I assume that is also a reflection of the number of keys allowed... given I believe one key must use a minimum of one page. I know that I've pushed earlier releases to around 50Gb without trouble, but you might see issues related to the number of keys. I'd ask Sleepycat directly, as they're amazingly responsive. Chris -- | Christopher Petrilli | petrilli@amber.org From mal@lemburg.com Fri Dec 10 16:37:30 1999 From: mal@lemburg.com (M.-A.
Lemburg) Date: Fri, 10 Dec 1999 17:37:30 +0100 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <385129A2.6FAF4E81@lemburg.com> <199912101629.LAA14274@eric.cnri.reston.va.us> Message-ID: <38512C4A.ADB63C2B@lemburg.com> Guido van Rossum wrote: > > > How about only adding those parts which would be needed to > > at least deflate the ZIP archive contents ? > > Ditto -- still lots of portability issues I bet. Hmm, not sure: zlib is pretty portable. Its the interface changes that can break code, not so much the zlib portability. > > If the ZIP archive format becomes the standard for Python, we'd > > have to ensure that all Python users can read them. Well, at > > least that's what I would expect from a standard format :-) > > There's a simple solution: don't use compression. With current disk > prices it's really not worth it. Let the installer do the > decompression (installers travel across networks where compression > *is* worth it). That's a possibility, right. It would still let us use the many ZIP tools while not adding complexity to the core. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 21 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal@lemburg.com Fri Dec 10 16:43:11 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 10 Dec 1999 17:43:11 +0100 Subject: [Python-Dev] dbm clone with serious specs wanted References: <199912101625.LAA14216@eric.cnri.reston.va.us> Message-ID: <38512D9F.2AE9DC8B@lemburg.com> Guido van Rossum wrote: > > Someone has asked me for a dbm clone that can store 16M keys of 350 > bytes each, and runs on Linux, HPUX, and NT. That's 5.6 Gigabyte in > keys alone! 
I presume most classic approaches won't cut it since > total file size is typicall limited by the seek system call, internal > data structures and/or file index format to 2Gb (signed longs) or 4Gb > (unsigned longs). > > Does anyone have an idea where to start looking? Would a Python > extension already exist? I'd suggest using a dbm style wrapper around the DB-API and then trying out the many cross-platform databases. IBM DB2 comes to mind... it can certainly handle these sizes given the right hardware. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 21 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From fdrake@acm.org Fri Dec 10 17:35:01 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 10 Dec 1999 12:35:01 -0500 (EST) Subject: [Python-Dev] Thankyou for fsync :) In-Reply-To: <199912100203.VAA07410@eric.cnri.reston.va.us> References: <38503BDC.CB91FB29@digicool.com> <199912100203.VAA07410@eric.cnri.reston.va.us> Message-ID: <14417.14789.306365.439782@weyr.cnri.reston.va.us> Guido van Rossum writes: > Since we seem to be on an adding spree, I don't see why not -- as long > as POSIX keeps it available :) fsync() isn't listed in O'Reilly's POSIX book, so it's probably not in the POSIX spec. Neither is the tempnam() function I added in yesterdays spree, though tmpfile() and tmpnam() are. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From jim@digicool.com Fri Dec 10 18:37:53 1999 From: jim@digicool.com (Jim Fulton) Date: Fri, 10 Dec 1999 18:37:53 +0000 Subject: [Python-Dev] Thankyou for fsync :) References: <38503BDC.CB91FB29@digicool.com> <199912100203.VAA07410@eric.cnri.reston.va.us> <14417.14789.306365.439782@weyr.cnri.reston.va.us> Message-ID: <38514881.5C124E36@digicool.com> "Fred L. Drake, Jr." 
wrote: > > Guido van Rossum writes: > > Since we seem to be on an adding spree, I don't see why not -- as long > > as POSIX keeps it available :) > > fsync() isn't listed in O'Reilly's POSIX book, so it's probably not > in the POSIX spec. It's not, it's in XPG3 (sp?), but I wasn't going to bring that up. ;) I'd still like it to stay, where available. :) Jim -- Jim Fulton mailto:jim@digicool.com Technical Director (888) 344-4332 Python Powered! Digital Creations http://www.digicool.com http://www.python.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From fdrake@acm.org Fri Dec 10 18:36:44 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 10 Dec 1999 13:36:44 -0500 (EST) Subject: [Python-Dev] Thankyou for fsync :) In-Reply-To: <38514881.5C124E36@digicool.com> References: <38503BDC.CB91FB29@digicool.com> <199912100203.VAA07410@eric.cnri.reston.va.us> <14417.14789.306365.439782@weyr.cnri.reston.va.us> <38514881.5C124E36@digicool.com> Message-ID: <14417.18492.932392.608912@weyr.cnri.reston.va.us> Jim Fulton writes: > It's not, it's in XPG3 (sp?), but I wasn't going to bring that up. ;) I don't have that one, but I certainly don't have any plans on ripping out fsync(). Not today, at any rate. ;) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From jim@interet.com Fri Dec 10 18:37:50 1999 From: jim@interet.com (James C. Ahlstrom) Date: Fri, 10 Dec 1999 13:37:50 -0500 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) References: <19991210135111.2F83C370CF2@snelboot.oratrix.nl> Message-ID: <3851487E.F610BE17@interet.com> Jack Jansen wrote: > > Is it possible nowadays to have two files with the same name but different > paths (i.e. foo/bar.py and foo/spam/bar.py) in the same archive?
Yes, I just made one with WinZip. JimA From gmcm@hypernet.com Fri Dec 10 18:41:56 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Fri, 10 Dec 1999 13:41:56 -0500 Subject: [Python-Dev] Thankyou for fsync :) In-Reply-To: <38514881.5C124E36@digicool.com> Message-ID: <1267271840-1299809@hypernet.com> Fred L. Drake, Jr. wrote: > > Guido van Rossum writes: > > Since we seem to be on an adding spree, I don't see why not > > -- as long as POSIX keeps it available :) > > fsync() isn't listed in O'Reilly's POSIX book, so it's > probably not > in the POSIX spec. > It's in the other O'Reilly POSIX book, p 348 of POSIX.4. - Gordon From fdrake@acm.org Fri Dec 10 18:43:56 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 10 Dec 1999 13:43:56 -0500 (EST) Subject: [Python-Dev] Thankyou for fsync :) In-Reply-To: <1267271840-1299809@hypernet.com> References: <38514881.5C124E36@digicool.com> <1267271840-1299809@hypernet.com> Message-ID: <14417.18924.461115.906914@weyr.cnri.reston.va.us> Gordon McMillan writes: > It's in the other O'Reilly POSIX book, p 348 of POSIX.4. Ah, I don't have that either. I thought POSIX.4 was real-time stuff. (If anyone wants to send a copy along, I'd be glad to consider adding reasonable interfaces for Python. ;) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From jim@interet.com Fri Dec 10 18:43:18 1999 From: jim@interet.com (James C. Ahlstrom) Date: Fri, 10 Dec 1999 13:43:18 -0500 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> Message-ID: <385149C6.DF942F36@interet.com> Jack Jansen wrote: > When people suggested to use zip format as the standard Python archive format > I was a bit worried, becuase I've had it happen to me various times that I was > unable to create a ZIP archive with two files with the same name but different > paths (i.e. 
create an archive of a directory that contains both a foo/bar.py > and a foo/spam/bar.py). No problem. But most zip tools will create an archive with either no path (file name is "bar.py") or full path (filename "foo/bar.py"). If the paths are different, OK; not sure about duplicate bare names. The difference is an option and has nothing to do with how the file name is specified to the utility. JimA From jim@interet.com Fri Dec 10 18:48:47 1999 From: jim@interet.com (James C. Ahlstrom) Date: Fri, 10 Dec 1999 13:48:47 -0500 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <385129A2.6FAF4E81@lemburg.com> Message-ID: <38514B0F.84A546C6@interet.com> "M.-A. Lemburg" wrote: > How about only adding those parts which would be needed to > at least deflate the ZIP archive contents ? > > If the ZIP archive format becomes the standard for Python, we'd > have to ensure that all Python users can read them. Well, at > least that's what I would expect from a standard format :-) I think that for now we will need to create archives with compression method zero: no compression. That is a valid compression method all ZIP utilities support. The point is that zlib just isn't part of Python. Jim From jcw@equi4.com Fri Dec 10 18:57:00 1999 From: jcw@equi4.com (Jean-Claude Wippler) Date: Fri, 10 Dec 1999 19:57:00 +0100 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <385129A2.6FAF4E81@lemburg.com> <38514B0F.84A546C6@interet.com> Message-ID: <38514CFC.47C8A8E0@equi4.com> "James C. Ahlstrom" wrote: [...]
> I think that for now we will need to create archives with > compression method zero: no compression. That is a valid > compression method all ZIP utilities support. Sounds good. This is also exactly how Java started out with jar. -jcw From gmcm@hypernet.com Fri Dec 10 19:06:59 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Fri, 10 Dec 1999 14:06:59 -0500 Subject: [Python-Dev] Thankyou for fsync :) In-Reply-To: <14417.18924.461115.906914@weyr.cnri.reston.va.us> References: <1267271840-1299809@hypernet.com> Message-ID: <1267270337-1390160@hypernet.com> Fred wrote: > Gordon McMillan writes: > > It's in the other O'Reilly POSIX book, p 348 of POSIX.4. > > Ah, I don't have that either. I thought POSIX.4 was real-time > stuff. Well, it says it is, but having done some stuff with automated warehouses, I'm always amazed at how people will use the term "real-time". I'd say "pretty likely to be responsive" ;-). > (If anyone wants to send a copy along, I'd be glad to consider > adding reasonable interfaces for Python. ;) Only around 70 documented functions, but many of them appear to be tweaks, or redocumenting stuff in view of new kernel behaviors. - Gordon From fdrake@acm.org Fri Dec 10 19:18:16 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 10 Dec 1999 14:18:16 -0500 (EST) Subject: [Python-Dev] Thankyou for fsync :) In-Reply-To: <1267270337-1390160@hypernet.com> References: <1267271840-1299809@hypernet.com> <1267270337-1390160@hypernet.com> Message-ID: <14417.20984.151867.630871@weyr.cnri.reston.va.us> Gordon McMillan writes: > Well, it says it is, but having done some stuff with automated > warehouses, I'm always amazed at how people will use the > term "real-time". I'd say "pretty likely to be responsive" ;-). Oh, a manager's interpretation of real-time: "I want this by close of business next Wednesday!" > Only around 70 documented functions, but many of them > appear to be tweaks, or redocumenting stuff in view of new > kernel behaviors. 
Anything that should be added anywhere? Failing all else, I can probably read the man pages if I know what to look for. ;-) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From fdrake@acm.org Fri Dec 10 21:40:29 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 10 Dec 1999 16:40:29 -0500 (EST) Subject: [Python-Dev] Missing POSIX functions: the list In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us> References: <199912091632.LAA09236@amarok.cnri.reston.va.us> Message-ID: <14417.29517.238124.767279@weyr.cnri.reston.va.us> Andrew M. Kuchling writes: > fpathconf(fd, name) -- Get configuration limit for a file ... > pathconf(path, name) -- Gets config variables for a path ... > sysconf(int name) -- Gets system configuration information > -- would need constants from unistd.h I'm almost done with these, and also confstr (from POSIX.2). I don't have time to finish them today; I'll check them in next week. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From skip@mojam.com (Skip Montanaro) Fri Dec 10 23:20:21 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 10 Dec 1999 17:20:21 -0600 (CST) Subject: [Python-Dev] Thankyou for fsync :) In-Reply-To: <14417.18924.461115.906914@weyr.cnri.reston.va.us> References: <38514881.5C124E36@digicool.com> <1267271840-1299809@hypernet.com> <14417.18924.461115.906914@weyr.cnri.reston.va.us> Message-ID: <14417.35509.284749.924066@dolphin.mojam.com> Fred> I thought POSIX.4 was real-time stuff. This all seems to be happening in real-time to me... ;-) Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... 
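A minimal sketch, not taken from the thread, of how the sysconf() binding mentioned above can be queried. It is written against present-day Python 3 on a POSIX system rather than the 1.5.2 code being checked in, and it uses the string-name form that Fred describes later in the thread. Which names are recognized depends on the platform, so the helper (`query_sysconf`, a made-up name) returns None instead of assuming a name exists.

```python
import os


def query_sysconf(name):
    """Return os.sysconf(name), or None if the name is unknown or unsupported."""
    try:
        return os.sysconf(name)
    except (ValueError, OSError):
        # ValueError: the name is not in the module's table.
        # OSError: the name is known but the system rejects the query.
        return None


if __name__ == "__main__":
    # "FOO_BAR" is deliberately bogus to show the failure path.
    for name in ("SC_ARG_MAX", "SC_OPEN_MAX", "FOO_BAR"):
        print(name, query_sysconf(name))
```

Returning None keeps callers free of magic numbers and of platform checks; code that needs to tell "unknown name" from "unsupported here" would catch the two exceptions separately.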
From andy@robanal.demon.co.uk Sat Dec 11 00:11:28 1999 From: andy@robanal.demon.co.uk (Andy Robinson) Date: Sat, 11 Dec 1999 00:11:28 GMT Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: <199912101619.LAA14174@eric.cnri.reston.va.us> References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> Message-ID: <38519531.15439641@post.demon.co.uk> On Fri, 10 Dec 1999 11:19:47 -0500, you wrote: >> There were issues with zlib 1.0.4 and later ones. Also, many >> Linux distributions don't have the zlib header files installed. > >Hm. I don't recall having any problems reported to me. I'd rather >not include the entire zlib distri in the Python distri -- zlib >is rather big. Adding only the Unix source would be cheating. > Minor data point on the importance of zlib. I spent a long time figuring out what Adobe PDF's "flate filter" was before I discovered it was the inverse of "deflate" (yes, there were loud sounds of head-slapping when I clicked) and discovered that zlib.compress() was EXACTLY what you need to create compressed streams in PDF documents. Being a Windows person, I naively assumed zlib was in the standard distribution everywhere, and subsequently discovered Mac and Unix users were not so happy. So if you want to make PDFs, having zlib around is very useful indeed... - Andy From akuchlin@mems-exchange.org Sat Dec 11 00:35:58 1999 From: akuchlin@mems-exchange.org (Andrew M. 
Kuchling) Date: Fri, 10 Dec 1999 19:35:58 -0500 (EST) Subject: [Python-Dev] Enabling more modules by default In-Reply-To: <38519531.15439641@post.demon.co.uk> References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <38519531.15439641@post.demon.co.uk> Message-ID: <14417.40046.850655.491684@amarok.cnri.reston.va.us> Andy Robinson writes: >... So if you want to make PDFs, having zlib >around is very useful indeed... This raises a good point, though I still dislike the idea of including the zlib library. It would be nice if Setup.in would be autogenerated to compile all the modules it can -- bsddb if it finds libdb, zlib if it finds libz.a. I vaguely recall once working on a Python script that would generate a customized Setup.in file, though I can't find it at the moment. Given that someone has already suggested automatically enabling threads on those platforms that support it, why not go all the way? (But a Python script that generates a Setup.in isn't going to work, unless we compile a minipython first and then create a more complete Setup file.) -- A.M. Kuchling http://starship.python.net/crew/amk/ The most merciful thing in the world... is the inability of the human mind to correlate all its contents. -- H.P. 
Lovecraft From petrilli@amber.org Sat Dec 11 05:54:41 1999 From: petrilli@amber.org (Christopher Petrilli) Date: Sat, 11 Dec 1999 00:54:41 -0500 Subject: [Python-Dev] Enabling more modules by default In-Reply-To: <14417.40046.850655.491684@amarok.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Fri, Dec 10, 1999 at 07:35:58PM -0500 References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <38519531.15439641@post.demon.co.uk> <14417.40046.850655.491684@amarok.cnri.reston.va.us> Message-ID: <19991211005441.A20923@trump.amber.org> Andrew M. Kuchling [akuchlin@mems-exchange.org] wrote: > Andy Robinson writes: > >... So if you want to make PDFs, having zlib > >around is very useful indeed... > > This raises a good point, though I still dislike the idea of including > the zlib library. It would be nice if Setup.in would be autogenerated > to compile all the modules it can -- bsddb if it finds libdb, zlib if > it finds libz.a. I vaguely recall once working on a Python script that > would generate a customized Setup.in file, though I can't find it at > the moment. Given that someone has already suggested automatically > enabling threads on those platforms that support it, why not go all > the way? Well, one warning about BSDdb is that it comes in 3 incarnations that all might be -ldb :-): 1.85, 2.x and 3.x, and they are NOT compatible with each other. 1.85 has serious brain damage, and honestly I'd love to see Robin Dunn's 2.x work rolled in to replace it, but not sure how viable that is---people might actually want the 1.85 breakage.
Chris -- | Christopher Petrilli | petrilli@amber.org From gstein@lyra.org Sat Dec 11 11:23:30 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 11 Dec 1999 03:23:30 -0800 (PST) Subject: [Python-Dev] Zip format In-Reply-To: <1267287023-386248@hypernet.com> Message-ID: On Fri, 10 Dec 1999, Gordon McMillan wrote: >... > If the user imports foo.spam.bar, an importer will be asked for: > foo (return foo.__init__) > foo.spam (return foo.bar.__init__) ^^^ foo.spam.__init__ > foo.spam.bar (return foo.spam.bar) The above sequence is what currently happens. > But the API allows lots of variations. This is another possible > interaction: > foo (return None) > foo.__init__ (return foo.__init__) > foo.spam (return None) > foo.bar.__init__ (return foo.bar.__init__) > foo.spam.bar (return foo.spam.bar) The core of imputil has no knowledge of the __init__ thingy. That is specific to the filesystem-based stuff. So in this sense, "possible" means "imputil could be changed to do this". I would argue against the change, however :-) > Or, by looking at different args to get_code, you could look at > the requests as: > foo in context of None > spam in context of foo > bar in context of foo.spam Bing! Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Dec 11 11:26:59 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 11 Dec 1999 03:26:59 -0800 (PST) Subject: [Python-Dev] Zip format In-Reply-To: <14417.11137.562474.99270@amarok.cnri.reston.va.us> Message-ID: On Fri, 10 Dec 1999, Andrew M. Kuchling wrote: > M.-A. Lemburg writes: > >There were issues with zlib 1.0.4 and later ones. Also, many > >Linux distributions don't have the zlib header files installed. > > For example, on RH6.0, zlib.h and zlib.a are in zlib-devel.XXX.rpm, > and zlib.XXX.rpm only contains libz.so. On the other hand, anyone > who's compiling Python should really have the various -devel RPMs Exactly. The distro's *have* the headers -- it all depends on what you installed. 
I happen to have the headers on my system (because I installed zlib-devel, as AMK mentions). > installed. I'd argue against including it, because it might cause odd > versioning problems. For example, what if I have PIL compiled against > zlib1.1.2 (zlib is used for writing PNGs) and the Python binary > includes zlib1.1.3? There might be hard-to-debug problems > caused by calling the wrong symbol. I totally agree. >... > Just received Guido's email suggesting skipping compression in > archives; not a bad idea. You'd use less CPU, but might do > more I/O because you're reading more sectors off disk. There > probably isn't much need for compression when the archive is on-disk; > Java needed it because of applets. There are all kinds of things that we can do here. Consider mmap'ing the archive into a shared memory segment, used by all the Python processes on the system... woo! :-) IMO, the standard distro can use zip files, and just bail if they are compressed, but Python cannot load zlib. Obvious failure with an obvious remedy. No big deal. As Guido also mentions, an installer can just bring along zlib if they want to use a compressed archive. i.e. their choice. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Dec 11 11:33:47 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 11 Dec 1999 03:33:47 -0800 (PST) Subject: [Python-Dev] Missing POSIX functions: the list In-Reply-To: <14417.7909.511437.230915@weyr.cnri.reston.va.us> Message-ID: On Fri, 10 Dec 1999, Fred L. Drake, Jr. wrote: > Greg Stein writes: > > No need to do a new type. Just wrap the DIR* into a PyCObject. Add a magic > > number if you're worried about mixing CObjects. > > That's certainly one option, but I would have made readdir(), > seekdir(), rewinddir() and closedir() into the methods read(), seek(), > rewind() and close(). 
So it's a question of what interface you > prefer; functions with magically interpreted token parameters (kind of > like file descriptors, hey!), or something that is more recognizably > object-oriented. > I know my preference. ;-) Well, I know my preference of those two alternatives, too :-), but if we're going with the Pythonic minimalism, then I'd think you would expose the functions "as close as possible." Would I argue if you went with a method-based approach? No :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From fredrik@pythonware.com Sat Dec 11 13:07:08 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Sat, 11 Dec 1999 14:07:08 +0100 Subject: [Python-Dev] Zip format References: Message-ID: <005f01bf43d8$a22b0af0$f29b12c2@secret.pythonware.com> Greg Stein wrote: > There are all kinds of things that we can do here. Consider mmap'ing the > archive into a shared memory segment, used by all the Python processes on > the system... woo! :-) it doesn't really look like this, but I hope we're defining interfaces here, and not just "one true solution". I'd be very annoyed if it turned out that we couldn't use works' archives with the new standard importer... > As Guido also mentions, an installer can just bring along zlib if they > want to use a compressed archive. i.e. their choice. in the pythonworks universe, the installer and the application is the same thing... From fredrik@pythonware.com Sat Dec 11 13:12:12 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Sat, 11 Dec 1999 14:12:12 +0100 Subject: [Python-Dev] Thankyou for fsync :) References: <38503BDC.CB91FB29@digicool.com><199912100203.VAA07410@eric.cnri.reston.va.us> <14417.14789.306365.439782@weyr.cnri.reston.va.us> Message-ID: <006c01bf43d9$57bc0f90$f29b12c2@secret.pythonware.com> Fred L. Drake, Jr. wrote: > fsync() isn't listed in O'Reilly's POSIX book, so it's probably not > in the POSIX spec. 
Neither is the tempnam() function I added in > yesterdays spree, though tmpfile() and tmpnam() are. instead of guessing, you can get a complete list from: http://www.unix-systems.org/apis.html reading up on the "single unix specification" should also help: http://www.unix-systems.org/online.html (registration required; contains complete man pages for all functions covered by the UNIX95 and UNIX98 specification) From gstein@lyra.org Sat Dec 11 13:10:00 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 11 Dec 1999 05:10:00 -0800 (PST) Subject: [Python-Dev] Zip format In-Reply-To: <005f01bf43d8$a22b0af0$f29b12c2@secret.pythonware.com> Message-ID: On Sat, 11 Dec 1999, Fredrik Lundh wrote: > Greg Stein wrote: > > There are all kinds of things that we can do here. Consider mmap'ing the > > archive into a shared memory segment, used by all the Python processes on > > the system... woo! :-) > > it doesn't really look like this, but I hope we're defining > interfaces here, and not just "one true solution". I'd be Oh, I was just having fun there :-). I don't see "one true solution" at all. Just some standards. > very annoyed if it turned out that we couldn't use works' > archives with the new standard importer... get_code() and its processing is not going anywhere. Some stuff will change under the covers, and we'll be using sys.path (typically) rather than chaining (although chaining will still exist!). I would think that your Importer subclass would be directly usable, but the installation could/would be a bit different. Heck, worst case, nothing is going to invalidate your archive format -- feel free to berate me if I ever break that! Cheers, -g -- Greg Stein, http://www.lyra.org/ From jim@interet.com Mon Dec 13 14:50:11 1999 From: jim@interet.com (James C. 
Ahlstrom) Date: Mon, 13 Dec 1999 09:50:11 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <38510254.ED15D32B@interet.com> Message-ID: <385507A3.9F6AAF0F@interet.com> > Jean-Claude Wippler wrote: > > > Ouch - what's wrong with zip archives? > > > With all due respect - I sincerely hope you will reconsider and alter > > your code to work with zip files. It's probably a small adjustment? OK, I now have a new module "zipfile" which reads and writes ZIP files. It is written in Python and has been tested on Windows and Linux. I tested it with WinZip and found that the files it creates are read OK with WinZip, and WinZip files are read OK with zipfile. So I am withdrawing my Python archive file format, and re-writing all my stuff using zipfile. It should all be done in a week. Basically everything works fine. But there are some problems. Python seems to lack a CRC-32 function, so I wrote one in Python. It is slow. We need to add a CRC-32 function to some Python built-in module that is always present, like md5 or binascii. The zlib module is not necessarily present. I can't seem to get WinZip to record a partial path. That is, I want the ./Lib/test package to have these ZIP paths: test/__init__.pyc test/testall.pyc ... but WinZip creates files with either no path at all or the fully specified path. Am I missing something? Do all other ZIP tools do this too?
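A minimal sketch, not taken from the thread, of the kind of pure-Python CRC-32 described above. It builds the 256-entry table for the reflected polynomial 0xEDB88320 (the one ZIP uses) and then processes one byte per step. The names `make_crc_table` and `crc32_py` are made up for illustration, and the code targets present-day Python 3 rather than the interpreter of the time; `binascii.crc32` is used only as a cross-check and is assumed to be available.

```python
import binascii


def make_crc_table(poly=0xEDB88320):
    """Build the 256-entry table of feedback terms for a reflected CRC-32."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            # Shift right; XOR in the polynomial when a 1 bit falls out.
            if crc & 1:
                crc = (crc >> 1) ^ poly
            else:
                crc >>= 1
        table.append(crc)
    return table


CRC_TABLE = make_crc_table()


def crc32_py(data, crc=0):
    """Table-driven CRC-32 over a bytes object; pass a previous result to continue."""
    crc ^= 0xFFFFFFFF
    for b in data:
        crc = CRC_TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


if __name__ == "__main__":
    sample = b"test/__init__.pyc"
    print(hex(crc32_py(sample)), hex(binascii.crc32(sample)))
```

The per-byte loop in interpreted code is what makes this approach slow compared with a C implementation, which is the motivation for putting the function in a built-in module.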
JimA From: Guido van Rossum Date: Mon, 13 Dec 1999 10:22:12 -0500 To: "James C. Ahlstrom" Subject: Re: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-reply-to: Your message of "Mon, 13 Dec 1999 09:50:11 EST." <385507A3.9F6AAF0F@interet.com> References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <38510254.ED15D32B@interet.com> <385507A3.9F6AAF0F@interet.com> Message-Id: <199912131522.KAA18858@eric.cnri.reston.va.us> > OK, I now have a new module "zipfile" which reads and > writes ZIP files. It is written in Python and has been tested > on Windows and Linux.
I tested it with WinZip and found that > the files it creates are read OK with WinZip, and WinZip > files are read OK with zipfile. So I am withdrawing my > Python archive file format, and re-writing all my stuff > using zipfile. It should all be done in a week. Ah, good! (This saves me the trouble of cleaning up our own zip code :-) > Basically everything works fine. But there are some problems. > > Python seems to lack a CRC-32 function, so I wrote one > in Python. It is slow. We need to add a CRC-32 function > to some Python built-in module that is always present, like > md5 or binascii. The zlib module is not necessarily present. > > I can't seem to get WinZip to record a partial path. That is, > I want the ./Lib/test package to have these ZIP paths: > test/__init__.pyc > test/testall.pyc > ... > but WinZip creates files with either no path at all or the > fully specified path. Am I missing something? Do all > other ZIP tools do this too? Unclick the "Save Extra Folder Info" and then drag the *parent* folder into the archive. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Mon Dec 13 17:00:26 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 13 Dec 1999 12:00:26 -0500 (EST) Subject: [Python-Dev] confstr(), fpathconf(), pathconf(), sysconf() Message-ID: <14421.9770.623399.673010@weyr.cnri.reston.va.us> I've just checked in bindings for these POSIX.1 and POSIX.2 functions, and thought I'd explain the interfaces for those who don't want to read the diffs. ;) These functions expect a "name" parameter (that's how it's described in the man pages and the O'Reilly book). The value for "name" is an integer that's defined in the system headers. The constants all have the form _XX_SOME_NAME where XX is PC for fpathconf()- and pathconf()-related names, SC for sysconf()-related names, and CS for confstr()-related names.
Some names are defined by the standards, but additional names are defined by implementations (there are a *lot* of sysconf() names under Solaris!). We don't want to expose enormous numbers of constants in the module's interface, however, as there are already a lot of names in the posix module. That would also slow down module initialization. We also don't want to force callers to use magic numbers in code that uses these functions, especially since the values may be system-specific. The best way to call these functions, then, is to use a *string* that corresponds to the name of the C #define symbol with the leading underscore stripped off. For example, to get the length of the arguments to exec(), you could say: num_args = os.sysconf("SC_ARG_MAX") The string will be mapped to the appropriate numeric value defined in an internal table. If the name isn't defined for the platform, a ValueError will be raised. >>> num_args = os.sysconf("FOO_BAR") Traceback (innermost last): File "<stdin>", line 1, in ? ValueError: unrecognized configuration name To allow retrieval for platform-dependent configuration information, integers can also be passed in. On Solaris, this is equivalent to using "SC_ARG_MAX": num_args = os.sysconf(1) (Ignoring the portability and readability issues, ha!) There are three separate tables used for this; one for confstr(), one for sysconf(), and one shared by fpathconf() and pathconf(). The names used to build the tables come from Linux and Solaris; we can add other names as needed. To add names, I'd need the names to add and how to test for their existence at compile time (#ifdef, etc.). -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From fdrake@acm.org Mon Dec 13 18:35:49 1999 From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 13 Dec 1999 13:35:49 -0500 (EST) Subject: [Python-Dev] CVS: python/dist/src/Modules posixmodule.c,2.116,2.117 In-Reply-To: References: <199912131637.LAA17318@weyr.cnri.reston.va.us> Message-ID: <14421.15493.28263.387680@weyr.cnri.reston.va.us> Greg Stein writes: > I'm not very familiar with these APIs, but should you let go of the > interpreter lock when you call them? > (and for the other new funcs) None of these should be doing any I/O as far as I can determine. Whenever I get to getlogin() (which AMK & I decided should be included, based on the specs that /F pointed us to), I will release the interpreter lock for the getlogin_r() variant. I'm not sure I should release it for the non-reentrant getlogin(), however; the specification for getlogin*() pretty much requires that it read from utmp. ;( -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From gstein@lyra.org Mon Dec 13 20:31:22 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 13 Dec 1999 12:31:22 -0800 (PST) Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <385507A3.9F6AAF0F@interet.com> Message-ID: On Mon, 13 Dec 1999, James C. Ahlstrom wrote: >... > OK, I now have a new module "zipfile" which reads and > writes ZIP files. It is written in Python and has been tested > on Windows and Linux. I tested it with WinZip and found that > the files it creates are read OK with WinZip, and WinZip > files are read OK with zipfile. So I am withdrawing my > Python archive file format, and re-writing all my stuff > using zipfile. It should all be done in a week. Can you post zipfile.py so that people can start reviewing that? >... > Python seems to lack a CRC-32 function, so I wrote one > in Python. It is slow. We need to add a CRC-32 function > to some Python built-in module that is always present, like > md5 or binascii. The zlib module is not necessarily present.
See zlib.crc32(). This is interesting, of course, because we have previously stated that zlib (and its compression) is optional. But if we need the CRC-32 function... hehe... Cheers, -g -- Greg Stein, http://www.lyra.org/ From tim_one@email.msn.com Mon Dec 13 22:11:33 1999 From: tim_one@email.msn.com (Tim Peters) Date: Mon, 13 Dec 1999 17:11:33 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <385507A3.9F6AAF0F@interet.com> Message-ID: <000401bf45b7$04edfaa0$96a2143f@tim> [James C. Ahlstrom] > ... > Python seems to lack a CRC-32 function, so I wrote one > in Python. It is slow. We need to add a CRC-32 function > to some Python built-in module that is always present, like > md5 or binascii. The zlib module is not necessarily present. Unfortunately, there are many different CRC functions in common use. None belong in md5; if the intent is to support just zip's version, adding a (say) zipcrc32 function to binascii would be ok; if we expect to support others as well, a new parameterized crc module would be in order. > I can't seem to get WinZip to record a partial path. That is, > I want the ./Lib/test package to have these ZIP paths: > test/__init__.pyc > test/testall.pyc > ... > but WinZip creates files with either no path at all or the > fully specified path. Am I missing something? Do all > other ZIP tools do this too? No, it's a clumsiness unique to WinZip (damn GUIs <0.9 wink>). In the Add dialog box, you need to cd to the *Lib* directory, check the "Save extra folder info" box, and then, e.g., 1. Put test\*.pyc in the Add Files line, and click Add With Wildcards. Then all test\*.pyc files will be added, with paths test/__init__.pyc etc. or 2. Put "test\__init__.pyc" "test\testall.pyc" (including the quotes!) in the Add Files line, and click Add. Since #2 can be unbearable, other useful strategies include: 3. Use #1 (e.g. with dir\*.*) then delete the files you didn't really want. 4.
Use #1 repeatedly, cleverly using a number of wildcard patterns that cover the files of interest. 5. Mixtures of #3 and #4. 6. Use a command-line zip tool instead (e.g., pkzip; I think WinZip has an "experimental" cmdline add-on too, but haven't tried it). From jim@interet.com Tue Dec 14 13:13:03 1999 From: jim@interet.com (James C. Ahlstrom) Date: Tue, 14 Dec 1999 08:13:03 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: Message-ID: <3856425F.8C5E7A42@interet.com> Greg Stein wrote: > > Can you post zipfile.py so that people can start reviewing that? Yes, it will be available by next Monday. I just want to get it really working and pretty, and with documentation. JimA From jim@interet.com Tue Dec 14 13:26:50 1999 From: jim@interet.com (James C. Ahlstrom) Date: Tue, 14 Dec 1999 08:26:50 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000401bf45b7$04edfaa0$96a2143f@tim> Message-ID: <3856459A.BF5A798A@interet.com> Tim Peters wrote: > > [James C. Ahlstrom] > > ... > > Python seems to lack a CRC-32 function, so I wrote one > > Unfortunately, there are many different CRC functions in common use. None > belong in md5; if the intent is to support just zip's version, adding a > (say) zipcrc32 function to binascii would be ok; if we expect to support > others as well, a new parameterized crc module would be in order. OK, a CRC-32 in binascii it is. The CRC-32 I have comes with these comments which seem to indicate it is a more "official standard" CRC-32 than average: # * Crc - 32 BIT ANSI X3.66 CRC checksum files #*********************************************************************\ #* *| #* Demonstration program to compute the 32-bit CRC used as the frame *| #* check sequence in ADCCP (ANSI X3.66, also known as FIPS PUB 71 *| #* and FED-STD-1003, the U.S. versions of CCITT's X.25 link-level *| #* protocol). The 32-bit FCS was added via the Federal Register, *| #* 1 June 1982, p.23798.
I presume but don't know for certain that *| #* this polynomial is or will be included in CCITT V.41, which *| #* defines the 16-bit CRC (often called CRC-CCITT) polynomial. FIPS *| #* PUB 78 says that the 32-bit FCS reduces otherwise undetected *| #* errors by a factor of 10^-5 over 16-bit FCS. *| #* *| #********************************************************************* #* Copyright (C) 1986 Gary S. Brown. You may use this program, or #* code or tables extracted from it, as desired without restriction. I can submit this as a patch to binascii, or if the Copyright bothers anyone, maybe it is better for Guido to use his CRC-32 from his ZIP code. Preference? > > I can't seem to get WinZip to record a partial path. That is, > > dialog box, you need to cd to the *Lib* directory, check the "Save extra > folder info" box, and then, e.g., Thanks. I knew there had to be some magic incantation to do it. > 6. Use a command-line zip tool instead (e.g., pkzip; I think WinZip has > an "experimental" cmdline add-on too, but haven't tried it). Actually pkzip 2.04g doesn't work because it writes names in upper case and is limited to 8.3 names (I think). My zipfile.py can be used as a basis for a command line tool. Actually I use makefiles with embedded Python programs and find this easier than command line tools. JimA From guido@CNRI.Reston.VA.US Tue Dec 14 14:53:04 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 09:53:04 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Tue, 14 Dec 1999 08:26:50 EST." <3856459A.BF5A798A@interet.com> References: <000401bf45b7$04edfaa0$96a2143f@tim> <3856459A.BF5A798A@interet.com> Message-ID: <199912141453.JAA23429@eric.cnri.reston.va.us> > OK, a CRC-32 in binascii it is.
The CRC-32 I > have comes with these comments which seem to indicate it is a > more "official standard" CRC-32 than average: > > # * Crc - 32 BIT ANSI X3.66 CRC checksum files > #*********************************************************************\ > #* *| > #* Demonstration program to compute the 32-bit CRC used as the frame *| > #* check sequence in ADCCP (ANSI X3.66, also known as FIPS PUB 71 *| > #* and FED-STD-1003, the U.S. versions of CCITT's X.25 link-level *| > #* protocol). The 32-bit FCS was added via the Federal Register, *| > #* 1 June 1982, p.23798. I presume but don't know for certain that *| > #* this polynomial is or will be included in CCITT V.41, which *| > #* defines the 16-bit CRC (often called CRC-CCITT) polynomial. FIPS *| > #* PUB 78 says that the 32-bit FCS reduces otherwise undetected *| > #* errors by a factor of 10^-5 over 16-bit FCS. *| > #* *| > #********************************************************************* > #* Copyright (C) 1986 Gary S. Brown. You may use this program, or > #* code or tables extracted from it, as desired without restriction. > > I can submit this as a patch to binascii, or if the Copyright bothers > anyone, maybe it is better for Guido to use his CRC-32 from his ZIP > code. Preference? I looked, but "my" crc32 in the zlib module (which was actually contributed by Andrew Kuchling) is just a wrapper around the crc32 function in zlib, which is copyrighted by Mark Adler and follows the zlib rules. I propose to use Gary Brown's code. I'll defend this to CNRI's lawyers if need be. Jim, have you checked that this is the right CRC to use for zip's CRC? (This in the light of Tim's assertion that there are many CRCs around.) --Guido van Rossum (home page: http://www.python.org/~guido/) From jim@interet.com Tue Dec 14 15:22:56 1999 From: jim@interet.com (James C. 
Ahlstrom) Date: Tue, 14 Dec 1999 10:22:56 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000401bf45b7$04edfaa0$96a2143f@tim> <3856459A.BF5A798A@interet.com> <199912141453.JAA23429@eric.cnri.reston.va.us> Message-ID: <385660D0.C6C0C7B9@interet.com> Guido van Rossum wrote: > I propose to use Gary Brown's code. I'll defend this to CNRI's > lawyers if need be. > > Jim, have you checked that this is the right CRC to use for zip's CRC? > (This in the light of Tim's assertion that there are many CRCs around.) The CRC it calculates agrees with the CRC of WinZip for all files I have tried. The original Gary Brown code was much longer and included file reading. Here is the shortened version: JimA # * Crc - 32 BIT ANSI X3.66 CRC checksum files #*********************************************************************\ #* *| #* Demonstration program to compute the 32-bit CRC used as the frame *| #* check sequence in ADCCP (ANSI X3.66, also known as FIPS PUB 71 *| #* and FED-STD-1003, the U.S. versions of CCITT's X.25 link-level *| #* protocol). The 32-bit FCS was added via the Federal Register, *| #* 1 June 1982, p.23798. I presume but don't know for certain that *| #* this polynomial is or will be included in CCITT V.41, which *| #* defines the 16-bit CRC (often called CRC-CCITT) polynomial. FIPS *| #* PUB 78 says that the 32-bit FCS reduces otherwise undetected *| #* errors by a factor of 10^-5 over 16-bit FCS. *| #* *| #********************************************************************* # #* Copyright (C) 1986 Gary S. Brown. You may use this program, or #* code or tables extracted from it, as desired without restriction. # First, the polynomial itself and its table of feedback terms. The # polynomial is # X^32+X^26+X^23+X^22+X^16+X^12+X^11+X^10+X^8+X^7+X^5+X^4+X^2+X^1+X^0 # Note that we take it "backwards" and put the highest-order term in # the lowest-order bit. The X^32 term is "implied"; the LSB is the # X^31 term, etc. 
The X^0 term (usually shown as "+1") results in # the MSB being 1. # Note that the usual hardware shift register implementation, which # is what we're using (we're merely optimizing it by doing eight-bit # chunks at a time) shifts bits into the lowest-order term. In our # implementation, that means shifting towards the right. Why do we # do it this way? Because the calculated CRC must be transmitted in # order from highest-order term to lowest-order term. UARTs transmit # characters in order from LSB to MSB. By storing the CRC this way, # we hand it to the UART in the order low-byte to high-byte; the UART # sends each low-bit to hight-bit; and the result is transmission bit # by bit from highest- to lowest-order term without requiring any bit # shuffling on our part. Reception works similarly. # The feedback terms table consists of 256, 32-bit entries. Notes: # # 1. The table can be generated at runtime if desired; code to do so # is shown later. It might not be obvious, but the feedback # terms simply represent the results of eight shift/xor opera- # tions for all combinations of data and CRC register values. # # 2. The CRC accumulation logic is the same for all CRC polynomials, # be they sixteen or thirty-two bits wide. You simply choose the # appropriate table. Alternatively, because the table can be # generated at runtime, you can start by generating the table for # the polynomial in question and use exactly the same "updcrc", # if your application needn't simultaneously handle two CRC # polynomials. (Note, however, that XMODEM is strange.) # # 3. For 16-bit CRCs, the table entries need be only 16 bits wide; # of course, 32-bit entries work OK if the high 16 bits are zero. # # 4. The values must be right-shifted by eight bits by the "updcrc" # logic; the shift must be unsigned (bring in zeroes). On some # hardware you could probably optimize the shift in assembler by # using byte-swap instructions. # Converted to Python by James C. 
Ahlstrom crc_32_tab = [ # CRC polynomial 0xedb88320 0x00000000, 0x77073096, 0xee0e612c, 0x990951ba, 0x076dc419, 0x706af48f, 0xe963a535, 0x9e6495a3, 0x0edb8832, 0x79dcb8a4, 0xe0d5e91e, 0x97d2d988, 0x09b64c2b, 0x7eb17cbd, 0xe7b82d07, 0x90bf1d91, 0x1db71064, 0x6ab020f2, 0xf3b97148, 0x84be41de, 0x1adad47d, 0x6ddde4eb, 0xf4d4b551, 0x83d385c7, 0x136c9856, 0x646ba8c0, 0xfd62f97a, 0x8a65c9ec, 0x14015c4f, 0x63066cd9, 0xfa0f3d63, 0x8d080df5, 0x3b6e20c8, 0x4c69105e, 0xd56041e4, 0xa2677172, 0x3c03e4d1, 0x4b04d447, 0xd20d85fd, 0xa50ab56b, 0x35b5a8fa, 0x42b2986c, 0xdbbbc9d6, 0xacbcf940, 0x32d86ce3, 0x45df5c75, 0xdcd60dcf, 0xabd13d59, 0x26d930ac, 0x51de003a, 0xc8d75180, 0xbfd06116, 0x21b4f4b5, 0x56b3c423, 0xcfba9599, 0xb8bda50f, 0x2802b89e, 0x5f058808, 0xc60cd9b2, 0xb10be924, 0x2f6f7c87, 0x58684c11, 0xc1611dab, 0xb6662d3d, 0x76dc4190, 0x01db7106, 0x98d220bc, 0xefd5102a, 0x71b18589, 0x06b6b51f, 0x9fbfe4a5, 0xe8b8d433, 0x7807c9a2, 0x0f00f934, 0x9609a88e, 0xe10e9818, 0x7f6a0dbb, 0x086d3d2d, 0x91646c97, 0xe6635c01, 0x6b6b51f4, 0x1c6c6162, 0x856530d8, 0xf262004e, 0x6c0695ed, 0x1b01a57b, 0x8208f4c1, 0xf50fc457, 0x65b0d9c6, 0x12b7e950, 0x8bbeb8ea, 0xfcb9887c, 0x62dd1ddf, 0x15da2d49, 0x8cd37cf3, 0xfbd44c65, 0x4db26158, 0x3ab551ce, 0xa3bc0074, 0xd4bb30e2, 0x4adfa541, 0x3dd895d7, 0xa4d1c46d, 0xd3d6f4fb, 0x4369e96a, 0x346ed9fc, 0xad678846, 0xda60b8d0, 0x44042d73, 0x33031de5, 0xaa0a4c5f, 0xdd0d7cc9, 0x5005713c, 0x270241aa, 0xbe0b1010, 0xc90c2086, 0x5768b525, 0x206f85b3, 0xb966d409, 0xce61e49f, 0x5edef90e, 0x29d9c998, 0xb0d09822, 0xc7d7a8b4, 0x59b33d17, 0x2eb40d81, 0xb7bd5c3b, 0xc0ba6cad, 0xedb88320, 0x9abfb3b6, 0x03b6e20c, 0x74b1d29a, 0xead54739, 0x9dd277af, 0x04db2615, 0x73dc1683, 0xe3630b12, 0x94643b84, 0x0d6d6a3e, 0x7a6a5aa8, 0xe40ecf0b, 0x9309ff9d, 0x0a00ae27, 0x7d079eb1, 0xf00f9344, 0x8708a3d2, 0x1e01f268, 0x6906c2fe, 0xf762575d, 0x806567cb, 0x196c3671, 0x6e6b06e7, 0xfed41b76, 0x89d32be0, 0x10da7a5a, 0x67dd4acc, 0xf9b9df6f, 0x8ebeeff9, 0x17b7be43, 0x60b08ed5, 0xd6d6a3e8, 0xa1d1937e, 
0x38d8c2c4, 0x4fdff252, 0xd1bb67f1, 0xa6bc5767, 0x3fb506dd, 0x48b2364b, 0xd80d2bda, 0xaf0a1b4c, 0x36034af6, 0x41047a60, 0xdf60efc3, 0xa867df55, 0x316e8eef, 0x4669be79, 0xcb61b38c, 0xbc66831a, 0x256fd2a0, 0x5268e236, 0xcc0c7795, 0xbb0b4703, 0x220216b9, 0x5505262f, 0xc5ba3bbe, 0xb2bd0b28, 0x2bb45a92, 0x5cb36a04, 0xc2d7ffa7, 0xb5d0cf31, 0x2cd99e8b, 0x5bdeae1d, 0x9b64c2b0, 0xec63f226, 0x756aa39c, 0x026d930a, 0x9c0906a9, 0xeb0e363f, 0x72076785, 0x05005713, 0x95bf4a82, 0xe2b87a14, 0x7bb12bae, 0x0cb61b38, 0x92d28e9b, 0xe5d5be0d, 0x7cdcefb7, 0x0bdbdf21, 0x86d3d2d4, 0xf1d4e242, 0x68ddb3f8, 0x1fda836e, 0x81be16cd, 0xf6b9265b, 0x6fb077e1, 0x18b74777, 0x88085ae6, 0xff0f6a70, 0x66063bca, 0x11010b5c, 0x8f659eff, 0xf862ae69, 0x616bffd3, 0x166ccf45, 0xa00ae278, 0xd70dd2ee, 0x4e048354, 0x3903b3c2, 0xa7672661, 0xd06016f7, 0x4969474d, 0x3e6e77db, 0xaed16a4a, 0xd9d65adc, 0x40df0b66, 0x37d83bf0, 0xa9bcae53, 0xdebb9ec5, 0x47b2cf7f, 0x30b5ffe9, 0xbdbdf21c, 0xcabac28a, 0x53b39330, 0x24b4a3a6, 0xbad03605, 0xcdd70693, 0x54de5729, 0x23d967bf, 0xb3667a2e, 0xc4614ab8, 0x5d681b02, 0x2a6f2b94, 0xb40bbe37, 0xc30c8ea1, 0x5a05df1b, 0x2d02ef8d ] def crc32(string): crc = 0xFFFFFFFF for ch in string: crc = crc_32_tab[((crc) ^ ord(ch)) & 0xff] ^ (((crc) >> 8) & 0xFFFFFF) return ~crc From tim_one@email.msn.com Tue Dec 14 17:06:36 1999 From: tim_one@email.msn.com (Tim Peters) Date: Tue, 14 Dec 1999 12:06:36 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <199912141453.JAA23429@eric.cnri.reston.va.us> Message-ID: <000101bf4655$94e40840$3a2d153f@tim> [Guido] > I propose to use Gary Brown's code. I'll defend this to CNRI's > lawyers if need be. If there's a hassle, I can do a clean-room implementation easily enough -- although I'd rather not. > Jim, have you checked that this is the right CRC to use for zip's CRC? If WinZip unzips Jim's files without griping, the odds that he's got the wrong CRC are about 1 in 2**36 . 
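Note 1 in Gary Brown's comments says the feedback table can be generated at runtime instead of being typed in. The following is a minimal sketch of that idea, assuming Python 3 and bytes input (the posted code targets Python 1.5 and character strings). The names make_crc_table and crc32_table_driven are illustrative only and do not come from the posted code; the comparison with zlib.crc32() assumes zlib is built in.

```python
import zlib

POLY = 0xEDB88320  # bit-reversed CRC-32 polynomial, as named in the table comment


def make_crc_table(poly=POLY):
    """Build the 256-entry feedback table: eight shift/xor steps per byte value."""
    table = []
    for value in range(256):
        reg = value
        for _ in range(8):
            # Shift right; fold in the polynomial whenever a 1 bit falls off.
            reg = (reg >> 1) ^ poly if reg & 1 else reg >> 1
        table.append(reg)
    return table


CRC_TABLE = make_crc_table()


def crc32_table_driven(data, table=CRC_TABLE):
    """Same loop as the posted crc32(), but over bytes and with an unsigned result."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = table[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


if __name__ == "__main__":
    sample = b"The quick brown fox"
    # The two values printed should be identical if the sketch matches zlib.
    print(hex(crc32_table_driven(sample)), hex(zlib.crc32(sample)))
```

The "& 0xFFFFFF" mask in the posted version appears to be there to undo sign extension from shifting a signed 32-bit int; Python 3 ints never go negative in this loop, so the sketch leaves it out.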
> (This in the light of Tim's assertion that there are many CRCs
> around.)

There are, and several others are hiding in assorted communications stds
(e.g., Ethernet uses a different 32-bit CRC); but the zip CRC is the one
you'll find most commonly described on the Web. All the same, once Jim
releases his code, I'll do an anal verification that it's the right one.

From jim@interet.com Tue Dec 14 17:54:35 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Tue, 14 Dec 1999 12:54:35 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000101bf4655$94e40840$3a2d153f@tim>
Message-ID: <3856845B.6C3C7330@interet.com>

Tim Peters wrote:

> If WinZip unzips Jim's files without griping, the odds that he's got the
> wrong CRC are about 1 in 2**36 .

You mean 2**32, right? Oh, sorry, you must be using a DEC-10 .

JimA

From gstein@lyra.org Tue Dec 14 19:23:36 1999
From: gstein@lyra.org (Greg Stein)
Date: Tue, 14 Dec 1999 11:23:36 -0800 (PST)
Subject: [Python-Dev] zipfile.py
In-Reply-To: <3856425F.8C5E7A42@interet.com>
Message-ID:

On Tue, 14 Dec 1999, James C. Ahlstrom wrote:
> Greg Stein wrote:
> >
> > > Can you post zipfile.py so that people can start reviewing that?
>
> Yes, it will be available by next Monday. I just want to
> get it really working and pretty, and with documentation.

My point was that people could possibly use it *before* then. Not
everybody needs it to be pretty, needs doc, or needs it fully working.
Maybe people would like to provide feedback on the API. Maybe they'd like
to start their own modules that use your library.

This goes back to my years-old statement: release it now rather than
later -- people can always use it now, and there might not be a later.

Release early. Release often. :-)

People are too hesitant to release code. Why? Just send it out there.
When you update it, send out another. It doesn't hurt anybody to have
more than one release.
Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From tim_one@email.msn.com Wed Dec 15 04:20:25 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 14 Dec 1999 23:20:25 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <3856845B.6C3C7330@interet.com>
Message-ID: <000501bf46b3$b6184f40$05a0143f@tim>

[Tim]
> If WinZip unzips Jim's files without griping, the odds that he's
> got the wrong CRC are about 1 in 2**36 .

[JimA]
> You mean 2**32, right?

Nope! For each of the 2**32 polynomials you may have pulled out of thin
air, there are about a dozen common variations in the details of CRC
algorithms. For example, a CRC used for hashing usually initializes "the
register" to 0, but a CRC used to protect against transmission errors
usually initializes to a block of 1 bits (since leading zeroes don't
affect the result, and a common transmission error is dropping a prefix
of the msg). Similarly, algorithms vary in the order they scan the data;
in whether they use the raw data or its complement; and in whether they
return the actual remainder, the complement of the remainder, or a
checksum cleverly computed so that "the other end" always sees a fixed
remainder other than 0 (or ~0).

> Oh, sorry, you must be using a DEC-10 .

I used a Univac 1108 in college, back when ASCII was in its infancy. They
couldn't decide on the natural size for a character, so the 36-bit 1108
could be configured to treat each word as either 6 6-bit bytes or 4 9-bit
ones. If they had been thinking ahead, they would have defined it as two
Unicode characters plus a 4-bit tag field for the Python implementation
to play with .
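Some of the variations listed above (the starting value of the register, and whether the remainder is complemented) can be tried out with a small bit-at-a-time CRC. This is only a sketch, assuming Python 3 and the bit-reversed polynomial 0xEDB88320 from the table comment; crc_variant, hashing_style and zip_style are hypothetical names chosen for illustration.

```python
import struct

POLY = 0xEDB88320  # bit-reversed zip polynomial (assumed from the table comment)


def crc_variant(data, init, xor_out, poly=POLY):
    """Bit-at-a-time CRC with a configurable starting register and final xor."""
    reg = init
    for byte in data:
        reg ^= byte
        for _ in range(8):
            reg = (reg >> 1) ^ poly if reg & 1 else reg >> 1
    return reg ^ xor_out


def hashing_style(data):
    # Register starts at 0 and the remainder is returned as-is.
    return crc_variant(data, init=0, xor_out=0)


def zip_style(data):
    # Register starts as all 1 bits and the complement of the remainder is returned.
    return crc_variant(data, init=0xFFFFFFFF, xor_out=0xFFFFFFFF)


if __name__ == "__main__":
    msg = b"abc"
    padded = b"\x00\x00" + msg
    # With a zero start, leading zero bytes go unnoticed.
    print(hashing_style(msg) == hashing_style(padded))
    # With an all-ones start, the same padding changes the result.
    print(zip_style(msg) == zip_style(padded))
    # Appending the CRC (low byte first) leaves the same remainder for any message.
    for m in (b"abc", b"a longer message"):
        print(hex(zip_style(m + struct.pack("<I", zip_style(m)))))
```

The last loop illustrates the "fixed remainder" remark: the receiver can run the CRC over message plus checksum and compare against one constant.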
now-they-make-their-living-suing-.gif-bandits-ly y'rs - tim

From tim_one@email.msn.com Wed Dec 15 07:40:11 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Wed, 15 Dec 1999 02:40:11 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <385660D0.C6C0C7B9@interet.com>
Message-ID: <000b01bf46cf$9ebe27e0$05a0143f@tim>

[JimA posts his Python rendering of Gary Brown's code]

Yup! That's the zip algorithm, right down to the absurdly bit-reversed
polynomial.

> def crc32(string):
>   crc = 0xFFFFFFFF
>   for ch in string:
>     crc = crc_32_tab[((crc) ^ ord(ch)) & 0xff] ^ (((crc) >> 8) & 0xFFFFFF)
>   return ~crc

Note that the last line is better (whether in Python or C!) as

    return crc ^ 0xffffffff

Else you'll get a surprising result in a 64-bit Python, and in some
64-bit C implementations.

it's-a-32-bit-algorithm-not-an-"int"-or-"long"-one-ly y'rs - tim

From fredrik@pythonware.com Wed Dec 15 09:31:29 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 15 Dec 1999 10:31:29 +0100
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000101bf4655$94e40840$3a2d153f@tim>
Message-ID: <002601bf46e0$06e25ca0$f29b12c2@secret.pythonware.com>

> [Guido]
> > I propose to use Gary Brown's code. I'll defend this to CNRI's
> > lawyers if need be.
>
> If there's a hassle, I can do a clean-room implementation easily enough --
> although I'd rather not.

or you can grab the code from PIL, which already comes with
a Python compatible license...

(it's based on ISO 3307, but judging from the table James
posted, it's the same thing...)

From fredrik@pythonware.com Wed Dec 15 09:39:19 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 15 Dec 1999 10:39:19 +0100
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000b01bf46cf$9ebe27e0$05a0143f@tim>
Message-ID: <003001bf46e0$43860b20$f29b12c2@secret.pythonware.com>

Tim Peters wrote:
> Yup! That's the zip algorithm, right down to the absurdly bit-reversed
> polynomial.

also known as ISO 3307, according to some strange comments
in PIL's sources...

From jim@interet.com Wed Dec 15 15:53:34 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Wed, 15 Dec 1999 10:53:34 -0500
Subject: [Python-Dev] zipfile.py
References:
Message-ID: <3857B97E.3684224F@interet.com>

Greg Stein wrote:
> Release early. Release often. :-)

You are right of course. OK, the zipfile.py code and docs are at:

    ftp://ftp.interet.com/pub/pylib.html

Despite the ftp URL, clicking on it should display the html.

Please don't panic if it seems to be slow. It uses a Python CRC-32,
which is slow. You may want to hack it to use zlib.crc32() if you
have it.

I am testing with WinZip. If you have another zip tool, it would be
interesting to see how compatible it is.

JimA

From guido@CNRI.Reston.VA.US Wed Dec 15 16:38:47 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 15 Dec 1999 11:38:47 -0500
Subject: [Python-Dev] Writers wanted for Linux Journal Python special issue
Message-ID: <199912151638.LAA02522@eric.cnri.reston.va.us>

Linux Journal is preparing a special issue devoted to Python (actually
more like a pullout section or whatever I think). They are looking
for writers, e.g. to write a piece about Python's history and/or an
introduction. And probably anything else Python related.

If you're interested, please write to Marjorie Richardson , who is
coordinating. Also direct any questions to her.

This is for the June issue which will be on newsstands mid-May and
mailed to subscribers even earlier, I believe. The deadline is
February 1st (magazine production takes forever!).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From akuchlin@mems-exchange.org Wed Dec 15 18:17:53 1999
From: akuchlin@mems-exchange.org (Andrew M. Kuchling)
Date: Wed, 15 Dec 1999 13:17:53 -0500 (EST)
Subject: [Python-Dev] fwd.
from Paul Prescod Message-ID: <14423.56145.877163.395736@amarok.cnri.reston.va.us> This is a forwarded e-mail from the XML-SIG mailing list, in which Paul makes some good points. Some context: I've been arguing against adding more XML stuff to the base Python distribution, because 1) it's bloat for those people don't care about XML, and 2) the Distutils is supposed to fix this by making installing things easier. Paul's response, below, has shaken my conviction a bit (*only* a bit, though). If it's deemed valuable, perhaps the XML-SIG could concentrate on the minimal set of parser + SAX + DOM that could be included in 1.6. Please join the XML-SIG to follow the specifics of this thread further, as it relates only to XML. As a more general philosophical question for python-dev: do we want to add things to 1.6 following the "batteries included" philosophy? Or should we wave in the direction of the distutils and say they'll fix the problem? (In which case they should be given high priority, as in "1.6 doesn't ship until they're done".) -- A.M. Kuchling http://starship.python.net/crew/amk/ And after all, why should I go to bed every night? Sleep is only a habit. -- Cornelius Van Horne Paul Prescod writes: >"Andrew M. Kuchling" wrote: >> >> Huh? There's obviously a good deal of stuff in there, some of it >> perhaps too esoteric, but I don't see where there's overlap. > >Well, there are several parsers and parser wrappers. How is a user >supposed to choose? And there is PyDOM, Minidom and qp_dom. > >> Or are >> you talking about Python tools in general, where there are 3 DOM >> implementations? (PyDOM, 4DOM, and ZDOM hiding inside Zope.) > >That too. > >> I lean against shoveling more stuff into 1.6; better to get the >> Distutils widely used, which makes it easier to install *all* Python >> extensions. > >I don't think that XML is any more of an "add-on" to a modern scripting >language than URL support or regular expression support. 
I'm in the >"batteries included" camp for this and several other reasons: > > * standard Python libraries may soon need XML support. If WebDAV takes >off then there should be a libWebDAV right alongside libftp and libhttp. >And libWebDAV will require XML > > * there is a difference between theory and practice. In theory, >distutils will be done soon and everything will be easy. In practice, it >is the end of 1999 and at every conference I have to install the XML sig >package on the machines of several people who haven't been able to get >it going themselves. In practice, we can't wait for distutils because >people are choosing their XML tools now. > >> >Ideally we would have one (or at most two!) implementation of each of >> >the major specs: >> >XML >SAX >Unicode >XPath >XPointer >XSLT >DOM >> >> Do you mean "one implementation of each in a single package", or "one >> implementation existing for Python, distributed separately"? > >With the possible exception of XSLT, one implementation of each *in >Python 1.6*. > >> We need to come up with a position paper for developer's day, stating >> what needs to be discussed. Suggestions? I'd propose focusing on >> getting the XML-SIG package to 1.0, but that's just an idea. > >I don't see how the XML-SIG package can ever get to 1.0. Anybody can >contribute code at anytime and thus far we've been totally flexible >about putting it in. I think that's great. It just won't ever lead to a >stable, carefully maintained, tightly interoperable package. Some of the >maintainers of the individual pieces have probably lost interest and >there is probably nobody that understands it all enough to integrate it >nicely. > >-- > Paul Prescod - ISOGEN Consulting Engineer speaking for himself > From fdrake@acm.org Wed Dec 15 19:47:01 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Wed, 15 Dec 1999 14:47:01 -0500 (EST)
Subject: [Python-Dev] posix module
Message-ID: <14423.61493.90107.433664@weyr.cnri.reston.va.us>

Ok, I think I'm done with the posix module updates, modulo bugs and
additional symbols for the *conf*() tables. That leaves us with the
following status for interfaces that Andrew brought up in the message
that started this spate of additions:

Worth adding?
=============
opendir(), readdir(), closedir()        -- not added
        The only thing these give us that os.listdir() doesn't is the
        inode numbers. Unless someone actually wants those, it's not
        worth having.

Worth adding:
=============
abort()                                 -- added
ctermid(), ctermid_r()                  -- added
fpathconf(fd, name)                     -- added
getlogin()                              -- added
getgroups(gidsetsize, grouplist)        -- added
pathconf(path, name)                    -- added
sysconf(int name)                       -- added; also added
                                           confstr(int name)

Not worth adding:
=================
clearerr()                              -- not added
cuserid()                               -- not added
difftime                                -- not added
tmpfile(), tmpnam()                     -- added, also tempnam()
mblen(), mbstowcs(), mbtowc(),
wcstombs(), wctomb()                    -- not added

  -Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From jeremy@cnri.reston.va.us Wed Dec 15 19:58:16 1999
From: jeremy@cnri.reston.va.us (Jeremy Hylton)
Date: Wed, 15 Dec 1999 14:58:16 -0500 (EST)
Subject: [Python-Dev] zipfile.py
In-Reply-To:
References: <3856425F.8C5E7A42@interet.com>
Message-ID: <14423.62168.576273.719577@goon.cnri.reston.va.us>

>>>>> "GS" == Greg Stein writes:

  GS> On Tue, 14 Dec 1999, James C. Ahlstrom wrote:
  >> Greg Stein wrote: >
  >>
  >> > Can you post zipfile.py so that people can start reviewing
  >> that?
  >>
  >> Yes, it will be available by next Monday. I just want to get it
  >> really working and pretty, and with documentation.

  GS> My point was that people could possibly use it *before*
  GS> then. Not everybody needs it to be pretty, needs doc, or needs
  GS> it fully working. Maybe people would like to provide feedback
  GS> on the API.
Maybe they'd like to start their own modules that GS> use your library. GS> This goes back to my years-old statement: release it now rather GS> than later -- people can always use it now, and there might not GS> be a later. Ok. I think we need some kind of zip file support in the core so that it can be used as a standard distribution format. I'd be happy if Jim's zipfile module ended up being it. We've got some zip code that we developed at CNRI; it's a bit of a mess, but it might be helpful to see what we did. Our code is at ftp://www.python.org/pub/tmp/zip.zip Jeremy From jim@interet.com Thu Dec 16 15:41:56 1999 From: jim@interet.com (James C. Ahlstrom) Date: Thu, 16 Dec 1999 10:41:56 -0500 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> Message-ID: <38590844.769C3025@interet.com> Did anyone look at this yet? ftp://ftp.interet.com/pub/pylib.html ftp://ftp.interet.com/pub/zipfile.py JimA From skip@mojam.com (Skip Montanaro) Thu Dec 16 15:46:28 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 16 Dec 1999 09:46:28 -0600 (CST) Subject: [Python-Dev] zipfile.py In-Reply-To: <38590844.769C3025@interet.com> References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> Message-ID: <14425.2388.529932.61119@dolphin.mojam.com> JA> Did anyone look at this yet? JA> ftp://ftp.interet.com/pub/pylib.html JA> ftp://ftp.interet.com/pub/zipfile.py I thought it wasn't supposed to be out until Monday? You're looking for, perhaps, a time machine? ;-) (More seriously, it won't have any effect on my "gotta have this done yesterday" list, so I will let others comment...) Skip From jim@interet.com Thu Dec 16 17:16:21 1999 From: jim@interet.com (James C. Ahlstrom) Date: Thu, 16 Dec 1999 12:16:21 -0500 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> Message-ID: <38591E65.4885A39D@interet.com> "James C. 
Ahlstrom" wrote: > ftp://ftp.interet.com/pub/pylib.html I just changed zipfile.py so that regular zip compression works. And if zlib is available, its crc32() is used instead of the Python version. I should mention that the current code rejects zip files which have an archive comment added to the end. Accepting them would require a search, and I am not sure it is worth it. JimA From fdrake@acm.org Thu Dec 16 17:19:23 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 16 Dec 1999 12:19:23 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121 In-Reply-To: References: <199912151831.NAA02685@weyr.cnri.reston.va.us> Message-ID: <14425.7963.347400.763562@weyr.cnri.reston.va.us> [Note that Greg's message went to python-checkins since he responded to a checkin message, but I suspect he meant to change the header to point to python-dev. ;) If not, too bad!] Greg Stein writes: > But this means that your tables no long reside in "const" space. Yet More > Per-Process Memory... > > It would be nice to have those tables marked as "const". Perhaps; as Guido points out, there haven't been a lot of complaints about this issue. I will note that only the tables aren't constant; the strings that are pointed to are still constant. I'm inclined to let the compiler/ linker care about this, and not change the code without a really clear need to do so. Here are the sizes of those tables and the strings they point to (including terminating null bytes for the strings): pathconf_names: 14 entries, 112 bytes, 176 string bytes confstr_names: 25 entries, 200 bytes, 576 string bytes sysconf_names: 108 entries, 864 bytes, 1774 string bytes Figures are for Solaris7. -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives From gstein@lyra.org Thu Dec 16 18:10:14 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 10:10:14 -0800 (PST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121 In-Reply-To: <14425.7963.347400.763562@weyr.cnri.reston.va.us> Message-ID: On Thu, 16 Dec 1999, Fred L. Drake, Jr. wrote: > [Note that Greg's message went to python-checkins since he responded > to a checkin message, but I suspect he meant to change the header to > point to python-dev. ;) If not, too bad!] I didn't really care too much where it went. I would actually suggest that the Reply-To: on the checkin list is set to python-dev if that is where replies are Supposed To Go. [ I do this with mod_dav checkins; replies to dav-checkins mail goes to dav-dev. ] > Greg Stein writes: > > But this means that your tables no long reside in "const" space. Yet More > > Per-Process Memory... > > > > It would be nice to have those tables marked as "const". > > Perhaps; as Guido points out, there haven't been a lot of complaints > about this issue. > I will note that only the tables aren't constant; the strings that > are pointed to are still constant. I'm inclined to let the compiler/ > linker care about this, and not change the code without a really clear > need to do so. > Here are the sizes of those tables and the strings they point to > (including terminating null bytes for the strings): > > pathconf_names: 14 entries, 112 bytes, 176 string bytes > confstr_names: 25 entries, 200 bytes, 576 string bytes > sysconf_names: 108 entries, 864 bytes, 1774 string bytes > > Figures are for Solaris7. Ah. I just replied to that. Guess that one went to python-checkins :-) True, this is a small amount of memory. But they start to add up. non-const globals also pain me when I start to work on free-threading stuff (each must be examined to see if synchronization is needed), so reducing the number there is important. 
Regarding the memory itself: as I mentioned in the other note, I just want to ensure that Python's working set remains low (reasons given in that email). Cheers, -g -- Greg Stein, http://www.lyra.org/ From skip@mojam.com (Skip Montanaro) Thu Dec 16 18:09:11 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 16 Dec 1999 12:09:11 -0600 (CST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121 In-Reply-To: References: <199912161553.KAA08428@eric.cnri.reston.va.us> Message-ID: <14425.10951.169751.843764@dolphin.mojam.com> >>>>> "Greg" == Greg Stein writes: Greg> On Thu, 16 Dec 1999, Guido van Rossum wrote: >> I don't think there's much of a need to worry about this. Why are >> you always bringing up this subject? No-one else that I know has >> ever had this concern... Greg> Somebody has to :-) Greg> Keeping the working set low is more efficient from a system Greg> standpoint. Not to mention the not-all-that-occasional-anymore requests to have Python on various itty-bitty things like Palm Pilots and WinCE devices. It's one thing to add size to modules people can live without for many applications, but I think the posix module and its other platform-specific relations are fairly heavily used. (I realize this specific example isn't likely to apply to PP/WinCE.) Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From gstein@lyra.org Thu Dec 16 18:21:54 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 10:21:54 -0800 (PST) Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released In-Reply-To: <199912161527.KAA08308@eric.cnri.reston.va.us> Message-ID: On Thu, 16 Dec 1999, Guido van Rossum wrote: >... > I realize it's just a rant. In this case (distutils) your advice is > correct. (I usually paraphrase it as "release early, release often".) True. 
I prefer that phrase, too, but I used it on JimA earlier in the day or the previous day. I didn't want to sound like a broken record :-). But that is why I moved into mode... it seems like the mindset was spreading :-) I've railed at AMK for it, too :-), when he was talking about 0.5.1pre1 or whatever, rather than just releasing 0.5.1 and doing an 0.5.2 if there was a problem. > However there are other situations, like core Python itself, where > it's really useful to have stable releases -- if only for those users > who won't touch anything with "beta" in its name. I still hear from > people who haven't upgraded to 1.5.2. But this doesn't explain why there isn't a 1.5.3b1, 1.5.3b2, etc. Or 1.6.0a1 or whatever (maybe "d" or "r" for dev release, as opposed to alpha). There are some people would like the releases rather than using CVS. Some people can't even use CVS because of firewall issues. Of course, an alternative is snapshot-tarballs of the CVS repository. But a snapshot could *really* be broken; something like 1.6.0d1 says "well, it's a development release, but I've hit a good point between some changes." > I wonder if perhaps for those cases (where there's a demand for stable > releases) some other strategy could be used? Such as labeling > releases "stable" after the fact? Or what Linus seems to do with the > Linux kernel (even = stable, odd = development; or was it the other > way around?). Yes: even are stable (e.g. 1.0, 1.2, 2.0, 2.2). The odd numbers are for development. Linus is currently working 2.3.x, but declared in the past couple days that things will be wrapping up to move towards 2.4. Once he thinks it is ready, he'll start off with 2.4.0pre1, pre2, pre3... At some point the "pre" suffix will drop and 2.4.0 will be released. You might have a bit of problem using that mechanism since the current stable release is 1.5 :-). Once 1.6 hits the street, then you could start doing 1.9 releases (dev) and shift to 2.0 once it is "stable". 
Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Thu Dec 16 18:02:55 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 10:02:55 -0800 Subject: [Python-Dev] Re: [XML-SIG] Developer's Day References: <199912132354.SAA10101@amarok.cnri.reston.va.us> <3856A77C.3A4D9F00@prescod.net> <14423.49044.143333.790752@amarok.cnri.reston.va.us> <3857CEB0.C29C5F24@prescod.net> <14423.57778.131798.776845@amarok.cnri.reston.va.us> Message-ID: <3859294F.138FF398@prescod.net> "Andrew M. Kuchling" wrote: > > * Python revisions come out slowly, once every year or two. XML > standards have been revolving faster , and we don't want to wait > until 1.7 for SAX2, or DOM Level2, or other new revisions. > Keeping the modules out of the core lets them be updated at their > own pace. A counterargument is that the XML specs are slowing > down -- add namespace support to SAX, and finalize DOM > Level 2, and I don't think any other standards are very important > to basic XML programming. I agree with your counterargument. :) Anyhow, isn't there a logical fallacy in your original argument? Why can't we offer a DOM 3 module or extension after Python ships with DOM 2? > * We really want a C-based parser to be commonly available. > sgmlop is the only reasonable choice for this, because I'd be > against including Expat. To replay some arguments I made against > including the zlib library in 1.6, what if a C extension requires > a newer version of the library? Symbol conflicts if you're lucky, > hard-to-debug problems if you're not. I don't understand this issue. Why would a C extension build on sgmlop which is designed to make XML information available to *Python* programmers? > * We can drop various marginal bits of the CVS tree; the xmlarch > support is probably not of very wide interest, for example. How about "expat", "mac", "pyexpat", "utils", "windows". There is just too much stuff there! 
And I daresay that alot of it has not been "quality controlled" to the level that we would expect if it were a part of the real Python library. In other words, there is no single place to go to get only XML-processing software that works well and works together. > I think I'm on the record as saying that Python's major problems now > aren't language-related, but are with the development environment. > Language changes (from minor, like 'for i in 1..9', to major, like > fixing the type/class dichotomy or adding static types) aren't going > to bring in piles of new users, useful though they might be to > experienced Pythoneers, large projects, or some other specific > application. (irrelevant aside: I agree 100% that making things easier to install will actually improve newbies experience more than (e.g.) static type checking but I do not agree that it is a better "sales tool". Most people are sold based on the language and its libraries before they start trying to install extensions.) > If installing things is a problem, then we need to > buckle down and finish the distutils. So, overall, I'd still vote > against inclusion in 1.6. So are you saying that Python 2 might have only five packages and everything else must be downloaded? No httplib, no pickle, no random or math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec? When people download Python and go to the library documentation that impressive array of BUILT-IN-FEATURES is part of what sells them on Python. Hell, I can download all of that stuff for Scheme but what makes Python beautiful is that I don't have to download it for Python. It's just there. But if an XML person comes to Python after hearing us rant about how great it is for processing XML and all they find is xmllib...they will be underwhelmed. > No, it's *got* to reach 1.0. 
The point of the package is that it's > exactly *one* thing to install that gives basic XML tools; you don't > need to chase down the SAX modules from Lars' page, PyExpat from > ftp.cwi.nl, sgmlop from pythonware.com, and so forth. If the > Distutils made it as easy as: > > python fetchpackage.py SAX PyExpat DOM sgmlop > > > > etc... > > then much of the need for a single package goes away, but, as you > point out, that isn't currently the case. I'm a little lost here. We need xmllib to continue because distutils doesn't do what we need yet but we don't need to put the stuff in the Python library because distutils will work well enough soon. But there is an important issue that distutils will not solve. One of the beautiful things about the Python library is that everything is at the same version level. When you install it you know that everything works together or else it WILL in the next patch level if you report the incompatibility. When the xml package gets versioned incompatibly with the Python library you don't have that safe feeling. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From akuchlin@mems-exchange.org Thu Dec 16 18:50:48 1999 From: akuchlin@mems-exchange.org (Andrew M. Kuchling) Date: Thu, 16 Dec 1999 13:50:48 -0500 (EST) Subject: [Python-Dev] Re: [XML-SIG] Developer's Day In-Reply-To: <3859294F.138FF398@prescod.net> References: <199912132354.SAA10101@amarok.cnri.reston.va.us> <3856A77C.3A4D9F00@prescod.net> <14423.49044.143333.790752@amarok.cnri.reston.va.us> <3857CEB0.C29C5F24@prescod.net> <14423.57778.131798.776845@amarok.cnri.reston.va.us> <3859294F.138FF398@prescod.net> Message-ID: <14425.13448.737831.460241@amarok.cnri.reston.va.us> (Responding to the python-dev related portion of this...) Paul Prescod writes: >I don't understand this issue.
Why would a C extension build on sgmlop >which is designed to make XML information available to *Python* >programmers? No, no; I'm arguing against shipping with Expat; sgmlop good! Consider this scenario: * Python includes Expat 1.0 * Some C library (for DAV or whatever) uses Expat 1.1 * Someone writes a Python interface to this C library and attempts to compile it statically. * Two versions of Expat in the same binary; symbol conflicts and core dumps, oh my! >So are you saying that Python 2 might have only five packages and >everything else must be downloaded? No httplib, no pickle, no random or >math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec? I'm not arguing for dropping existing packages; I'm against adding many more of them. Existing library modules can stay where they are. But I wouldn't mind a minimalist Python too much, if it came with a script fetch-basic-packages: python fetch-packages.py httplib python fetch-packages.py imaplib ... 200 more lines ... >I'm a little lost here. We need xmllib to continue because distutils >doesn't do what we need yet but we don't need to put the stuff in the >Python library because disutils will work well enough soon. Basically, yes. -- A.M. Kuchling http://starship.python.net/crew/amk/ And now let us hasten to the station. I have commanded the rain to fall at exactly one-fifteen and I would hate to get my shoes wet. -- Lord Lavender, in SEBASTIAN O #2 From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Thu Dec 16 18:50:49 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Thu, 16 Dec 1999 13:50:49 -0500 (EST) Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released References: <199912161527.KAA08308@eric.cnri.reston.va.us> Message-ID: <14425.13449.954026.960703@anthem.cnri.reston.va.us> >> I wonder if perhaps for those cases (where there's a demand for >> stable releases) some other strategy could be used? Such as >> labeling releases "stable" after the fact? 
Or what Linus seems >> to do with the Linux kernel (even = stable, odd = development; >> or was it the other way around?). I really dislike the odd/even distinction for exactly this reason. -Barry From guido@CNRI.Reston.VA.US Thu Dec 16 19:02:16 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 16 Dec 1999 14:02:16 -0500 Subject: [Python-Dev] Batteries Included? Message-ID: <199912161902.OAA11345@eric.cnri.reston.va.us> I like the batteries included approach, but I also feel resistance against including stuff I cannot maintain. The XML code base is a case in point; I don't understand enough about XML. (I just read that xmllib.py is "illegal". Jeez! What happened? Did Congress pass a law against it?) I think it may be time for separate Python distributions, like Linux -- I can concentrate on the core, and keep it really small; others can make all-encompassing distributions. There are currently some drawbacks to this approach: non-core modules have less status; and the documentation process is fundamentally different for core and non-core modules. There's also the version dependency stuff, but I think resolving that is the responsibility of the distribution makers. I think the status problem will be gone once there is a respected distribution -- then you derive status from being in that distribution, rather than from being in the core distribution. (Well, you would still derive status from being in the core, but it would be much harder to obtain, since I can set a much higher standard.) The documentation problem is the one that's left. I think the doc-sig may be on its way as we speak to solve this, though. Fred? This isn't rocket science. Red Hat Python? I'm all for it!
:-) --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@mojam.com (Skip Montanaro) Thu Dec 16 19:05:05 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 16 Dec 1999 13:05:05 -0600 (CST) Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released In-Reply-To: <14425.13449.954026.960703@anthem.cnri.reston.va.us> References: <199912161527.KAA08308@eric.cnri.reston.va.us> <14425.13449.954026.960703@anthem.cnri.reston.va.us> Message-ID: <14425.14305.907618.978628@dolphin.mojam.com> >>> Or what Linus seems to do with the Linux kernel (even = stable, odd >>> = development; or was it the other way around?). BAW> I really dislike the odd/even distinction for exactly this reason. Its one saving grace is that it is a uniform format. There are no "optional" tokens like "pre", "alpha", "beta", etc for the most part. To remember which way it is, I find it useful to execute "uname -r", check the second digit, then look down at my shirt for a pocket protector. The two pieces of information together work for me. I currently get "2.2.13-4mdk" from uname. I don't even have a pocket, let alone a pocket protector, so even numbers must be stable releases... ;-) Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From fdrake@acm.org Thu Dec 16 19:05:22 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 16 Dec 1999 14:05:22 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121 In-Reply-To: <14425.10951.169751.843764@dolphin.mojam.com> References: <199912161553.KAA08428@eric.cnri.reston.va.us> <14425.10951.169751.843764@dolphin.mojam.com> Message-ID: <14425.14322.355507.500813@weyr.cnri.reston.va.us> Skip Montanaro writes: > fairly heavily used. (I realize this specific example isn't likely to apply > to PP/WinCE.)
Or any version of Windows, I suspect; perhaps Mark Hammond can elaborate. Apparently none of the pathconf() constants are defined on that platform, at least not as #define constants. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From jcw@equi4.com Thu Dec 16 19:09:42 1999 From: jcw@equi4.com (Jean-Claude Wippler) Date: Thu, 16 Dec 1999 20:09:42 +0100 Subject: [Python-Dev] Re: [XML-SIG] Developer's Day References: <199912132354.SAA10101@amarok.cnri.reston.va.us> <3856A77C.3A4D9F00@prescod.net> <14423.49044.143333.790752@amarok.cnri.reston.va.us> <3857CEB0.C29C5F24@prescod.net> <14423.57778.131798.776845@amarok.cnri.reston.va.us> <3859294F.138FF398@prescod.net> Message-ID: <385938F6.C4164756@equi4.com> Paul Prescod wrote: [...] > (irrelevant aside: [...] Most people are sold based on the language > and its libraries before they start trying to install extensions.) > > [AMK] > > If installing things is a problem, then we need to > > buckle down and finish the distutils. So, overall, I'd still vote > > against inclusion in 1.6. > > So are you saying that Python 2 might have only five packages and > everything else must be downloaded? No httplib, no pickle, no random > or math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec? > > When people download Python and go to the library documentation that > impressive array of BUILT-IN-FEATURES is part of what sells them on > Python. Hell, I can download all of that stuff for Scheme but what > makes Python beautiful is that I don't have to download it for Python. > It's just there. But if an XML person comes to Python after hearing us > rant about how great it is for processing XML and all they find is > xmllib...they will be underwhelmed. (Nodding in agreement) Could this perhaps be solved with a large batteries-included standard distribution, plus a real easy/effective way to strip Python down and wrap things up for deployment?
In other words, aim for two very distinct goals: everything within easy reach for development + fully signed-sealed-delivered products. The first goal can evolve to do fancy net-borne distribution, even if it is a brittle process, because this is for Python developers. They want it all, so open the floodgate to give it all to them. The second becomes a matter of pruning down and wrapping up. All the way down to a single installation-less executable, if possible. I may well be wrong (and I'm not tracking distutils), but might it not be simpler to focus on 1) power users + 2) production-grade deployment, instead of trying to streamline a tangled-web-of-module-dependencies into a distribution system which tries to meet a wide range of needs? > [...] One of the beautiful things about the Python library is that > everything is at the same version level. When you install it you know > that everything works together or else it WILL in the next patch level > if you report the incompatibility. [...] More nods. So why not allow the Python distribution to become very large - with every release moving to a better-tuned combination of all the different parts (occasional mishaps can quickly be fixed)? Plus some tools to dist(ut)il(l) a turnkey solution from this big soup. Sort-of-from-violin-to-quartet-all-the-way-to-symphony-orchestra... -- Jean-Claude From gstein@lyra.org Thu Dec 16 20:02:46 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 12:02:46 -0800 (PST) Subject: [Python-Dev] zipfile.py In-Reply-To: <38590844.769C3025@interet.com> Message-ID: On Thu, 16 Dec 1999, James C. Ahlstrom wrote: > Did anyone look at this yet? > > ftp://ftp.interet.com/pub/pylib.html > > ftp://ftp.interet.com/pub/zipfile.py I went to look for it, but I think that was before you put zipfile up. Looking at it now... The writepy() as a method is questionable, I think. I think it should open the file at instantiation time. I don't see a reason to allow that to be deferred.
Especially given that some of the methods fail if open() hasn't been called. It would be good to have symbolic names for the 0 and 8 compression constants, and to fail if 8 is passed and zlib is not available (otherwise, it doesn't fail until read/write time, and with a NameError). There should probably be a __del__ that calls close(). Oh, and a "closed" attribute that can be checked and an error raised if an operation is done after the file has been closed. I think dir() should return the contents, rather than print them. read() and write() ought to fail if the mode is incorrect. Oh, some symbolic constants for things like "PK\005\006" would be nice. Do you have a ZipImporter written? Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 16 20:12:30 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 12:12:30 -0800 (PST) Subject: [Python-Dev] Re: [XML-SIG] Developer's Day In-Reply-To: <14425.13448.737831.460241@amarok.cnri.reston.va.us> Message-ID: On Thu, 16 Dec 1999, Andrew M. Kuchling wrote: > Paul Prescod writes: > >I don't understand this issue. Why would a C extension build on sgmlop > >which is designed to make XML information available to *Python* > >programmers? > > No, no; I'm arguing against shipping with Expat; sgmlop good! > Consider this scenario: > > * Python includes Expat 1.0 > * Some C library (for DAV or whatever) uses Expat 1.1 > * Someone writes a Python interface to this C library and > attempts to compile it statically. > * Two versions of Expat in the same binary; symbol conflicts > and core dumps, oh my! We should ship pyexpat, not Expat. (IMO) > >So are you saying that Python 2 might have only five packages and > >everything else must be downloaded? No httplib, no pickle, no random or > >math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec? > > I'm not arguing for dropping existing packages; I'm against adding > many more of them. Existing library modules can stay where they are. 
> But I wouldn't mind a minimalist Python too much, if it came with a > script fetch-basic-packages: > > python fetch-packages.py httplib > python fetch-packages.py imaplib > ... 200 more lines ... Considering that it would probably use HTTP to fetch the packages, I think you wouldn't be fetching httplib :-) But yes: I agree with the basic sentiment. Cheers, -g -- Greg Stein, http://www.lyra.org/ From petrilli@amber.org Thu Dec 16 20:55:16 1999 From: petrilli@amber.org (Christopher Petrilli) Date: Thu, 16 Dec 1999 15:55:16 -0500 Subject: [Python-Dev] Batteries Included? In-Reply-To: <199912161902.OAA11345@eric.cnri.reston.va.us>; from guido@CNRI.Reston.VA.US on Thu, Dec 16, 1999 at 02:02:16PM -0500 References: <199912161902.OAA11345@eric.cnri.reston.va.us> Message-ID: <19991216155516.A28037@trump.amber.org> Guido van Rossum [guido@CNRI.Reston.VA.US] wrote: > I think it may be time for separate Python distributions, like Linux > -- I can concentrate on the core, and keep it really small; others can > make all-encompassing distributions. My fear is what we face in the Zope world---different distributions break in totally diffrent ways, and sometimes we have to ask 30 questions to figure out what might be going wrong :/ The nice thing is hat if someone installes Python from the source, we know what's going to happen. I don't know if this is solvable, honestly. > This isn't rocket science. Red Hat Python? I'm all for it! :-) I think Guido just wants to IPO and retire :-) Chris -- | Christopher Petrilli | petrilli@amber.org From gward@cnri.reston.va.us Thu Dec 16 21:03:26 1999 From: gward@cnri.reston.va.us (Greg Ward) Date: Thu, 16 Dec 1999 16:03:26 -0500 Subject: [Python-Dev] distutils-sig/python-dev crosstalk Message-ID: <19991216160325.H4289@cnri.reston.va.us> Most recent threads on distutils-sig seem to have migrated to python-dev pretty quickly. 
This means that a) there are python-dev people on distutils-sig (duh), b) they think what goes on there is important enough to interest the other core developers (good!), and c) they assume there are people on python-dev who are not also on distutils-sig. Is this last assumption true? If you read python-dev, are interested in distutils issues, but do *not* read distutils-sig, please drop me a note. If no one says anything, I will (politely, tentatively) propose that we keep the distutils threads on distutils-sig and leave python-dev for, well, core Python development. If you think that the two are inextricably linked and I might as well just cross-post everything on distutils-sig to python-dev, let me know about that too. ;-) Greg -- Greg Ward - software developer gward@cnri.reston.va.us Corporation for National Research Initiatives 1895 Preston White Drive voice: +1-703-620-8990 Reston, Virginia, USA 20191-5434 fax: +1-703-620-0913 From gstein@lyra.org Thu Dec 16 21:18:50 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 13:18:50 -0800 (PST) Subject: [Python-Dev] distutils-sig/python-dev crosstalk In-Reply-To: <19991216160325.H4289@cnri.reston.va.us> Message-ID: On Thu, 16 Dec 1999, Greg Ward wrote: >... > If you think that the two are inextricably linked and I might as well > just cross-post everything on distutils-sig to python-dev, let me know > about that too. ;-) :-) I think distutils is about the mechanics. And it is a large and sophisticated problem (which is why it has a SIG :-). You could almost view it as a spinoff of the python-dev grand problem set. When we get into the question of "what does Python ship with?", then I think it belongs in python-dev, as that is a discussion of what constitutes Python itself.
Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 16 21:21:12 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 13:21:12 -0800 (PST) Subject: [Python-Dev] distutils-sig/python-dev crosstalk In-Reply-To: <19991216160325.H4289@cnri.reston.va.us> Message-ID: On Thu, 16 Dec 1999, Greg Ward wrote: > Most recent threads on distutils-sig seem to have migrated to python-dev > pretty quickly. This means that a) there are python-dev people on > distutils-sig (duh), b) they think what goes on there is important > enough to interest the other core developers (good!), and c) they assume > there are people on python-dev who are not also on distutils-sig. Oh. One more thing. Actually, what I am somewhat worried about is whether there was relevant discussion on python-dev that should have been visible to the distutils people. Not sure if there was, but that is always a potential problem. Same with the recent xml-sig / python-dev crosstalk. Specifically, Paul Prescod is not on python-dev, so he may have missed a response or two. Cheers, -g -- Greg Stein, http://www.lyra.org/ From mal@lemburg.com Thu Dec 16 21:23:30 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 16 Dec 1999 22:23:30 +0100 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> Message-ID: <38595852.E8054741@lemburg.com> "James C. Ahlstrom" wrote: > > "James C. Ahlstrom" wrote: > > > ftp://ftp.interet.com/pub/pylib.html > > I just changed zipfile.py so that regular zip compression > works. And if zlib is available, > its crc32() is used instead of the Python version. > > I should mention that the current code rejects zip files which have > an archive comment added to the end. Accepting them would require > a search, and I am not sure it is worth it. I don't think it is needed for our purposes, but maybe a subclass could provide it ? 
FYI, I've tested the module against mxStack-0.3.0.zip which you can find on my Python Pages. It was created using Info-ZIP's zip 2.2 on Linux. Unfortunately, I always get the following traceback when trying to print the directory: >>> z.open('../projects/distribution/mxStack-0.3.0.zip','rb') >>> z.dir() File Name Modified Size Stack/mxStack/mxStack.h 1999-04-16 10:50:06 4368 Stack/mxStack/mxstdlib.h 1999-04-13 15:37:52 5433 Traceback (innermost last): File "", line 1, in ? File "/home/lemburg/lib/zipfile.py", line 120, in dir bytes = self.read(name) # Just to check CRC-32 File "/home/lemburg/lib/zipfile.py", line 133, in read bytes = zlib.decompress(bytes, -15) zlib.error: Error -5 while decompressing data Some notes on the API: ---------------------- * I would find it more convenient if the filename and mode would be constructor parameters, e.g. zfile = zipfile('myfile.zip','rb') with compression defaulting to 8 rather than 0 (most zip files will be deflated since this is the ZIP default). * Also, I would like a method much like the os.listdir() which returns a list of filenames rather than print it to stdout. * .is_zipfile() should probably be a separate function: it doesn't use any of the class' features. More wishes to come ;-) So far: Great Work ! Aside: I found that you are using undocumented arguments to zlib.compressobj() ... are these extra arguments left out of the documentation on purpose or by simple oversight ? I couldn't find them in the HTML docs and neither in the docstrings. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 15 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From gstein@lyra.org Thu Dec 16 21:32:09 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 13:32:09 -0800 (PST) Subject: [Python-Dev] zipfile.py In-Reply-To: <38595852.E8054741@lemburg.com> Message-ID: On Thu, 16 Dec 1999, M.-A. Lemburg wrote: >... 
> Some notes on the API: > ---------------------- > * I would find it more convenient if the filename and mode > would be constructor parameters, e.g. > > zfile = zipfile('myfile.zip','rb') > > with compression defaulting to 8 rather than 0 (most zip files > will be deflated since this is the ZIP default). > > * Also, I would like a method much like the os.listdir() > which returns a list of filenames rather than print it > to stdout. The above two items were in my ramble, just not as clear as MAL :-) > * .is_zipfile() should probably be a separate function: it > doesn't use any of the class' features. Ah! Good call. It is even more important to shift it out if the constructor now opens a file. Cheers, -g -- Greg Stein, http://www.lyra.org/ From fdrake@acm.org Thu Dec 16 21:33:36 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 16 Dec 1999 16:33:36 -0500 (EST) Subject: [Python-Dev] zipfile.py In-Reply-To: <38595852.E8054741@lemburg.com> References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> Message-ID: <14425.23216.636687.704436@weyr.cnri.reston.va.us> M.-A. Lemburg writes: > Aside: I found that you are using undocumented arguments to > zlib.compressobj() ... are these extra arguments left out of > the documentation on purpose or by simple oversight ? I couldn't > find them in the HTML docs and neither in the docstrings. The documentation is way out of date and Jeremy Hylton and Andrew Kuchling haven't updated it. I'm not sure which of them changed the signatures for that module, but I've pestered Jeremy about it a few times. If anyone would like to update the documentation, I'd certainly appreciate it. I don't know the details of those interfaces, and this is somewhere where the details are pretty critical. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From bwarsaw@python.org Thu Dec 16 23:10:11 1999 From: bwarsaw@python.org (Barry A. 
Warsaw) Date: Thu, 16 Dec 1999 18:10:11 -0500 (EST) Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released References: <199912161527.KAA08308@eric.cnri.reston.va.us> <14425.13449.954026.960703@anthem.cnri.reston.va.us> <14425.14305.907618.978628@dolphin.mojam.com> Message-ID: <14425.29011.429867.485070@anthem.cnri.reston.va.us> >>>>> "SM" == Skip Montanaro writes: SM> To remember which way it is, I find it useful to execute SM> "uname -r", check the second digit, then look down at my shirt SM> for a pocket protector. The two pieces of information SM> together work for me. I currently get "2.2.13-4mdk" from SM> uname. I don't even have a pocket, let alone a pocket SM> protector, so even numbers must be stable releases... What do you do if it's the second Thursday after the full moon, and the local hockey team has just skated to a 3-3 tie? -Barry From mal@lemburg.com Thu Dec 16 21:53:36 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 16 Dec 1999 22:53:36 +0100 Subject: [Python-Dev] Batteries Included? References: <199912161902.OAA11345@eric.cnri.reston.va.us> Message-ID: <38595F60.7C1B34FF@lemburg.com> Guido van Rossum wrote: > > I like the batteries included approach, but I also feel resistence > against including stuff I cannot maintain. > ... > This isn't rocket science. Red Hat Python? I'm all for it! :-) I think we should wait for distutils to get up and running perfectly for everyone before taking such a step. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 15 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From gstein@lyra.org Fri Dec 17 08:31:38 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 00:31:38 -0800 (PST) Subject: [Python-Dev] Batteries Included? In-Reply-To: <38595F60.7C1B34FF@lemburg.com> Message-ID: On Thu, 16 Dec 1999, M.-A. 
Lemburg wrote: > Guido van Rossum wrote: > > I like the batteries included approach, but I also feel resistence > > against including stuff I cannot maintain. This is an interesting comment, and is similar to the Apache sentiment. Nothing gets added to the standard distribution unless somebody in the Group is willing to maintain it. It provides a good mechanism for keeping the module set to a reasonable size and a set that can/will actually be maintained. > > ... > > This isn't rocket science. Red Hat Python? I'm all for it! :-) > > I think we should wait for distutils to get up and running > perfectly for everyone before taking such a step. You can also operate on the assumption that it will be done by the time 1.6 is ready to be released. In other words: do the work (distutils and minimizing the release) in parallel, rather than in sequence. I would also think that a large distro isn't going to be assembled with distutils. Somebody will sit down, pull all the components together, and make a big release. However, I do see the distutils as being needed for the people who grab the minimal distro. They need it to grab add'l packages. Cheers, -g -- Greg Stein, http://www.lyra.org/ From fredrik@pythonware.com Fri Dec 17 09:06:20 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 17 Dec 1999 10:06:20 +0100 Subject: [Python-Dev] zipfile.py References: Message-ID: <022e01bf486d$fcc901d0$f29b12c2@secret.pythonware.com> James C. Ahlstrom wrote: > > Did anyone look at this yet? > > > > ftp://ftp.interet.com/pub/pylib.html > > > > ftp://ftp.interet.com/pub/zipfile.py > > I went to look for it, but I think that was before you put zipfile up. just a few comments (from reading the docs): -- it would be great if "open" could take an open file object as well as a file name. (in this case, you also need to document what you expect from the underlying file object: read, write, seek, tell should be enough, right? 
haven't looked at the code -- assuming it works, I'm only interested in the interface) -- or you could nuke "open" and pass those arguments to the constructor instead. -- I assume "open" adds "b" to the given mode argument. -- "dir" looks a bit strange. and hey, there's no "listdir" in there. I'd prefer a recursive "listdir" method, which takes an optional "depth" argument (e.g. 0=this dir, 1=this dir and first subdir, None=infinity, i.e. the full tree). that's all for now. From fredrik@pythonware.com Fri Dec 17 12:21:03 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 17 Dec 1999 13:21:03 +0100 Subject: [Python-Dev] posix module References: <14423.61493.90107.433664@weyr.cnri.reston.va.us> Message-ID: <036d01bf4889$30b63470$f29b12c2@secret.pythonware.com> > Ok, I think I'm done with the posix module updates, modulo bugs and > additional symbols for the *conf*() tables. gcc -g -O2 -I./../Include -I.. -DHAVE_CONFIG_H -c ./posixmodule.c ./posixmodule.c:3789: `_SC_AIO_LIST_MAX' undeclared here (not in a function) ./posixmodule.c:3789: initializer element for `posix_constants_sysconf[10].value' is not constant make[1]: *** [posixmodule.o] Error 1 make[1]: Leaving directory `/data/repository/BleedingEdge/python/dist/src/Modules' (current CVS stuff, on Red Hat 5.2) From jim@interet.com Fri Dec 17 14:33:31 1999 From: jim@interet.com (James C. Ahlstrom) Date: Fri, 17 Dec 1999 09:33:31 -0500 Subject: [Python-Dev] zipfile.py References: Message-ID: <385A49BB.4D064240@interet.com> Greg Stein wrote: > > On Thu, 16 Dec 1999, James C. Ahlstrom wrote: > > Did anyone look at this yet? > > > > ftp://ftp.interet.com/pub/pylib.html > > > > ftp://ftp.interet.com/pub/zipfile.py > > Looking at it now... The writepy() as a method is questionable, I think. > I think it should open the file at instantiation time. I don't see a > reason to allow that to be deferred. Especially given that some of the > methods fail if open() hasn't been called. 
I eliminated open and added its args to the constructor. > It would be good to have > symbolic names for the 0 and 8 compression constants, and to fail if 8 is > passed and zlib is not available (otherwise, it doesn't fail until > read/write time, and with a NameError). There should probably be a > __del__ that calls close(). Oh, and a "closed" attribute that can be > checked and an error raised if an operation is done after the file has > been closed. All done. > I think dir() should return the contents, rather than print > them. I added listdir() and documented self.TOC. I kept printdir() as example code. > read() and write() ought to fail if the mode is incorrect. Oh, some > symbolic constants for things like "PK\005\006" would be nice. All done. JimA From guido@CNRI.Reston.VA.US Fri Dec 17 14:43:23 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 17 Dec 1999 09:43:23 -0500 Subject: [Python-Dev] Batteries Included? In-Reply-To: Your message of "Thu, 16 Dec 1999 22:53:36 +0100." <38595F60.7C1B34FF@lemburg.com> References: <199912161902.OAA11345@eric.cnri.reston.va.us> <38595F60.7C1B34FF@lemburg.com> Message-ID: <199912171443.JAA12414@eric.cnri.reston.va.us> > Guido van Rossum wrote: > > > > I like the batteries included approach, but I also feel resistence > > against including stuff I cannot maintain. > > ... > > This isn't rocket science. Red Hat Python? I'm all for it! :-) MAL: > I think we should wait for distutils to get up and running > perfectly for everyone before taking such a step. Fair enough -- but in the mean time, no more pushing for new modules in the core distribution (distutils excluded). --Guido van Rossum (home page: http://www.python.org/~guido/) From gward@cnri.reston.va.us Fri Dec 17 14:59:09 1999 From: gward@cnri.reston.va.us (Greg Ward) Date: Fri, 17 Dec 1999 09:59:09 -0500 Subject: [Python-Dev] Batteries Included? 
In-Reply-To: <199912171443.JAA12414@eric.cnri.reston.va.us>; from guido@cnri.reston.va.us on Fri, Dec 17, 1999 at 09:43:23AM -0500 References: <199912161902.OAA11345@eric.cnri.reston.va.us> <38595F60.7C1B34FF@lemburg.com> <199912171443.JAA12414@eric.cnri.reston.va.us> Message-ID: <19991217095908.B8799@cnri.reston.va.us> On 17 December 1999, Guido van Rossum said: > Fair enough -- but in the mean time, no more pushing for new modules > in the core distribution (distutils excluded). So anyone who wants a new module snuck into the core just has to convince me to add it to the distutils package, right? >snicker< Greg From jeremy@cnri.reston.va.us Fri Dec 17 18:30:37 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Fri, 17 Dec 1999 13:30:37 -0500 (EST) Subject: [Python-Dev] Batteries Included? In-Reply-To: <199912171443.JAA12414@eric.cnri.reston.va.us> References: <199912161902.OAA11345@eric.cnri.reston.va.us> <38595F60.7C1B34FF@lemburg.com> <199912171443.JAA12414@eric.cnri.reston.va.us> Message-ID: <14426.33101.757523.853781@goon.cnri.reston.va.us> >>>>> "GvR" == Guido van Rossum writes: >> Guido van Rossum wrote: I like the batteries included >> approach, but I also feel resistance against including stuff I >> cannot maintain. ... This isn't rocket science. Red Hat >> Python? I'm all for it! :-) >> MAL wrote: >> I think we should wait for distutils to get up and running >> perfectly for everyone before taking such a step. GvR> Fair enough -- but in the mean time, no more pushing for new GvR> modules in the core distribution (distutils excluded). Perhaps the right long-term solution (post-distutils) is to split Python into a core architected by Guido and a bazaar-style standard library maintained in a more apache-style. Jeremy From jim@interet.com Fri Dec 17 15:25:10 1999 From: jim@interet.com (James C.
Ahlstrom) Date: Fri, 17 Dec 1999 10:25:10 -0500 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> Message-ID: <385A55D6.A8A05EB9@interet.com> "M.-A. Lemburg" wrote: > Unfortunately, I always get the following traceback when trying > to print the directory: OK, I changed the decompress code (10:23 AM), please re-try. > with compression defaulting to 8 rather than 0 (most zip files > will be deflated since this is the ZIP default). The compress mode only applies to writing. On read, the method recorded in the file controls. JimA From jim@interet.com Fri Dec 17 14:49:20 1999 From: jim@interet.com (James C. Ahlstrom) Date: Fri, 17 Dec 1999 09:49:20 -0500 Subject: [Python-Dev] zipfile.py References: <022e01bf486d$fcc901d0$f29b12c2@secret.pythonware.com> Message-ID: <385A4D70.A162C584@interet.com> Fredrik Lundh wrote: > > James C. Ahlstrom wrote: > > > > > > ftp://ftp.interet.com/pub/pylib.html > -- it would be great if "open" could take an open file > object as well as a file name. I put these arguments into the constructor now. > (in this case, you also need to document what you > expect from the underlying file object: read, write, > seek, tell should be enough, right? haven't looked > at the code -- assuming it works, I'm only interested > in the interface) OK, docs updated. > -- I assume "open" adds "b" to the given mode argument. Correct. The mode can be either "w" or "wb" etc., and it works. > -- "dir" looks a bit strange. and hey, there's no "listdir" > in there. I'd prefer a recursive "listdir" method, which > takes an optional "depth" argument (e.g. 0=this dir, > 1=this dir and first subdir, None=infinity, i.e. the full > tree). I added a plain listdir() and changed dir() to printdir(). I also documented self.TOC which gets you the values too. JimA From jim@interet.com Fri Dec 17 14:39:51 1999 From: jim@interet.com (James C. 
Ahlstrom) Date: Fri, 17 Dec 1999 09:39:51 -0500 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> Message-ID: <385A4B37.333B9443@interet.com> "M.-A. Lemburg" wrote: > > "James C. Ahlstrom" wrote: > > > ftp://ftp.interet.com/pub/pylib.html > > > Unfortunately, I always get the following traceback when trying > to print the directory: Yes, compression isn't there yet. I am looking into it. > Some notes on the API: > ---------------------- > * I would find it more convenient if the filename and mode > would be constructor parameters, e.g. > > zfile = zipfile('myfile.zip','rb') OK, done. > with compression defaulting to 8 rather than 0 (most zip files > will be deflated since this is the ZIP default). Until compression works, and zlib ships with Python I would rather default to no compression (method 0). Otherwise this is not useful as a Python import archive. > * Also, I would like a method much like the os.listdir() > which returns a list of filenames rather than print it > to stdout. OK, done. > * .is_zipfile() should probably be a separate function: it > doesn't use any of the class' features. OK, done. > Aside: I found that you are using undocumented arguments to > zlib.compressobj() ... are these extra arguments left out of > the documentation on purpose or by simple oversight ? I couldn't > find them in the HTML docs and neither in the docstrings. I am following the CNRI code blindly here. I don't have docs either. JimA From jack@oratrix.nl Fri Dec 17 22:54:03 1999 From: jack@oratrix.nl (Jack Jansen) Date: Fri, 17 Dec 1999 23:54:03 +0100 Subject: [Python-Dev] Batteries Included? 
In-Reply-To: Message by Jeremy Hylton , Fri, 17 Dec 1999 13:30:37 -0500 (EST) , <14426.33101.757523.853781@goon.cnri.reston.va.us> Message-ID: <19991217225408.D9AF3EDD20@oratrix.oratrix.nl> Recently, Jeremy Hylton said: > Perhaps the right long-term solution (post-distutils) is to split > Python into a core architected by Guido and a bazaar-style standard > library maintained in a more apache-style. I can't help feeling uncomfortable with this. I've had quite some work to get an Apache with SSL up and running, even though someone gave me quite precise instructions. With Perl I fared even worse, despite their distutils-like package, when I wanted to try a PalmPilot package for Unix that needed Perl. I finally had to give up after quite some effort because the addon installers kept finding the older version of Perl that the system mgr had installed instead of my newer version. I think distutils will be wonderful for us, the Python community, but something more RedHattish is needed for the general world who just want Python plus a certain set of extensions because some application needs it, so they can just download a fresh copy of ParrotPython 3.4.4 and know the application will work, without interfering with another application that happens to use Inquisition 1a5 and lives elsewhere on the disk. And maybe the answer is a much simpler freezing process, like MacPython BuildApplication where any Python user can drop a script on it and end up with a fully self-contained app guaranteed (well.... No reports to the contrary have been heard so far, at least:-) to contain everything needed and not interfere with an existing MacPython installation (or be interfered with by it). Then a popular app will have prebuilt binaries available for all platforms quickly, made by the Python community, and the end user interested in the app but not in Python can simply download that.
-- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From mal@lemburg.com Sat Dec 18 13:17:52 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 18 Dec 1999 14:17:52 +0100 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A4B37.333B9443@interet.com> Message-ID: <385B8980.11CDE9AC@lemburg.com> "James C. Ahlstrom" wrote: > > "M.-A. Lemburg" wrote: > > > > "James C. Ahlstrom" wrote: > > > > ftp://ftp.interet.com/pub/pylib.html > > > > > > Unfortunately, I always get the following traceback when trying > > to print the directory: > > Yes, compression isn't there yet. I am looking into it. Great :-) > > Some notes on the API: > > ---------------------- > > * I would find it more convenient if the filename and mode > > would be constructor parameters, e.g. > > > > zfile = zipfile('myfile.zip','rb') > > OK, done. > > > with compression defaulting to 8 rather than 0 (most zip files > > will be deflated since this is the ZIP default). > > Until compression works, and zlib ships with Python I > would rather default to no compression (method 0). Otherwise > this is not useful as a Python import archive. Point taken. Perhaps it would be even better to not have a default at all: that way people will have to think about the issue *before* implementing it, rather than debug code that produces tracebacks. > > * Also, I would like a method much like the os.listdir() > > which returns a list of filenames rather than print it > > to stdout. > > OK, done. > > > * .is_zipfile() should probably be a separate function: it > > doesn't use any of the class' features. > > OK, done. 
Thanks, -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 13 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal@lemburg.com Sat Dec 18 15:16:44 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 18 Dec 1999 16:16:44 +0100 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> Message-ID: <385BA55C.9DFCA88D@lemburg.com> "James C. Ahlstrom" wrote: > > "M.-A. Lemburg" wrote: > > > Unfortunately, I always get the following traceback when trying > > to print the directory: > > OK, I changed the decompress code (10:23 AM), please re-try. Everything is fine now... it's really impressive how easily you can manipulate ZIP files with it. One thing I'd suggest is to include some way to delete and update contents, e.g. the write() method should overwrite any existing entry in the archive (if it does not already -- I haven't tested it, just read the code and it seems to raise an exception), plus maybe a .remove() method which deletes an entry. > > with compression defaulting to 8 rather than 0 (most zip files > > will be deflated since this is the ZIP default). > > The compress mode only applies to writing. On read, the > method recorded in the file controls. True. How about making the compression argument mandatory for files opened in 'wb' mode only ?
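For readers following the API points raised so far (symbolic names for methods 0 and 8, constructor arguments instead of open(), a separate is_zipfile() function, and the stored method being read back from the file), here is a minimal sketch. It uses the zipfile module as found in today's standard library, which may well differ from the draft under review in this thread; the helper names build_archive and list_archive are made up for illustration.

```python
import io
import zipfile


def build_archive(entries, compression=zipfile.ZIP_STORED):
    """Write a dict of {name: bytes} into an in-memory ZIP archive.

    `compression` only matters when writing; ZIP_STORED (0) needs no zlib.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression) as zf:
        for name, data in entries.items():
            zf.writestr(name, data)
    return buf.getvalue()


def list_archive(raw):
    """Return (filename, compression method) pairs recorded in the archive.

    No compression argument is given here: on read, the method stored
    with each entry is what gets used.
    """
    with zipfile.ZipFile(io.BytesIO(raw)) as zf:
        return [(info.filename, info.compress_type) for info in zf.infolist()]


raw = build_archive({"hello.txt": b"hello world"})
print(zipfile.is_zipfile(io.BytesIO(raw)))
print(list_archive(raw))
```

Passing zipfile.ZIP_DEFLATED instead requires zlib on the writing side, which is the portability concern debated in the surrounding messages.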
-- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 13 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From da@ski.org Sat Dec 18 17:35:00 1999 From: da@ski.org (David Ascher) Date: Sat, 18 Dec 1999 09:35:00 -0800 Subject: [Python-Dev] Year 2000 O'Reilly Python Conference Message-ID: <003501bf497e$368f6f60$e655cfc0@ski.org> I just got off the phone with someone at O'Reilly, who is starting to plan the next O'Reilly Open Source Convention. I've agreed to be the chair of the Python conference, just so that there are no delays in getting the conference organized. If someone feels that I should not be chair, speak now and we can figure out who takes the 'job'. There are short-term and long-term issues to discuss: Short term: - We need a program committee -- If you're interested in being on said committee or know someone who should be, let me know. I'd like to get representatives from various subconstituencies on there (web types, zope types, business types, scientist types, linux types, hackers, etc.) - The call for papers is going on the O'Reilly website soon. I will try and get them to pass things by me first, but if we want to emphasize specific kinds of paper submissions, we need to decide that soon. - Greg or Barry, is it possible for one of you to setup a mailman mailing list which will be used by the program committee? eGroups is easy for me to setup, but lots of people hated it last year. I don't want to pollute python-dev with conference discussions. Longer term: - The schedule for the conference is (supposedly) going to be the same as last year. conference-wide keynotes at the beginning of both days, and 4x90minute segments. - We have two parallel tracks - We have 4 half-day tutorial slots - All of the paper materials have to be 'in' by March 1. We need to decide how much time we need to go through the review/revision process ourselves. 
In other words, the deadline for submissions is up to us, but we don't have that much time. --david ascher From jeremy@cnri.reston.va.us Sat Dec 18 22:39:58 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Sat, 18 Dec 1999 17:39:58 -0500 (EST) Subject: [Python-Dev] zipfile.py In-Reply-To: <385A4B37.333B9443@interet.com> References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A4B37.333B9443@interet.com> Message-ID: <14428.3390.671438.663889@bitdiddle.cnri.reston.va.us> >>>>> "JCA" == James C Ahlstrom writes: >> Aside: I found that you are using undocumented arguments to >> zlib.compressobj() ... are these extra arguments left out of the >> documentation on purpose or by simple oversight ? I couldn't find >> them in the HTML docs and neither in the docstrings. JCA> I am following the CNRI code blindly here. I don't have docs JCA> either. The docs for the zlib module are quite out of date, although I think the docstrings may be better (not necessarily completely up-to-date though :-). The specific parameters to pass to zlib don't seem to be documented anywhere either; IIRC I dug them out of some example C code somewhere that used zlib to read Zip files. Jeremy From gstein@lyra.org Sat Dec 18 23:14:02 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 15:14:02 -0800 (PST) Subject: [Python-Dev] Year 2000 O'Reilly Python Conference In-Reply-To: <003501bf497e$368f6f60$e655cfc0@ski.org> Message-ID: On Sat, 18 Dec 1999, David Ascher wrote: >... > - Greg or Barry, is it possible for one of you to setup a mailman mailing > list which will be used by the program committee? eGroups is easy for me to > setup, but lots of people hated it last year. I don't want to pollute > python-dev with conference discussions. Done. ora-pc@pythonpros.com.
http://mailman.pythonpros.com/mailman/listinfo/ora-pc I also removed the old monterey-speakers mailing list :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From da@ski.org Sun Dec 19 07:24:51 1999 From: da@ski.org (David Ascher) Date: Sat, 18 Dec 1999 23:24:51 -0800 Subject: [Python-Dev] Year 2000 O'Reilly Python Conference References: Message-ID: <013301bf49f2$243946f0$df55cfc0@ski.org> From: Greg Stein > On Sat, 18 Dec 1999, David Ascher wrote: > >... > > - Greg or Barry, is it possible for one of you to setup a mailman mailing > > list which will be used by the program committee? > Done. ora-pc@pythonpros.com. > http://mailman.pythonpros.com/mailman/listinfo/ora-pc Thanks, Greg. Now, folks, please consider joining the program committee. We need a few volunteers - not too many, but somewhere between 5 and 10 would be good. You don't even have to commit to making it to the conference, if that's a concern. -- david From jim@interet.com Mon Dec 20 14:18:17 1999 From: jim@interet.com (James C. Ahlstrom) Date: Mon, 20 Dec 1999 09:18:17 -0500 Subject: [Python-Dev] zipfile.py References: Message-ID: <385E3AA9.162BE568@interet.com> Greg Stein wrote: > Do you have a ZipImporter written? Yes, it is ftp://ftp.interet.com/pub/importer.py JimA From jim@interet.com Mon Dec 20 14:35:58 1999 From: jim@interet.com (James C. Ahlstrom) Date: Mon, 20 Dec 1999 09:35:58 -0500 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> Message-ID: <385E3ECE.F8DCDE28@interet.com> "M.-A. Lemburg" wrote: > One thing I'd suugest is to include some way to delete and > update contents, e.g. 
the write() method should overwrite > any existing entry in the archive (if it not already does -- > I haven't tested it, just read the code and it seems to raise > an exception), plus maybe a .remove() method which deletes > an entry. Currently, adding a file requires the "a" append mode, while the "w" mode re-writes the file. Adding a duplicate file name produces an error message. I can change this, but removing a file would either waste space, or else the file contents must be copied over the old file and all the offsets updated. I don't like this because it is complicated, and I think it is fast enough to just re-write the archive. But it could be added if people want. > True. How about making the compression argument mandatory > for file opened in 'wb' mode only ? The default of zero provides a little guidance that you should use zero. I added a warning message if 8 is used which should discourage people from using 8. Or I could disallow 8. Is that OK? JimA From jim@interet.com Mon Dec 20 15:34:02 1999 From: jim@interet.com (James C. Ahlstrom) Date: Mon, 20 Dec 1999 10:34:02 -0500 Subject: [Python-Dev] Batteries Included? References: <19991217225408.D9AF3EDD20@oratrix.oratrix.nl> Message-ID: <385E4C6A.BEC0F728@interet.com> Jack Jansen wrote: > And maybe the answer is a much simpler freezing process, like > MacPython BuildApplication where any Python user can drop a script on > it and end up with a fully self-contained app guaranteed (well.... No > reports to the contrary have been heard so far, at least:-) to contain > everything needed and not interfere with an existing MacPython > installation (or be interfered with by it). Then a popular app will > have prebuilt binaries available for all platforms quickly, made by > the Python community, and the enduser interested in the app but not in > Python can simply download that. IMHO the "much simpler freezing process" is archive files. 
A simple script can build them, imputil can import them, and the only remaining problem is to find them. Please see: ftp://ftp.interet.com/pub/bootmodule.html ftp://ftp.interet.com/pub/pylib.html JimA From jack@oratrix.nl Mon Dec 20 16:50:32 1999 From: jack@oratrix.nl (Jack Jansen) Date: Mon, 20 Dec 1999 17:50:32 +0100 Subject: [Python-Dev] Batteries Included? In-Reply-To: Message by "James C. Ahlstrom" , Mon, 20 Dec 1999 10:34:02 -0500 , <385E4C6A.BEC0F728@interet.com> Message-ID: <19991220165032.B9E8B370CF2@snelboot.oratrix.nl> > IMHO the "much simpler freezing process" is archive files. A simple > script can build them, imputil can import them, and the only > remaining problem is to find them. Please see: Archive files solves the problem for Python modules. But that leaves the problem of dynamically loaded modules. And resources for dialogs and such, if you use native GUI stuff on Mac or Windows. And most serious applications that I've seen (GRiNS and Zope, to name two, Mailman is the only exception I can think of) depend on non-standard plugin modules. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From mal@lemburg.com Mon Dec 20 14:44:42 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 20 Dec 1999 15:44:42 +0100 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com> Message-ID: <385E40DA.37AD704F@lemburg.com> "James C. Ahlstrom" wrote: > > "M.-A. Lemburg" wrote: > > > One thing I'd suugest is to include some way to delete and > > update contents, e.g. 
the write() method should overwrite > > any existing entry in the archive (if it not already does -- > > I haven't tested it, just read the code and it seems to raise > > an exception), plus maybe a .remove() method which deletes > > an entry. > > Currently, adding a file requires the "a" append mode, while > the "w" mode re-writes the file. Adding a duplicate file name > produces an error message. I can change this, > but removing a file would either waste space, or else the file > contents must be copied over the old file and all the offsets > updated. I don't like this because it is complicated, and I think > it is fast enough to just re-write the archive. But it > could be added if people want. I guess it would be ok to waste space. You could provide a .cleanup() or .rewrite() method that takes care of reorganizing the file to fill up the gaps. > > True. How about making the compression argument mandatory > > for file opened in 'wb' mode only ? > > The default of zero provides a little guidance that you should > use zero. I added a warning message if 8 is used which should > discourage people from using 8. Or I could disallow 8. > Is that OK? Well the module seems to work just fine with compression on, so disallowing it or issuing a warning would reduce its value, IMHO. How about making compression a boolean value and then converting any true value to 8 ? -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 11 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From fdrake@acm.org Mon Dec 20 18:52:41 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Mon, 20 Dec 1999 13:52:41 -0500 (EST) Subject: [Python-Dev] posix module In-Reply-To: <036d01bf4889$30b63470$f29b12c2@secret.pythonware.com> References: <14423.61493.90107.433664@weyr.cnri.reston.va.us> <036d01bf4889$30b63470$f29b12c2@secret.pythonware.com> Message-ID: <14430.31481.402469.896400@weyr.cnri.reston.va.us> Fredrik Lundh writes: > (current CVS stuff, on Red Hat 5.2) Ok, Guido figured it out; this is a typo in the header /usr/include/confname.h; the enum and the #define don't have the same name. Do you know a way to detect the Linux kernel version using pre-preprocessor macros? (Seems very fragile.) Would it be reasonable to only add that table entry for kernel versions >= 2.2? -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From jim@interet.com Mon Dec 20 19:25:27 1999 From: jim@interet.com (James C. Ahlstrom) Date: Mon, 20 Dec 1999 14:25:27 -0500 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com> <385E40DA.37AD704F@lemburg.com> Message-ID: <385E82A7.72345807@interet.com> "M.-A. Lemburg" wrote: > I guess it would be ok to waste space. You could provide > a .cleanup() or .rewrite() method that takes care of > reorganizing the file to fill up the gaps. OK, adding a duplicate name replaces the old file. > Well the module seems to work just fine with compression > on, so disallowing it or issuing a warning would reduce its value, > IMHO. Yes compression works, but 90% of Python installations don't have zlib, so it is an ERROR to create archives with compression when these archives are distributed to other sites. > How about making compression a boolean value and then > converting any true value to 8 ? It would close the door to future or other compression methods. 
Currently the method must be 0 or 8 or a traceback will result. JimA From jim@interet.com Mon Dec 20 19:33:11 1999 From: jim@interet.com (James C. Ahlstrom) Date: Mon, 20 Dec 1999 14:33:11 -0500 Subject: [Python-Dev] Batteries Included? References: <19991220165032.B9E8B370CF2@snelboot.oratrix.nl> Message-ID: <385E8477.F727E0F8@interet.com> Jack Jansen wrote: > Archive files solves the problem for Python modules. But that leaves the > problem of dynamically loaded modules. And resources for dialogs and such, if > you use native GUI stuff on Mac or Windows. Point taken. For dynamically loaded modules, I believe in following the native system's DLL path, and not adding eccentric Python logic. But many disagreed a couple of weeks ago when I raised this. For resources, I think the archive file can accommodate this, although it seems highly system dependent. Anyway, any file at all can live in the archive and the import mechanism for *.pyc will not be damaged nor unduly slowed down by its presence. JimA From gstein@lyra.org Mon Dec 20 20:11:50 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 20 Dec 1999 12:11:50 -0800 (PST) Subject: [Python-Dev] zipfile.py In-Reply-To: <385E82A7.72345807@interet.com> Message-ID: On Mon, 20 Dec 1999, James C. Ahlstrom wrote: > "M.-A. Lemburg" wrote: > > I guess it would be ok to waste space. You could provide > > a .cleanup() or .rewrite() method that takes care of > > reorganizing the file to fill up the gaps. > > OK, adding a duplicate name replaces the old file. But it shouldn't print a warning(!). If an application wants to replace a file, then stuff shouldn't appear on stdout as a result. > > Well the module seems to work just fine with compression > > on, so disallowing it or issuing a warning would reduce its value, > > IMHO. > > Yes compression works, but 90% of Python installations don't have > zlib, so it is an ERROR to create archives with compression when > these archives are distributed to other sites.
While it may be a problem to distribute them to other sites, that is not up to the library. If I want compression, then I should get compression. A library module should not determine application-level policy. The warning that __init__ prints shouldn't be there. Really: there should not be a single "print" in the library (well, printdir() is fine... that's what it is supposed to do; printing in the test code would be fine). In normal, or even exceptional(!), operation there should never be a print. > > How about making compression a boolean value and then > > converting any true value to 8 ? > > It would close the door to future or other compression methods. > Currently the method must be 0 or 8 or a traceback will result. I definitely agree with JimA here. For example, maybe we want bzip compression in there. Sure, non-portable, but that's my problem :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From jim@interet.com Mon Dec 20 20:50:46 1999 From: jim@interet.com (James C. Ahlstrom) Date: Mon, 20 Dec 1999 15:50:46 -0500 Subject: [Python-Dev] zipfile.py References: Message-ID: <385E96A6.40CCF285@interet.com> Greg Stein wrote: > > On Mon, 20 Dec 1999, James C. Ahlstrom wrote: > > "M.-A. Lemburg" wrote: > But it shouldn't print a warning(!). If an application wants to replace a > file, then stuff shouldn't appear on stdout as a result. OK, no warning. > The warning that __init__ prints shouldn't be there. OK, it is gone. > Really: there should not be a single "print" in the library (well, No print unless _debug > 0 JimA From mal@lemburg.com Mon Dec 20 21:16:39 1999 From: mal@lemburg.com (M.-A.
Lemburg) Date: Mon, 20 Dec 1999 22:16:39 +0100 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com> <385E40DA.37AD704F@lemburg.com> <385E82A7.72345807@interet.com> Message-ID: <385E9CB7.5DE4848A@lemburg.com> "James C. Ahlstrom" wrote: > > "M.-A. Lemburg" wrote: > > > I guess it would be ok to waste space. You could provide > > a .cleanup() or .rewrite() method that takes care of > > reorganizing the file to fill up the gaps. > > OK, adding a duplicate name replaces the old file. Cool. > > Well the module seems to work just fine with compression > > on, so disallowing it or issuing a warning would reduce its value, > > IMHO. > > Yes compression works, but 90% of Python installations don't have > zlib, so it is an ERROR to create archives with compression when > these archives are distributed to other sites. Sure, for the sake of creating Python code archives, but your module is much more versatile: e.g. I could automatically create ZIP archives of log files or sets of other files and then have Python email them to someone who uses these archives through standard tools such as WinZip -- the target doesn't always have to be a Python process :-) > > How about making compression a boolean value and then > > converting any true value to 8 ? > > It would close the door to future or other compression methods. > Currently the method must be 0 or 8 or a traceback will result. Ok. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 11 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From jim@interet.com Mon Dec 20 21:37:20 1999 From: jim@interet.com (James C. 
Ahlstrom) Date: Mon, 20 Dec 1999 16:37:20 -0500 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com> <385E40DA.37AD704F@lemburg.com> <385E82A7.72345807@interet.com> <385E9CB7.5DE4848A@lemburg.com> Message-ID: <385EA190.6AF511BD@interet.com> "M.-A. Lemburg" wrote: > > Sure, for the sake of creating Python code archives, but > your module is much more versatile: e.g. I could automatically > create ZIP archives of log files or sets of other files and OK, zipfile.py no longer complains about compression != 0 JimA From fdrake@acm.org Tue Dec 21 22:42:26 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Tue, 21 Dec 1999 17:42:26 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <199912212238.RAA13660@eric.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> Message-ID: <14432.594.33416.600794@weyr.cnri.reston.va.us> Guido van Rossum writes: > + > + class GetoptError(Exception): > + opt = '' > + msg = '' > + def __init__(self, *args): > + self.args = args > + if len(args) == 1: > + self.msg = args[0] > + elif len(args) == 2: > + self.msg = args[0] > + self.opt = args[1] > + > + def __str__(self): > + return self.msg > > ! error = GetoptError # backward compatibility This breaks as soon as the standard exceptions are strings; does this mean -X will be removed in the next release? (Please????) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Tue Dec 21 22:44:46 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. 
Warsaw) Date: Tue, 21 Dec 1999 17:44:46 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> Message-ID: <14432.734.155183.508785@anthem.cnri.reston.va.us> >>>>> "Fred" == Fred L Drake, Jr writes: Fred> This breaks as soon as the standard exceptions are Fred> strings; does this mean -X will be removed in the next Fred> release? (Please????) Pretty please? :) From guido@CNRI.Reston.VA.US Tue Dec 21 23:05:28 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 21 Dec 1999 18:05:28 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Your message of "Tue, 21 Dec 1999 17:42:26 EST." <14432.594.33416.600794@weyr.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> Message-ID: <199912212305.SAA13722@eric.cnri.reston.va.us> > Guido van Rossum writes: > > + > > + class GetoptError(Exception): > > + opt = '' > > + msg = '' > > + def __init__(self, *args): > > + self.args = args > > + if len(args) == 1: > > + self.msg = args[0] > > + elif len(args) == 2: > > + self.msg = args[0] > > + self.opt = args[1] > > + > > + def __str__(self): > > + return self.msg > > > > ! error = GetoptError # backward compatibility [Fred Drake] > This breaks as soon as the standard exceptions are strings; does > this mean -X will be removed in the next release? (Please????) Not a bad idea. Anybody got a reason why -X should stay? (The next step would be to outlaw raise with a string argument; I think I can't make that for 1.6. But it would be a good idea to scan the standard library for string exceptions and convert all of them.) --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Tue Dec 21 23:21:38 1999 From: bwarsaw@cnri.reston.va.us (Barry A. 
Warsaw) (Barry A. Warsaw) Date: Tue, 21 Dec 1999 18:21:38 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> Message-ID: <14432.2946.857539.898577@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> Anybody got a reason why -X should stay? Kill it. Guido> (The next step would be to outlaw raise with a string Guido> argument; I think I can't make that for 1.6. But it would Guido> be a good idea to scan the standard library for string Guido> exceptions and convert all of them.) Or require that exception classes be derived from exceptions.Exception :) -Barry From guido@CNRI.Reston.VA.US Tue Dec 21 23:23:29 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 21 Dec 1999 18:23:29 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Your message of "Tue, 21 Dec 1999 18:21:38 EST." <14432.2946.857539.898577@anthem.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> Message-ID: <199912212323.SAA13803@eric.cnri.reston.va.us> [Barry] > Guido> Anybody got a reason why -X should stay? > > Kill it. You already said that. Anybody else? > Guido> (The next step would be to outlaw raise with a string > Guido> argument; I think I can't make that for 1.6. But it would > Guido> be a good idea to scan the standard library for string > Guido> exceptions and convert all of them.) > > Or require that exception classes be derived from exceptions.Exception > :) That's hard to require. But it could easily be a requirement checked by one of the hypothetical typecheckers that are being discussed in the types-sig. 
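As a rough illustration of what the class-based GetoptError buys callers, here is a short sketch against the getopt module in current Python releases (not the 1.7/1.8 revisions in the checkin above). The function name parse and the option spec "ab:" are made up for the example.

```python
import getopt


def parse(argv):
    """Parse argv against the option spec "ab:".

    Returns getopt's (options, remaining_args) pair on success, or an
    ("error", opt, msg) tuple describing the failure.
    """
    try:
        return getopt.getopt(argv, "ab:")
    except getopt.GetoptError as exc:
        # A class-based exception carries structured data: the offending
        # option (opt) and a human-readable message (msg). A string
        # exception could not offer these attributes.
        return ("error", exc.opt, exc.msg)


print(parse(["-a", "-b", "1", "rest"]))
print(parse(["-x"])[:2])

# The old module-level name is kept as an alias for backward compatibility.
print(getopt.error is getopt.GetoptError)
```

Because GetoptError derives from Exception, handlers written as `except Exception` also catch it, which is the inheritance convention being argued for in this thread.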
--Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Tue Dec 21 23:27:31 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Tue, 21 Dec 1999 18:27:31 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> Message-ID: <14432.3299.404561.698836@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: BAW> Or require that exception classes be derived from BAW> exceptions.Exception :) Guido> That's hard to require. But it could easily be a Guido> requirement checked by one of the hypothetical typecheckers Guido> that are being discussed in the types-sig. Hmm, the raise could probably enforce this, but it might not be that useful. -Barry From guido@CNRI.Reston.VA.US Tue Dec 21 23:40:22 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 21 Dec 1999 18:40:22 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Your message of "Tue, 21 Dec 1999 18:27:31 EST." <14432.3299.404561.698836@anthem.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> Message-ID: <199912212340.SAA13851@eric.cnri.reston.va.us> > >>>>> "Guido" == Guido van Rossum writes: > > BAW> Or require that exception classes be derived from > BAW> exceptions.Exception :) > > Guido> That's hard to require. 
But it could easily be a > Guido> requirement checked by one of the hypothetical typecheckers > Guido> that are being discussed in the types-sig. > > Hmm, the raise could probably enforce this, but it might not be that > useful. > > -Barry The raise could easily enforce this, but it would break lots of existing code. I wish I had done it right from the start -- then exceptions would have been classes from the start and would have required inheritance from the Exception base class. Like in Java. (And in C++?) --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw@python.org Tue Dec 21 23:43:59 1999 From: bwarsaw@python.org (Barry A. Warsaw) Date: Tue, 21 Dec 1999 18:43:59 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> <199912212340.SAA13851@eric.cnri.reston.va.us> Message-ID: <14432.4287.543786.308468@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> The raise could easily enforce this, but it would break Guido> lots of existing code. Maybe not (I'm not sure). All the standard exceptions inherit from Exception, and of course there'd be nothing to enforce for existing user-defined string based exceptions. How pervasive are user-defined class based exceptions that don't inherit from Exception? (I don't know, and I haven't grepped, but I think we've been making that recommendation from day 1 of class-based standard exceptions, and I try to follow this recommendation in my own code). Guido> I wish I had done it right from the start -- then Guido> exceptions would have been classes from the start and would Guido> have required inheritance from the Exception base class. 
Guido> Like in Java. (And in C++?) All Hail, Python 2.0, our Savior and Redeemer! :) -Barry From guido@CNRI.Reston.VA.US Tue Dec 21 23:49:09 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 21 Dec 1999 18:49:09 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Your message of "Tue, 21 Dec 1999 18:43:59 EST." <14432.4287.543786.308468@anthem.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> <199912212340.SAA13851@eric.cnri.reston.va.us> <14432.4287.543786.308468@anthem.cnri.reston.va.us> Message-ID: <199912212349.SAA13892@eric.cnri.reston.va.us> > From: "Barry A. Warsaw" > >>>>> "Guido" == Guido van Rossum writes: > > Guido> The raise could easily enforce this, but it would break > Guido> lots of existing code. > > Maybe not (I'm not sure). All the standard exceptions inherit from > Exception, and of course there'd be nothing to enforce for existing > user-defined string based exceptions. How pervasive are user-defined > class based exceptions that don't inherit from Exception? (I don't > know, and I haven't grepped, but I think we've been making that > recommendation from day 1 of class-based standard exceptions, and I > try to follow this recommendation in my own code). Yes, but class-based user exceptions existed many Python versions before class-based standard exceptions! Two examples in the standard library: ConfigParser.py and xdrlib.py. > All Hail, Python 2.0, our Savior and Redeemer! :) Or, the perfect excuse for procrastination :) (But yes, 2.0 will enforce this.) 
--Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Tue Dec 21 23:53:50 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 15:53:50 -0800 (PST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <199912212305.SAA13722@eric.cnri.reston.va.us> Message-ID: On Tue, 21 Dec 1999, Guido van Rossum wrote: >... > [Fred Drake] > > This breaks as soon as the standard exceptions are strings; does > > this mean -X will be removed in the next release? (Please????) > > Not a bad idea. > > Anybody got a reason why -X should stay? Kill it. > (The next step would be to outlaw raise with a string argument; I > think I can't make that for 1.6. But it would be a good idea to scan > the standard library for string exceptions and convert all of them.) Keep string exceptions. I think there is probably a lot of code that still uses them. I know I do :-) We can issue warnings about string exceptions via the type-checking tool. Cheers, -g -- Greg Stein, http://www.lyra.org/ From bwarsaw@python.org Tue Dec 21 23:54:04 1999 From: bwarsaw@python.org (Barry A. Warsaw) Date: Tue, 21 Dec 1999 18:54:04 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> <199912212340.SAA13851@eric.cnri.reston.va.us> <14432.4287.543786.308468@anthem.cnri.reston.va.us> <199912212349.SAA13892@eric.cnri.reston.va.us> Message-ID: <14432.4892.908107.421149@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> Yes, but class-based user exceptions existed many Python Guido> versions before class-based standard exceptions!
True, but I suspect that legacy class-based user exceptions are rare. I might be wrong, but you're absolutely right that these would all be broken. Guido> Two examples in the standard library: ConfigParser.py and Guido> xdrlib.py. Fortunately these are fixed with two 11 character patches :) I'm not necessarily arguing for or against tightening this. -Barry From gmcm@hypernet.com Tue Dec 21 23:55:07 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Tue, 21 Dec 1999 18:55:07 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <199912212340.SAA13851@eric.cnri.reston.va.us> References: Your message of "Tue, 21 Dec 1999 18:27:31 EST." <14432.3299.404561.698836@anthem.cnri.reston.va.us> Message-ID: <1266302877-22249299@hypernet.com> [Guido] > I wish I had done it right from the start -- then exceptions > would have been classes from the start and would have required > inheritance from the Exception base class. Like in Java. (And > in C++?) In C++ you can throw anything at all. Strings, ints, that Warsaw blockhead... off-topic-ly y'rs - Gordon From tismer@appliedbiometrics.com Wed Dec 22 00:57:27 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Wed, 22 Dec 1999 01:57:27 +0100 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> Message-ID: <386021F7.4F94C458@appliedbiometrics.com> Guido van Rossum wrote: > > [Barry] > > Guido> Anybody got a reason why -X should stay? > > > > Kill it. > > You already said that. > > Anybody else? I'd say kill -X, but keep allowing string exceptions if it doesn't cost too much. I think of C++, like Gordon said. 
Also I'd take the chance and move the exceptions Python module back into the core, as a frozen module or whatever. Reason: At the moment, the CVS version of the Python library is incompatible to 1.5.2, which makes testing against the standard dist quite inconvenient. A compiled CVS Python does not run under PythonWin when I put it into my standard installation. Or is there an easy way to switch all settings to a completely different path? Anyway, I'm most probably off until Y2K. See ya all then, provided we survive - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From guido@CNRI.Reston.VA.US Wed Dec 22 01:01:16 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 21 Dec 1999 20:01:16 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Your message of "Wed, 22 Dec 1999 01:57:27 +0100." <386021F7.4F94C458@appliedbiometrics.com> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <386021F7.4F94C458@appliedbiometrics.com> Message-ID: <199912220101.UAA14109@eric.cnri.reston.va.us> > I'd say kill -X, but keep allowing string exceptions if > it doesn't cost too much. I think of C++, like Gordon said. Agreed. > Also I'd take the chance and move the exceptions Python > module back into the core, as a frozen module or whatever. > > Reason: At the moment, the CVS version of the Python library > is incompatible to 1.5.2, which makes testing against the > standard dist quite inconvenient.
A compiled CVS Python > does not run under PythonWin when I put it into my standard > installation. Or is there an easy way to switch all settings > to a completely different path? Point the PYTHONHOME variable to the top of your install directory. (On Windows you may have to kill the registry settings -- this is a bug.) > Anyway, I'm most probably off until Y2K. Ditto. > See ya all then, provided we survive - chris Best wishes to all, --Guido van Rossum (home page: http://www.python.org/~guido/) From jim@digicool.com Wed Dec 22 13:54:41 1999 From: jim@digicool.com (Jim Fulton) Date: Wed, 22 Dec 1999 08:54:41 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> Message-ID: <3860D821.576B3146@digicool.com> Guido van Rossum wrote: > > (The next step would be to outlaw raise with a string argument; I > think I can't make that for 1.6. But it would be a good idea to scan > the standard library for string exceptions and convert all of them.) This would be waaaaay too big a change for Python 1.x. There are a lot of Python modules outside the standard distribution that use string exceptions. This would be a huge backward incompatibility. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From fdrake@acm.org Wed Dec 22 14:23:29 1999 From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 22 Dec 1999 09:23:29 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <199912212305.SAA13722@eric.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> Message-ID: <14432.57057.535205.558@weyr.cnri.reston.va.us> Guido van Rossum writes: > (The next step would be to outlaw raise with a string argument; I > think I can't make that for 1.6. But it would be a good idea to scan > the standard library for string exceptions and convert all of them.) I don't know if requiring class-based exceptions will make the runtime any simpler, but that seems the only reason to do it. The only reason to remove -X, and possibly the string exception fallback code, is to ensure that we *can* subclass Exception and friends without having to catch TypeError and do something different. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From fdrake@acm.org Wed Dec 22 14:25:33 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 22 Dec 1999 09:25:33 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <14432.2946.857539.898577@anthem.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> Message-ID: <14432.57181.944364.427093@weyr.cnri.reston.va.us> Barry A. Warsaw writes: > Or require that exception classes be derived from exceptions.Exception > :) Ok, it's early, and maybe I haven't had enough coffee(!). But is this serious? Does JPython gain some benefit from this, is it your preference, or are you just yanking on my leg? ("Pulling my arm" as my 5-year-old says!) -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives From guido@CNRI.Reston.VA.US Wed Dec 22 14:40:39 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 22 Dec 1999 09:40:39 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Your message of "Wed, 22 Dec 1999 09:23:29 EST." <14432.57057.535205.558@weyr.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.57057.535205.558@weyr.cnri.reston.va.us> Message-ID: <199912221440.JAA16198@eric.cnri.reston.va.us> > From: "Fred L. Drake, Jr." > > Guido van Rossum writes: > > (The next step would be to outlaw raise with a string argument; I > > think I can't make that for 1.6. But it would be a good idea to scan > > the standard library for string exceptions and convert all of them.) > > I don't know if requiring class-based exceptions will make the > runtime any simpler, but that seems the only reason to do it. Do what? *Require* class exceptions? You're probably right, and I think the gain is minimal. There's another reason to scan the std library though -- not to set a bad example. I want to eventually (in 2.0) move to a class-derived-from-Exception-only scheme. > The only reason to remove -X, and possibly the string exception > fallback code, is to ensure that we *can* subclass Exception and > friends without having to catch TypeError and do something different. And that's a very good reason indeed. Let me repeat my plans for 1.6. - Remove -X; the standard exceptions are always class-based. - Change all standard library and other example code to use class-based exceptions with a standard exception as base class, to set an example. - Still allow string exceptions in user code. - Still allow class exceptions that don't use a standard exception base class in user code. 
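[Illustration, not part of the original message: a minimal sketch of the second plan item, written in present-day Python 3 syntax rather than 1.5.2. The class follows the GetoptError patch quoted earlier in this thread; the parse() helper and its message text are hypothetical.]

```python
class GetoptError(Exception):
    """Class-based exception with a standard exception as its base class."""
    opt = ''
    msg = ''

    def __init__(self, *args):
        self.args = args
        if len(args) == 1:
            self.msg = args[0]
        elif len(args) == 2:
            self.msg = args[0]
            self.opt = args[1]

    def __str__(self):
        return self.msg


# Old code that says "except getopt.error" keeps working through this alias.
error = GetoptError


def parse(option):
    """Hypothetical helper that always rejects the option it is given."""
    raise GetoptError("option %s not recognized" % option, option)


def describe_failure(option):
    """Catch the exception via the legacy alias and return (message, option)."""
    try:
        parse(option)
    except error as exc:
        return str(exc), exc.opt


print(describe_failure("-z"))
```

Because the alias and the class are the same object, a handler written against either name catches the exception, and because the class derives from Exception, it can itself be subclassed without special cases.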
--Guido van Rossum (home page: http://www.python.org/~guido/) From Vladimir.Marangozov@inrialpes.fr Wed Dec 22 18:09:47 1999 From: Vladimir.Marangozov@inrialpes.fr (Vladimir Marangozov) Date: Wed, 22 Dec 1999 19:09:47 +0100 (CET) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <199912221440.JAA16198@eric.cnri.reston.va.us> from "Guido van Rossum" at Dec 22, 1999 09:40:39 AM Message-ID: <199912221809.TAA25322@python.inrialpes.fr> Guido van Rossum wrote: > > [Fred Drake] > > I don't know if requiring class-based exceptions will make the > > runtime any simpler, but that seems the only reason to do it. > > Do what? *Require* class exceptions? You're probably right, and I > think the gain is minimal. Yes. Besides, I still think that string-based exceptions are just convenient for quick & dirty, throw-away test scripts. > > Let me repeat my plans for 1.6. > > - Remove -X; the standard exceptions are always class-based. > > - Change all standard library and other example code to use > class-based exceptions with a standard exception as base class, to set > an example. > > - Still allow string exceptions in user code. > > - Still allow class exceptions that don't use a standard exception > base class in user code. Sounds okay. --- PS: I'm particularly happy today :-) because I've finally published the new version of our Web site http://www.inrialpes.fr. Two things I'd like to mention: (1) it shouldn't have been possible without quick Python scripts ;) (2) I'll find the time to reinvoke some of the topics discussed here instead of being mute as a fish. That said, Merry Christmas and a Happy New Year to all of you! 
-- Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From guido@CNRI.Reston.VA.US Wed Dec 22 18:23:45 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 22 Dec 1999 13:23:45 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Your message of "Wed, 22 Dec 1999 19:09:47 +0100." <199912221809.TAA25322@python.inrialpes.fr> References: <199912221809.TAA25322@python.inrialpes.fr> Message-ID: <199912221823.NAA16517@eric.cnri.reston.va.us> Vladimir.Marangozov@inrialpes.fr: > Yes. Besides, I still think that string-based exceptions are just > convenient for quick & dirty, throw-away test scripts. They have a hard-to-understand quirk though: the id() of the string is used to check rather than its value, so that except "foo" doesn't necessarily catch raise "foo"; but due to various optimizations, this usually works, and people get bent out of shape when it doesn't. Since you have to give your exception a name, how hard is it to say class MyError(Exception): pass rather than MyError = "MyError" ? --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Wed Dec 22 18:33:19 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 10:33:19 -0800 (PST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <199912221823.NAA16517@eric.cnri.reston.va.us> Message-ID: On Wed, 22 Dec 1999, Guido van Rossum wrote: > Vladimir.Marangozov@inrialpes.fr: > > Yes. Besides, I still think that string-based exceptions are just > > convenient for quick & dirty, throw-away test scripts. > > They have a hard-to-understand quirk though: the id() of the string is > used to check rather than its value, so that except "foo" doesn't > necessarily catch raise "foo"; but due to various optimizations, this > usually works, and people get bent out of shape when it doesn't.
> Since you have to give your exception a name, how hard is it to say > > class MyError(Exception): pass > > rather than > > MyError = "MyError" > > ? It is very hard. My fingers do the typing for me, and they fill in strings. I'm trying to teach them otherwise, but they insist. You're also assuming that MyError gets defined. Sometimes, my little fingers like typing: try: foo except: raise "foo broke for some reason" Quick and dirty, indeed! :-) Happy Holidays, -g -- Greg Stein, http://www.lyra.org/ From fdrake@acm.org Wed Dec 22 19:59:55 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 22 Dec 1999 14:59:55 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <199912212340.SAA13851@eric.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> <199912212340.SAA13851@eric.cnri.reston.va.us> Message-ID: <14433.11707.607533.698901@weyr.cnri.reston.va.us> Guido van Rossum writes: > I wish I had done it right from the start -- then exceptions would > have been classes from the start and would have required inheritance > from the Exception base class. Like in Java. (And in C++?) I've seen this said or hinted at in a couple of places (the specific requirement that exception derive from Exception), but I've seen nothing that indicates any reason or derived value for this. Could someone please clarify? -Fred -- Fred L. Drake, Jr.
Corporation for National Research Initiatives From guido@CNRI.Reston.VA.US Wed Dec 22 20:05:52 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 22 Dec 1999 15:05:52 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Your message of "Wed, 22 Dec 1999 14:59:55 EST." <14433.11707.607533.698901@weyr.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> <199912212340.SAA13851@eric.cnri.reston.va.us> <14433.11707.607533.698901@weyr.cnri.reston.va.us> Message-ID: <199912222005.PAA17291@eric.cnri.reston.va.us> > From: "Fred L. Drake, Jr." > Guido van Rossum writes: > > I wish I had done it right from the start -- then exceptions would > > have been classes from the start and would have required inheritance > > from the Exception base class. Like in Java. (And in C++?) > > I've seen this said or hinted at in a couple of places (the specific > requirement that exception derive from Exception), but I've seen > nothing that indicates any reason or derived value for this. Could > someone please clarify? It's simply an extra bit of checking that your program is reasonable -- if you accidentally raise a non-exception class, there's probably something wrong with your program, and it gives the reader a hint about the intended use of the class. Other languages (e.g. Modula-3) have a specific exception type that can be used only for that one purpose. However it's useful to allow methods and subclassing of exceptions, so they might as well be classes. So, all exceptions are classes. But not all classes are exceptions.
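[Illustration, not part of the original message: a minimal sketch of the check described above, as it behaves in present-day Python 3, where raise accepts only classes or instances deriving from BaseException. The class names are hypothetical.]

```python
class ConfigError(Exception):
    """A proper exception: it derives from Exception."""


class NotAnException:
    """An ordinary class that was never meant to be raised."""


def try_raise(obj):
    """Raise obj and return the name of the exception type that surfaces."""
    try:
        raise obj
    except BaseException as exc:
        return type(exc).__name__


print(try_raise(ConfigError("bad option")))  # ConfigError
print(try_raise(NotAnException))             # TypeError
print(try_raise("a string exception"))       # TypeError
```

In this sketch the accidental raise of a non-exception class, and the old-style string raise, both surface as TypeError instead of passing through silently, which is the "extra bit of checking" the message refers to.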
--Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Wed Dec 22 20:11:43 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 12:11:43 -0800 (PST) Subject: [Python-Dev] Please test new dynamic load behavior Message-ID: Hi all, I reorganized Python's dynamic load/import code over the past few days. Guido provided some feedback, I did some more mods, and now it is checked into CVS. The new loading behavior has been tested on Linux, IRIX, and Solaris (and probably Windows by now). For people with CVS access, I'd like to ask that you grab an updated copy and shake out the new code. There have been updates to the "configure" process, so you'll need to run configure again. Make sure that you alter your Modules/Setup to build some shared modules, and then try it out. Here are some of the platforms that I believe need specific testing: - NetBSD, FreeBSD, OpenBSD, ... - AIX - HP/UX - BeOS - NeXT - Mac - OS/2 - Win16 I believe it should work for most people, but we may be looking for the wrong "init" symbol on some platforms. We might even be selecting the wrong import mechanism (or missing it altogether!) on some platforms. If you get a chance to test this, then please drop me a note with your platform and whether it succeeded or failed (and how it failed). Thanx! -g p.s. you can tell if dynamic loading is missing by watching for DYNLOADFILE in the configure process and seeing if it used dynload_stub. alternatively, you can import the "imp" module and see if "load_dynamic" is missing. -- Greg Stein, http://www.lyra.org/ From gvwilson@nevex.com Thu Dec 23 03:43:40 1999 From: gvwilson@nevex.com (gvwilson@nevex.com) Date: Wed, 22 Dec 1999 22:43:40 -0500 (EST) Subject: [Python-Dev] re: Open Source design competition / Python / software tools Message-ID: This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools.
Send mail to mime@docserver.cac.washington.edu for more info. --168427786-691315853-945920620=:4839 Content-Type: TEXT/PLAIN; charset=US-ASCII Hi, folks. I hope you don't mind another mail out of the blue, but I got notice on Saturday that the Department of Energy is giving me $860K over two years to support development of easier-to-use software engineering tools. All of the work will be Open Source, and will be done in Python, with a strong emphasis on design, testing, and documentation. The project's long-term objective is to encourage scientists and engineers to treat programs in the same way as they do other experiments, i.e. to calibrate, test, peer review, and so on. To kick-start things, we're going to be holding a two-round design competition. Anyone (individual or team, professional or student) can submit a short entry for the first round; the judges will pick four candidates to go forward in each of four categories, and those individuals or teams will be asked to submit full entries. The four categories are: * an issue tracking system to replace Gnats and Bugzilla; * a build system to replace make; * a platform inspection and configuration system to replace autoconf; and * a testing framework to replace XUnit, Expect, and DejaGnu. Would you be interested in participating in any way---judging, entering a design, critiquing things from the pointer of view of end users, or anything else? I realize that you're probably up past your eyeballs with work, and that the money on offer is nothing special, but I think this could be a lot of fun, and could help to shift the emphasis of the Open Source community from hacking to design (both by drawing attention to, and rewarding, design, and by creating a corpus of examples and commentary for programmers to refer to). It could also make life a lot easier for computational scientists and engineers... 
Please let me know if you'd like to be involved, or if you'd like more information than is contained in the FAQ (attached). Timescales are a bit tight---I'd like to be able to make an announcement on January 14---but I'll be reading email at this address several times a day during the holiday. I look forward to hearing from you, Greg Wilson p.s. please note that the attached FAQ is a first draft; I'd be grateful if you could show it to anyone you think might be interested, but I'd also be grateful if you wouldn't broadcast it until it's gone through one more editing pass. --168427786-691315853-945920620=:4839 Content-Type: TEXT/PLAIN; charset=X-UNKNOWN; name="faq.html" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: Content-Disposition: attachment; filename="faq.html" PEhUTUw+DQo8SEVBRD4NCjxUSVRMRT5Tb2Z0d2FyZSBDYXJwZW50cnkgRkFR PC9USVRMRT4NCjwvSEVBRD4NCjxCT0RZPg0KDQo8SDEgQUxJR049IkNFTlRF UiI+U29mdHdhcmUgQ2FycGVudHJ5IEZBUTwvSDE+DQoNCg0KPEgyPkdlbmVy YWwgaW5mb3JtYXRpb248L0gyPg0KDQo8T0w+DQoNCjxMST48RU0+V2hhdCBp cyB0aGUgU29mdHdhcmUgQ2FycGVudHJ5IHByb2plY3Q/IDwvRU0+DQo8QlI+ DQpUaGUgYWltIG9mIHRoZSBTb2Z0d2FyZSBDYXJwZW50cnkgcHJvamVjdCBp cyB0byBtYWtlIGl0IGVhc2llciBmb3INCnByb2dyYW1tZXJzIGluIGdlbmVy YWwsIGFuZCBzY2llbnRpZmljIHByb2dyYW1tZXJzIGluIHBhcnRpY3VsYXIs IHRvDQphZG9wdCBiZXR0ZXIgc29mdHdhcmUgZGV2ZWxvcG1lbnQgcHJhY3Rp Y2VzLiBUaGUgcHJvamVjdCB3aWxsIGFjaGlldmUNCnRoaXMgYnkgY3JlYXRp bmcgdG9vbHMgdGhhdCBhcmUgZWFzaWVyIHRvIGxlYXJuIGFuZCB1c2UsIGFu ZCBieQ0KZG9jdW1lbnRpbmcgdGhvc2UgdG9vbHMgYW5kIHRoZSBwcmFjdGlj ZXMgdGhleSBlbWJvZHkuDQo8L0xJPg0KDQo8TEk+PEVNPldoZXJlIGRvZXMg dGhlIG5hbWUgY29tZSBmcm9tPzwvRU0+DQo8QlI+DQpUaGUgbmFtZSBpcyBh IHBsYXkgb24gInNvZnR3YXJlIGVuZ2luZWVyaW5nIiwgYW5kIGlzIG1lYW50 IHRvIGluZGljYXRlDQp0aGF0IHRoaXMgcHJvamVjdCBpcyBpbml0aWFsbHkg Y29uY2VybmVkIHdpdGggbWVkaXVtLXNpemVkIHRlYW1zICh1cA0KdG8gYSBk b3plbiBvciB0d28gcHJvZ3JhbW1lcnMpIGFuZCBtZWRpdW0tdGVybSB0aW1l c2NhbGVzIChhIHllYXIgb3INCnR3bykuDQo8L0xJPg0KDQo8TEk+PEVNPkhv 
--168427786-691315853-945920620=:4839-- From jack@oratrix.nl Thu Dec 23 10:24:26 1999 From: jack@oratrix.nl (Jack Jansen) Date: Thu, 23 Dec 1999 11:24:26 +0100 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Message by Guido van Rossum , Wed, 22 Dec 1999 13:23:45 -0500 , <199912221823.NAA16517@eric.cnri.reston.va.us> Message-ID: <19991223102426.CCB75370CF2@snelboot.oratrix.nl> > Vladimir.Marangozov@inrialpes.fr: > > > Yes. Besides, I still think that string-based exceptions are just > > convenient for quick & dirty, throw-away test scripts. > > They have a hard-to-understand quirk though: the id() of the string is > used to check rather than its value, so that except "foo" doesn't > necessarily catch raise "foo"; but due to various optimization, this > usually works, and people get bent out of shape when it doesn't. I sort-of use this feature when I'm debugging: if I want to know what happens in an exception that is usually caught somewhere higher up in the call stack I simply put quotes around the exception name and the exception will happen uncaught. The same trick works for except: clauses. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From harri.pasanen@trema.com Thu Dec 23 11:44:04 1999 From: harri.pasanen@trema.com (Harri Pasanen) Date: Thu, 23 Dec 1999 13:44:04 +0200 Subject: [Python-Dev] Re: [PSA MEMBERS] Please test new dynamic load behavior References: Message-ID: <38620B04.7CC64485@trema.com> Greg Stein wrote: > > Hi all, > > I reorganized Python's dynamic load/import code over the past few days. > Gudio provided some feedback, I did some more mods, and now it is checked > into CVS.
The new loading behavior has been tested on Linux, IRIX, and > Solaris (and probably Windows by now). > ... What was the motivation behind this modification? Just curious, -Harri From Vladimir.Marangozov@inrialpes.fr Thu Dec 23 12:12:40 1999 From: Vladimir.Marangozov@inrialpes.fr (Vladimir Marangozov) Date: Thu, 23 Dec 1999 13:12:40 +0100 (CET) Subject: [Python-Dev] Please test new dynamic load behavior In-Reply-To: from "Greg Stein" at Dec 22, 1999 12:11:43 PM Message-ID: <199912231212.NAA26572@python.inrialpes.fr> Greg Stein wrote: > > Hi all, > > I reorganized Python's dynamic load/import code over the past few days. > Guido provided some feedback, I did some more mods, and now it is checked > into CVS. The new loading behavior has been tested on Linux, IRIX, and > Solaris (and probably Windows by now). > Great work, Greg! > Here are some of the platforms that I believe need specific testing: > > - NetBSD, FreeBSD, OpenBSD, ... > - AIX > - HP/UX > - BeOS > - NeXT > - Mac > - OS/2 > - Win16 AFAICT, the AIX version works perfectly okay. -- Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From jim@digicool.com Thu Dec 23 14:41:23 1999 From: jim@digicool.com (Jim Fulton) Date: Thu, 23 Dec 1999 09:41:23 -0500 Subject: [Python-Dev] str(1L) -> '1' ? Message-ID: <38623493.E6BA6D6F@digicool.com> In November there was an interesting discussion on comp.lang.python about the meaning of __str__ and __repr__. One tidbit that came out of this discussion was that __str__ for longs should drop the trailing 'L'. Was there a decision on this? I'd really like this to happen. We do a lot of work with RDBMS systems and long integers seem to come up a lot with these systems (as do fixed-decimal numbers, but that's another topic ;). For example, our latest Sybase and Oracle support in Zope returns long integers for RDBMS types like NUMBER(10,0).
The trailing 'L' in the string representation is causing us some headaches. This also seems to be an issue when using the current standard ODBC interface with Oracle, as indicated in a DB-SIG post today. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From guido@CNRI.Reston.VA.US Thu Dec 23 14:46:58 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 23 Dec 1999 09:46:58 -0500 Subject: [Python-Dev] str(1L) -> '1' ? In-Reply-To: Your message of "Thu, 23 Dec 1999 09:41:23 EST." <38623493.E6BA6D6F@digicool.com> References: <38623493.E6BA6D6F@digicool.com> Message-ID: <199912231446.JAA22086@eric.cnri.reston.va.us> [Jim F] > In November there was an interesting discussion on comp.lang.python > about the meaning of __str__ and __repr__. One tidbit that came out > of this discussion was that __str__ for longs should drop the trailing > 'L'. Was there a decision on this? I'd really like this to happen. Yes, I'd like it to happen. I'd also like repr() of a float to return the full precision (using the "%.17g" sprintf format). I haven't done it for lack of time -- feel free to send a patch (don't forget the disclaimer from http://www.python.org/1.5/bugrelease.html). We haven't decided yet what to do with the greater topic of that discussion (or was it a different one?) -- whether the values printed by typing a bare expression in interactive mode should use str(), repr(), or str-special-casing-the-snot-out-of-strings().
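[Editorial illustration, not part of the original message.] The "%.17g" idea mentioned above can be sketched in a few lines. This assumes IEEE 754 double-precision floats; the helper name full_precision_repr is hypothetical and only stands in for whatever a patch to repr() might do.

```python
def full_precision_repr(x):
    # Hypothetical helper: 17 significant digits are enough for the text
    # to convert back to exactly the same IEEE 754 double.
    return "%.17g" % x


value = 0.1
text = full_precision_repr(value)
print(text)                  # shows the digits a shorter format would hide
print(float(text) == value)  # the text round-trips to the same float
print("%.12g" % value)       # a shorter format hides the representation error
```

The trade-off is visible here: the 17-digit form is exact enough to round-trip but looks noisy, while a shorter format reads better and loses information.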
--Guido van Rossum (home page: http://www.python.org/~guido/) From jim@digicool.com Thu Dec 23 14:51:14 1999 From: jim@digicool.com (Jim Fulton) Date: Thu, 23 Dec 1999 09:51:14 -0500 Subject: [Python-Dev] Fixed-decimal types Message-ID: <386236E2.F97109D3@digicool.com> While on the subject of RDBMS systems, a common need is to be able to work with fixed-decimal data. I think a standard Python fixed-decimal type would help to make Python database interfaces a lot more robust. I even wonder if the Python long type might be hijacked for this purpose by adding a "scale" that indicates the number of digits to the right of the decimal point. For example, an expression like: 1000000000.2500L would create a fixed-decimal number with a scale of 4. People have built Python classes for fixed-decimal types, but when working with RDBMS data, one often deals with lots of data and efficiency matters. I also suspect that adding scale to longs wouldn't be that hard and would be a fairly natural extension. In any case, a "standard" (being in the standard library would be sufficient) fixed-decimal type would probably lead to better database interfaces that handle fixed-decimal data properly (or at least more properly). Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From guido@CNRI.Reston.VA.US Thu Dec 23 14:56:33 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 23 Dec 1999 09:56:33 -0500 Subject: [Python-Dev] Fixed-decimal types In-Reply-To: Your message of "Thu, 23 Dec 1999 09:51:14 EST."
<386236E2.F97109D3@digicool.com> References: <386236E2.F97109D3@digicool.com> Message-ID: <199912231456.JAA22134@eric.cnri.reston.va.us> What would be the scale of the product of two fixed-decimal numbers? E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to 4.00L? There are arguments for either. Same question for division (harder, I think). I like the idea of using the dd.ddL notation for this. I have no time to implement it but would not be unwilling to accept patches. They would have to be accompanied by a wet signature; see http://www.python.org/1.5/wetsign.html. --Guido van Rossum (home page: http://www.python.org/~guido/) From jim@digicool.com Thu Dec 23 15:00:25 1999 From: jim@digicool.com (Jim Fulton) Date: Thu, 23 Dec 1999 10:00:25 -0500 Subject: [Python-Dev] re: Open Source design competition / Python / software tools References: Message-ID: <38623909.CDF41014@digicool.com> gvwilson@nevex.com wrote: > > Hi, folks. I hope you don't mind another mail out of the blue, but I got > notice on Saturday that the Department of Energy is giving me $860K over > two years to support development of easier-to-use software engineering > tools. All of the work will be Open Source, and will be done in Python, > with a strong emphasis on design, testing, and documentation. The > project's long-term objective is to encourage scientists and engineers to > treat programs in the same way as they do other experiments, i.e. to > calibrate, test, peer review, and so on. > > To kick-start things, we're going to be holding a two-round design > competition. Anyone (individual or team, professional or student) can > submit a short entry for the first round; the judges will pick four > candidates to go forward in each of four categories, and those > individuals or teams will be asked to submit full entries.
The four > categories are: > > * an issue tracking system to replace Gnats and Bugzilla; > > * a build system to replace make; > > * a platform inspection and configuration system to replace autoconf; > and > > * a testing framework to replace XUnit, Expect, and DejaGnu. > > Would you be interested in participating in any way Are these categories fixed? I see a very strong need for an open-source UML modeling tool. UML is extremely powerful, but current UML tools largely suck and are very expensive. We are contemplating launching an open-source development effort to build UML modeling tools using Zope or the Zope object database as a repository. A contest like this could help to kick-start this effort, but tools to automate requirements and design seem to be missing. This is odd, considering that up-front activities like requirements and design have the largest impact on software-engineering project success. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From andy@robanal.demon.co.uk Thu Dec 23 15:13:22 1999 From: andy@robanal.demon.co.uk (=?iso-8859-1?q?Andy=20Robinson?=) Date: Thu, 23 Dec 1999 07:13:22 -0800 (PST) Subject: [Python-Dev] Fixed-decimal types Message-ID: <19991223151322.5698.qmail@web604.mail.yahoo.com> --- Guido van Rossum wrote: > What would be scale of the product of two > fixed-decimal numbers? > E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to > 4.00L? There are > arguments for either. Same question for division > (harder, I think). 
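[Editorial illustration, not part of the original message.] The two candidate answers to the scale question quoted above can be sketched with the decimal module, which was added to the standard library later (Python 2.4) and is not the dd.ddL notation proposed in this thread. The choice of ROUND_HALF_EVEN is an assumption made for the sketch, not something the thread specifies.

```python
from decimal import Decimal, ROUND_HALF_EVEN

a = Decimal("2.00")
b = Decimal("2.00")

# Option 1: the scale of the product is the sum of the operand scales.
product = a * b
print(product)  # 4.0000

# Option 2: round the result back to the scale of the operands.
rounded = product.quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)
print(rounded)  # 4.00
```

Keeping the full scale by default and offering an explicit rounding step lets the caller decide where precision is discarded, which matters for monetary calculations.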
Most commonly one is trying to avoid rounding errors when dealing with money - a few cents rounding error tends to result in a few billable hours with the accountants at the end of the year! SQL dialects and type-safe languages would make you specify the precision of the variable to be assigned, so the issue does not arise for other languages. For the work I do, simply taking the precision of the most precise input (4.00L)would do the trick, but your answer (4.0000L) is purer. We should provide a rounding function, and in practice anyone using such a function would round (or floor, or ceiling) to get to the desired precision immediately. I'm not sure on division either but I'm sure there are precedents to look at. On the subject of adding new types to the standard library, what are the plans on dates and times? Would a cut-down mxDateTime ever be considered? It is fully Open Source (unlike mxODBC) and was designed for the DBAPI. Regards, Andy ===== Andy Robinson Robinson Analytics Ltd. ------------------ My opinions are the official policy of Robinson Analytics Ltd. They just vary from day to day. __________________________________________________ Do You Yahoo!? Thousands of Stores. Millions of Products. All in one place. Yahoo! Shopping: http://shopping.yahoo.com From guido@CNRI.Reston.VA.US Thu Dec 23 15:23:43 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 23 Dec 1999 10:23:43 -0500 Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types) In-Reply-To: Your message of "Thu, 23 Dec 1999 07:13:22 PST." <19991223151322.5698.qmail@web604.mail.yahoo.com> References: <19991223151322.5698.qmail@web604.mail.yahoo.com> Message-ID: <199912231523.KAA22232@eric.cnri.reston.va.us> > On the subject of adding new types to the standard > library, what are the plans on dates and times? Would > a cut-down mxDateTime ever be considered? It is fully > Open Source (unlike mxODBC) and was designed for the > DBAPI. 
I don't know much about date/time types, or about mxDateTime. My intuition is that there are too many ways to do it, and that being compatible with commercial databases may not be the right way to do it for core Python. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Thu Dec 23 15:27:59 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 23 Dec 1999 10:27:59 -0500 (EST) Subject: [Python-Dev] str(1L) -> '1' ? In-Reply-To: <38623493.E6BA6D6F@digicool.com> References: <38623493.E6BA6D6F@digicool.com> Message-ID: <14434.16255.58344.646524@weyr.cnri.reston.va.us> Jim Fulton writes: > In November there was an interesting discussion on comp.lang.python > about the meaning of __str__ and __repr__. One tidbit that came out > of this discussion was that __str__ for longs should drop the trailing > 'L'. Was there a decision on this? I'd really like this to happen. I liked that result as well, and thought about it just the other day. Luckily, you sent a note this morning and made me think about again. I'll have something checked into CVS shortly. ;) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From Mike.Da.Silva@uk.fid-intl.com Thu Dec 23 16:30:07 1999 From: Mike.Da.Silva@uk.fid-intl.com (Da Silva, Mike) Date: Thu, 23 Dec 1999 16:30:07 -0000 Subject: [Python-Dev] Fixed Decimal types Message-ID: Andy Robinson wrote: For the work I do, simply taking the precision of the most precise input (4.00L)would do the trick, but your answer (4.0000L) is purer. We should provide a rounding function, and in practice anyone using such a function would round (or floor, or ceiling) to get to the desired precision immediately. I'm not sure on division either but I'm sure there are precedents to look at. The AS400 provides a useful example of the right way to do scaled decimals. In the RPG programming language, all internal calculations (i.e. 
multiplication, division) are performed to the maximum precision of the intermediate result (in the multiplication example below), the intermediate result would be 4.0000L. When the intermediate result is assigned to the target scaled decimal number, the decimal precision is automatically extended or truncated to fit the target precision. One extra wrinkle in all of this is the option to "half-adjust" the intermediate value on assignment; that is to apply automatic 5/4 rounding to the precision of the target. So, if the target field is defined as numeric(4,2), the result will be 4.00L. These are probably the kind of semantics that a scaled decimal type would require in Python also; i.e. allow unlimited precision in intermediate calculations, with a sensible set of rules for assignment to a variable of different scale and precision. However, unlike RPG, we should probably ensure that attempts to overflow or underflow the scale result in NaN or Overflow conditions, rather than assuming the user is right and losing the significant digits. Regards, Mike da Silva From jim@digicool.com Thu Dec 23 16:37:10 1999 From: jim@digicool.com (Jim Fulton) Date: Thu, 23 Dec 1999 11:37:10 -0500 Subject: [Python-Dev] Fixed-decimal types References: <386236E2.F97109D3@digicool.com> <199912231456.JAA22134@eric.cnri.reston.va.us> Message-ID: <38624FB6.ED903F@digicool.com> Guido van Rossum wrote: > > What would be scale of the product of two fixed-decimal numbers? > E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to 4.00L? There are > arguments for either. Same question for division (harder, I think). I'd be inclined to start by doing some research to see if some standard (SQL?) defines this somewhere. It would be nice if someone has already done the requirements work for us. :) > I like the idea of using the dd.ddL notation for this. > > I have no time to implement Me neither. > it but would not be unwilling to accept patches. Cool. 
If no one else volunteers, then I'll try to find a way to get this done (not necessarily by me). I think it is pretty important. > They would have to be accompanied with a wet signature, see > http://www.python.org/1.5/wetsign.html. Yup. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From andy@robanal.demon.co.uk Thu Dec 23 16:38:50 1999 From: andy@robanal.demon.co.uk (=?iso-8859-1?q?Andy=20Robinson?=) Date: Thu, 23 Dec 1999 08:38:50 -0800 (PST) Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types) Message-ID: <19991223163850.15619.qmail@web604.mail.yahoo.com> Sorry, should have replied to the list... --- Andy Robinson wrote: > Date: Thu, 23 Dec 1999 08:37:18 -0800 (PST) > From: Andy Robinson > Reply-to: andy@robanal.demon.co.uk > Subject: Re: [Python-Dev] Date and timetypes (was: > Fixed-decimal types) > To: Guido van Rossum > > --- Guido van Rossum > wrote: > > I don't know much about date/time types, or about > > mxDateTime. > > My intuition is that there are too many ways to do > > it, and that being > > compatible with commercial databases may not be > the > > right way to do it > > for core Python. > > > > OK. Let me rephrase it. Say we form a consensus on > 'the right way'. Are you amenable to some solution > which goes back before 1970 and after 2038 going > into > the standard library? > > And does your answer change if it involves some > compiled code as well? > > I mention mxDateTime because it was agreed by a > Python > SIG, is mature and stable, and I find it very > useful. 
> And the core type is pretty small - much of the > helper > stuff in the package now could be kept separate from > the main Python distribution. > > - Andy ===== Andy Robinson Robinson Analytics Ltd. ------------------ My opinions are the official policy of Robinson Analytics Ltd. They just vary from day to day. From guido@CNRI.Reston.VA.US Thu Dec 23 16:42:33 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 23 Dec 1999 11:42:33 -0500 Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types) In-Reply-To: Your message of "Thu, 23 Dec 1999 08:38:50 PST." <19991223163850.15619.qmail@web604.mail.yahoo.com> References: <19991223163850.15619.qmail@web604.mail.yahoo.com> Message-ID: <199912231642.LAA22598@eric.cnri.reston.va.us> > > OK. Let me rephrase it. Say we form a consensus on 'the right > > way'. Are you amenable to some solution which goes back before > > 1970 and after 2038 going into the standard library? No problem. > > And does your answer change if it involves some > > compiled code as well? I'd rather not.
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip@mojam.com (Skip Montanaro) Thu Dec 23 17:05:52 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 23 Dec 1999 11:05:52 -0600 (CST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <199912212305.SAA13722@eric.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> Message-ID: <14434.22128.639699.738932@dolphin.mojam.com> Guido> (The next step would be to outlaw raise with a string argument; I Guido> think I can't make that for 1.6. But it would be a good idea to Guido> scan the standard library for string exceptions and convert all Guido> of them.) Agreed. I know Zope uses (at least, my Zope-using code uses) stuff like raise 'Redirect', url to map names onto HTTP response codes. Makes it easier on people to remember names instead of numeric codes. I suspect it will take the Zopers awhile to convert to using class-based exceptions if they haven't already. (For all I know I may be using a deprecated feature.) Skip From gvwilson@nevex.com Thu Dec 23 17:24:05 1999 From: gvwilson@nevex.com (gvwilson@nevex.com) Date: Thu, 23 Dec 1999 12:24:05 -0500 (EST) Subject: [Python-Dev] re: Open Source design competition / Python / software tools In-Reply-To: <38623909.CDF41014@digicool.com> Message-ID: Hi, everyone. I'm sending my reply to Jim's message to the whole python-dev list; I'll send follow-ups to individuals if people would prefer. > > * an issue tracking system to replace Gnats and Bugzilla; > > > > * a build system to replace make; > > > > * a platform inspection and configuration system to replace autoconf; > > and > > > > * a testing framework to replace XUnit, Expect, and DejaGnu. > Jim Fulton asked: > Are these categories fixed? 
For the first round, yes --- I have to prove that this model can solve small problems before I'll be given the funding to tackle larger ones, and I think that a UML modeling tool is definitely "large" :-). I also have to demonstrate uptake, and I think more people will adopt a sane replacement for Autoconf in the next 18 months than would adopt a UML modeler. However, decent Open Source CASE tools are very (very) high on my personal list --- if this works, I'd like to tackle them (along with providing support for DDD, and a few other things like that). Greg From gstein@lyra.org Thu Dec 23 18:26:44 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 23 Dec 1999 10:26:44 -0800 (PST) Subject: [Python-Dev] Re: Please test new dynamic load behavior In-Reply-To: <38620B04.7CC64485@trema.com> Message-ID: On Thu, 23 Dec 1999, Harri Pasanen wrote: > Greg Stein wrote: > > Hi all, > > > > I reorganized Python's dynamic load/import code over the past few days. > > Guido provided some feedback, I did some more mods, and now it is checked > > into CVS. The new loading behavior has been tested on Linux, IRIX, and > > Solaris (and probably Windows by now). > > ... > > What was the motivation behind this modification? Harri - With the new code structure, it is much easier to maintain Python's loading code. Each platform has its own file (e.g. dynload_aix.c) rather than being all jammed together into importdl.c. This isn't a huge win by itself, but does increase readability/maintainability. The big improvement, however, is when you are adding support for new platforms or loading mechanisms. A new dynload_*.c can be written and one line added to configure.in, and you're done. No need to make importdl.c even uglier.
(actually, importdl.c no longer contains *any* platform specific code; it has all been moved to the dynload_*.c files) Cheers, -g -- Greg Stein, http://www.lyra.org/ From jim@digicool.com Thu Dec 23 19:39:37 1999 From: jim@digicool.com (Jim Fulton) Date: Thu, 23 Dec 1999 14:39:37 -0500 Subject: [Python-Dev] Fixed-decimal types References: <386236E2.F97109D3@digicool.com> <199912231456.JAA22134@eric.cnri.reston.va.us> <38624FB6.ED903F@digicool.com> Message-ID: <38627A79.BF379672@digicool.com> Jim Fulton wrote: > > Guido van Rossum wrote: > > > > What would be scale of the product of two fixed-decimal numbers? > > E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to 4.00L? There are > > arguments for either. Same question for division (harder, I think). > > I'd be inclined to start by doing some research to see if some standard > (SQL?) defines this somewhere. It would be nice if someone has already > done the requirements work for us. :) Here is what the book "SQL-99 Complete, Really" says that the SQL standard says: - for addition and subtraction of two "exact" (fixed-decimal) numbers, the result has the maximum of the scales. - for multiplication of two "exact" (fixed-decimal) numbers, the result has the sum of the scales. - punts on division - for addition, subtraction, multiplication or division between "exact" (fixed point) and "approximate" (floating point) yields an approximate result. This means that fixed-decimal coerces to float. I'm curious to see who else chips in with examples from other systems. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. 
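The SQL rules Jim summarizes above can be sketched in a few lines of Python. This is only an illustration, not a proposed implementation: the class name `Fixed`, its `parse` helper and the integer-plus-scale representation are assumptions made for the example and appear nowhere in the thread. Division is left out because, as noted, the standard punts on it.

```python
class Fixed:
    """Hypothetical fixed-decimal number: an unscaled integer plus a scale."""

    def __init__(self, unscaled, scale):
        self.unscaled = unscaled  # 200 stands for 2.00 when scale == 2
        self.scale = scale        # number of digits after the decimal point

    @classmethod
    def parse(cls, text):
        """Build a value from a literal such as '2.00' (no exponent form)."""
        whole, _, frac = text.partition(".")
        return cls(int(whole + frac), len(frac))

    def _at_scale(self, scale):
        """Unscaled integer re-expressed at a scale >= self.scale."""
        return self.unscaled * 10 ** (scale - self.scale)

    def __add__(self, other):
        # Addition: the result takes the larger of the two scales.
        scale = max(self.scale, other.scale)
        return Fixed(self._at_scale(scale) + other._at_scale(scale), scale)

    def __sub__(self, other):
        # Subtraction follows the same rule as addition.
        scale = max(self.scale, other.scale)
        return Fixed(self._at_scale(scale) - other._at_scale(scale), scale)

    def __mul__(self, other):
        # Multiplication: the result takes the sum of the two scales.
        return Fixed(self.unscaled * other.unscaled, self.scale + other.scale)

    def __float__(self):
        # Mixing with floating point yields an approximate (float) result.
        return self.unscaled / 10 ** self.scale

    def __str__(self):
        sign = "-" if self.unscaled < 0 else ""
        digits = str(abs(self.unscaled)).rjust(self.scale + 1, "0")
        if self.scale == 0:
            return sign + digits
        return sign + digits[:-self.scale] + "." + digits[-self.scale:]


product = Fixed.parse("2.00") * Fixed.parse("2.00")  # scale 2 + 2 = 4
total = Fixed.parse("3.1") + Fixed.parse("2.01")     # scale max(1, 2) = 2
print(product, total)
```

Under these rules the product of two scale-2 values carries scale 4, which is the "4.0000L" answer to Guido's question; getting back to "4.00L" would take an explicit rounding step.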
From jim@digicool.com Thu Dec 23 19:43:41 1999 From: jim@digicool.com (Jim Fulton) Date: Thu, 23 Dec 1999 14:43:41 -0500 Subject: [Python-Dev] Fixed Decimal types References: Message-ID: <38627B6D.447A9553@digicool.com> "Da Silva, Mike" wrote: > > Andy Robinson wrote: > For the work I do, simply taking the precision of the > most precise input (4.00L)would do the trick, but your > answer (4.0000L) is purer. We should provide a > rounding function, and in practice anyone using such a > function would round (or floor, or ceiling) to get to > the desired precision immediately. > > I'm not sure on division either but I'm sure there are > precedents to look at. > > The AS400 provides a useful example of the right way to do scaled > decimals. > > In the RPG programming language, all internal calculations (i.e. > multiplication, division) are performed to the maximum precision of the > intermediate result (in the multiplication example below), the intermediate > result would be 4.0000L. When the intermediate result is assigned to the > target scaled decimal number, the decimal precision is automatically > extended or truncated to fit the target precision. One extra wrinkle in all > of this is the option to "half-adjust" the intermediate value on assignment; > that is to apply automatic 5/4 rounding to the precision of the target. Yee ha! This is great input. Anyone have any other examples of what any other systems do? Anyone got a PL/I manual handy. ;) > So, if the target field is defined as numeric(4,2), the result will > be 4.00L. Since Python doesn't have types values, this is not an issue internally, but would be an issue when binding to external databases. > These are probably the kind of semantics that a scaled decimal type > would require in Python also; i.e. allow unlimited precision in intermediate > calculations, with a sensible set of rules for assignment to a variable of > different scale and precision. 
> > However, unlike RPG, we should probably ensure that attempts to > overflow or underflow the scale result in NaN or Overflow conditions, rather > than assuming the user is right and losing the significant digits. Since this would be based on infinite-precision numbers, I don't think that this would be an issue. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From guido@CNRI.Reston.VA.US Thu Dec 23 19:44:36 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 23 Dec 1999 14:44:36 -0500 Subject: [Python-Dev] Fixed-decimal types In-Reply-To: Your message of "Thu, 23 Dec 1999 14:39:37 EST." <38627A79.BF379672@digicool.com> References: <386236E2.F97109D3@digicool.com> <199912231456.JAA22134@eric.cnri.reston.va.us> <38624FB6.ED903F@digicool.com> <38627A79.BF379672@digicool.com> Message-ID: <199912231944.OAA23337@eric.cnri.reston.va.us> Jim Fulton wrote: > - for addition and subtraction of two "exact" (fixed-decimal) > numbers, the result has the maximum of the scales. One could argue that this is incorrect: if "3.1" means that I know the value to one decimal of precision, and "2.01" means that I know that value to two decimals of precision, stating the result of their sum as "5.11" suggests that I know the result to two decimals of precision, which is of course false: because I only knew one decimal of precision for one of the operands, I only know (at most!) one decimal of precision for the result. Not arguing for this interpretation, just indicating that doing fixed precision arithmetic right is hard. 
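The two readings of "3.1 + 2.01" can be put side by side. The sketch below uses the standard library's `decimal` module, which postdates this thread, purely as a convenient stand-in for a fixed-decimal type; the helper names `scale` and `add_keep_least_precise` are made up for the example, and half-up rounding is an assumption rather than anything settled here.

```python
from decimal import Decimal, ROUND_HALF_UP


def scale(d):
    """Number of digits after the decimal point of a Decimal."""
    return -d.as_tuple().exponent


def add_keep_least_precise(a, b):
    """Add two values, then round to the smaller of the two scales."""
    s = min(scale(a), scale(b))
    quantum = Decimal(1).scaleb(-s)  # e.g. Decimal('0.1') when s == 1
    return (a + b).quantize(quantum, rounding=ROUND_HALF_UP)


a, b = Decimal("3.1"), Decimal("2.01")
max_scale_sum = a + b                         # keeps the larger scale
min_scale_sum = add_keep_least_precise(a, b)  # the stricter reading
print(max_scale_sum, min_scale_sum)
```

The first result keeps two decimals, as in the SQL rule; the second keeps only one, matching the "I only know one decimal" argument.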
I'm waiting for Tim Peters' contribution, but he's on vacation so it may be a while. --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm@hypernet.com Thu Dec 23 20:48:56 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Thu, 23 Dec 1999 15:48:56 -0500 Subject: [Python-Dev] Fixed Decimal types In-Reply-To: <38627B6D.447A9553@digicool.com> Message-ID: <1266141247-31971518@hypernet.com> Jim Fulton wrote: > "Da Silva, Mike" wrote: [AS400 RPG rules...] > Yee ha! This is great input. Anyone have any other examples of > what any other systems do? Anyone got a PL/I manual handy. ;) From memory of IBM COBOL and SQL, the rules for intermediates seem similar to what Mike outlines. In both cases, the target is pre-specified, and I think by default you get auto-rounding. Tim's BCD class seem to always return the higher precision on an arithmetic op, although the intermediate is full precision. >> However, unlike RPG, we should probably ensure >> that attempts to overflow or underflow the scale >> result in NaN or Overflow conditions, rather >> than assuming the user is right and losing >> the significant digits. > Since this would be based on infinite-precision numbers, I don't > think that this would be an issue. It's an issue if the result of an arithmetic op is other than "full" precision. The issue certainly comes up when you e.g. talk to a DB, and it might be better to have it come up sooner rather than later. - Gordon From jim@digicool.com Thu Dec 23 22:18:37 1999 From: jim@digicool.com (Jim Fulton) Date: Thu, 23 Dec 1999 17:18:37 -0500 Subject: [Python-Dev] re: Open Source design competition / Python /software tools References: Message-ID: <38629FBD.3B8F47D4@digicool.com> gvwilson@nevex.com wrote: > > Hi, everyone. I'm sending my reply to Jim's message to the whole > python-dev list; I'll send follow-ups to individuals if people would > prefer. 
> > > > * an issue tracking system to replace Gnats and Bugzilla; > > > > > > * a build system to replace make; > > > > > > * a platform inspection and configuration system to replace autoconf; > > > and > > > > > > * a testing framework to replace XUnit, Expect, and DejaGnu. > > > Jim Fulton asked: > > Are these categories fixed? > > For the first round, yes OK. >--- I have to prove that this model can solve > small problems before I'll be given the funding to tackle larger ones, and > I think that a UML modeling tool is definitely "large" :-). Well, since you gave rational ..... :) Isn't the Open Source community especially good at large problems? Note that I'm thinking more in terms of an open source UML community of tools, based around an existing repository rather than on a single monolithic tool. I envision a community of diagramming and other small tools orbiting Zope or ZODB. The hardest part of a UML tool is the repository, and I think we've mostly got that. I think that what the Open Source community desperately needs are tools for managing and sharing the most important artifacts in the development process. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From gstein@lyra.org Fri Dec 24 00:09:29 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 23 Dec 1999 16:09:29 -0800 (PST) Subject: [Python-Dev] re: Open Source design competition / Python /software tools In-Reply-To: <38629FBD.3B8F47D4@digicool.com> Message-ID: On Thu, 23 Dec 1999, Jim Fulton wrote: > gvwilson@nevex.com wrote: >... 
> >--- I have to prove that this model can solve > > small problems before I'll be given the funding to tackle larger ones, and > > I think that a UML modeling tool is definitely "large" :-). > > Well, since you gave rational ..... :) > > > Isn't the Open Source community especially good at large problems? Very true, I agree, but part of Greg's problem is "proving" that to the DoE. Somebody has said those four problems are sufficient to do so, and (probably) because they are reasonably constrained to allow completion within a specified timeframe. > Note that I'm thinking more in terms of an open source UML community > of tools, based around an existing repository rather than on a single > monolithic tool. I envision a community of diagramming and other small > tools orbiting Zope or ZODB. The hardest part of a UML tool is the > repository, and I think we've mostly got that. Greg's proposal is quite specific. "A community" isn't, so it might not help to create a proof to the DoE (otherwise, they could look at the Zope community, or other communities!). Jim: there isn't anything stopping or impeding the creation of an Open Source community for UML modeling. This DoE competition won't affect that... Happy Holidays, -g -- Greg Stein, http://www.lyra.org/ From jim@digicool.com Fri Dec 24 00:27:53 1999 From: jim@digicool.com (Jim Fulton) Date: Thu, 23 Dec 1999 19:27:53 -0500 Subject: [Python-Dev] re: Open Source design competition / Python /softwaretools References: Message-ID: <3862BE09.9AF62090@digicool.com> Greg Stein wrote: > (snip) > Jim: there isn't anything stopping or impeding the creation of an Open > Source community for UML modeling. Of course not. > This DoE competition won't affect that... Perhaps it could help it. > Happy Holidays, You too. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! 
Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From ping@lfw.org Fri Dec 24 08:55:28 1999 From: ping@lfw.org (Ka-Ping Yee) Date: Fri, 24 Dec 1999 00:55:28 -0800 (PST) Subject: [Python-Dev] re: Open Source design competition / Python / software tools In-Reply-To: Message-ID: On Wed, 22 Dec 1999 gvwilson@nevex.com wrote: > To kick-start things, we're going to be holding a two-round design > competition. Anyone (individual or team, professional or student) can > submit a short entry for the first round; the judges will pick four > candidates to go forward in each of four categories, and those > individuals or teams will be asked to submit full entries. The four > categories are: > > * an issue tracking system to replace Gnats and Bugzilla; Hi there. At ILM we've been using a system that i hacked up quickly in Python called "Roundup". It has a number of interesting properties that have made it really useful to us, and arguably better than any of the existing open-source bug-tracking things out there that i know of. It is not just a Web app; it lives between the Web and e-mail, because we do so much of our communication that way. For example, each request item gets its own virtual mailing list, updated on the fly without the need for explicit subscription (if you cc: somebody while discussing the bug, they get subscribed). Empirically i've discovered that unsubscription is actually unnecessary (!) because conversation will stop on a topic when it gets resolved or when it ceases to be interesting. These are fine-grained discussion lists on a per-topic level. This is just to let you know i'm interested. 
I'm currently asking for permission to open-source Roundup; if it can't be done, or doesn't happen quickly enough, i'll just have to take a weekend and rewrite the thing. There were a few things i wanted to fix anyway. -- ?!ng "You should either succeed gloriously or fail miserably. Just getting by is the worst thing you can do." -- Larry Smith From Vladimir.Marangozov@inrialpes.fr Fri Dec 24 12:07:05 1999 From: Vladimir.Marangozov@inrialpes.fr (Vladimir Marangozov) Date: Fri, 24 Dec 1999 13:07:05 +0100 (CET) Subject: [Python-Dev] Exceptions In-Reply-To: <199912221823.NAA16517@eric.cnri.reston.va.us> from "Guido van Rossum" at Dec 22, 1999 01:23:45 PM Message-ID: <199912241207.NAA18783@python.inrialpes.fr> Guido van Rossum wrote: > > Vladimir.Marangozov@inrialpes.fr: > > > Yes. Besides, I still think that string-based exceptions are just > > convenient for quick & dirty, throw-away test scripts. > > They have a hard-to-understand quirk though: the id() of the string is > used to check rather than its value, so that except "foo" doesn't > necessarily catch raise "foo"; but due to various optimizations, this > usually works, and people get bent out of shape when it doesn't. Which brings up two important questions: 1. In the long run, which one is better -- compare and check exceptions by reference (by name) or by value? (currently, this is done by reference on predefined object types: strings, classes or instances) I'd say, exceptions have to be compared (caught) by value, i.e. use "e1 == e2" instead of "e1 is e2". 2. Should we limit the exception "types"? I'd say, no. My Pythonic view of things says that we raise "objects", be they classes, instances, strings or, why not, ints.
However, if one wants to put some order in the "unordered set" of exceptions s/he uses, then classes are the way to do it, because classes were given some nice properties, like inheritance, that allow us to group and logically organize the objects we throw and catch as exceptions (+ other bonus properties coming from classes). Note that conceptually, when we say "strings and ints", we have in mind "string instances and int instances", whose "classes" are written in C. Once there are String and Int classes of some sort as first class objects, we'll fall back to the terminology: Exceptions can be classes or instances. If point 1 and (optionally) point 2 are implemented, the hard-to-understand quirk wouldn't be an issue and string-based exceptions would have a legal reason to stay and live. > Since you have to give your exception a name, how hard is it to say > > class MyError(Exception): pass > > rather than > > MyError = "MyError" > > ? You know what I think about "names"... I may have defined my exception conventions and be interested in catching an exception named 404, implying that "a 404 bobo" occurred deeply in my code ("deeply in my code" meaning for example: database 4, service 0, customer group 4, or just a standard HTTP "Code 404 - Not Found".) Pushing this to the extreme to catapult your thoughts into the next millennium. :) and to emphasize the importance of discussing and answering objectively the above questions 1) and 2). -- Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From mal@lemburg.com Fri Dec 24 11:03:37 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 24 Dec 1999 12:03:37 +0100 Subject: [Python-Dev] str(1L) -> '1' ?
References: <38623493.E6BA6D6F@digicool.com> <199912231446.JAA22086@eric.cnri.reston.va.us> Message-ID: <38635309.2AEFF18D@lemburg.com> Guido van Rossum wrote: > > [Jim F] > > In November there was an interesting discussion on comp.lang.python > > about the meaning of __str__ and __repr__. One tidbit that came out > > of this discussion was that __str__ for longs should drop the trailing > > 'L'. Was there a decision on this? I'd really like this to happen. > > Yes, I'd like it to happen. I'd also like repr() of a float to return > the full precision (using the "%.17g" sprintf format). While we're at it: how about adding a PyLong_AsString() API to the C interface ? I currently use PyObject_Str() in mxODBC and then slice off the 'L' -- not very elegant. A PyLong_AsString() API would much better suit the task. Merry Christmas, -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 7 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal@lemburg.com Fri Dec 24 11:11:29 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 24 Dec 1999 12:11:29 +0100 Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types) References: <19991223163850.15619.qmail@web604.mail.yahoo.com> <199912231642.LAA22598@eric.cnri.reston.va.us> Message-ID: <386354E1.DA560F42@lemburg.com> Guido van Rossum wrote: > > > > OK. Let me rephrase it. Say we form a consensus on 'the right > > > way'. Are you amenable to some solution which goes back before > > > 1970 and after 2038 going into the standard library? > > No problem. > > > > And does your answer change if it involves some > > > compiled code as well? > > I'd rather not. As far as mxDateTime goes, I'd rather not see it in the core distribution. Including the mx stuff in a separate PythonPowerTools distribution would be cool though. 
For a start in this direction see e.g.: http://startship.skyport.net/~lemburg/PPowerTools-0.2.zip Note that I'll wrap all my mx extensions into a new mx package which will come in several flavours next year. There will no longer be separate packages due to the various naming collisions and to enable intra-mx-package dependencies. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 7 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From andy@robanal.demon.co.uk Fri Dec 24 12:22:29 1999 From: andy@robanal.demon.co.uk (=?iso-8859-1?q?Andy=20Robinson?=) Date: Fri, 24 Dec 1999 04:22:29 -0800 (PST) Subject: [Python-Dev] Fixed Decimal types Message-ID: <19991224122229.23506.qmail@web606.mail.yahoo.com> > >> However, unlike RPG, we should probably ensure > >> that attempts to overflow or underflow the scale > >> result in NaN or Overflow conditions, rather > >> than assuming the user is right and losing > >> the significant digits. > > > Since this would be based on infinite-precision > numbers, I don't > > think that this would be an issue. Three very general observations before I disappear for Christmas: (1) I think there is great mileage in combining the fixed-decimal concept with Martin Fowler's Quantity pattern, so that a variable could be defined as not just two decimal places but also (say) "GBP" or "USD", and it would be an error to add the two. Same applies for adding metres, kilograms and other quantities. There has also been discussion that the 'type' of a quantity should determine what math should apply. (2) If Python is going to be used increasingly in eCommerce, it should be good at dealing with money - maybe not in the core language, but we should aim for one standard package. (3) We have a python-finance list (python-finance@egroups.com), recently generalized to cover business systems, which is a good place to discuss this if anyone wants to. 
There are people there who have time, would love to prototype something (indeed some work started in this area 3 months back), and would use it at work too. This would be an ideal first target for that group - or indeed for a finance-sig. I'll pursue this in the New Year. Merry Christmas, Andy ===== Andy Robinson Robinson Analytics Ltd. ------------------ My opinions are the official policy of Robinson Analytics Ltd. They just vary from day to day. _________________________________________________________ Do You Yahoo!? Get your free @yahoo.com address at http://mail.yahoo.com From jack@oratrix.nl Fri Dec 24 12:34:28 1999 From: jack@oratrix.nl (Jack Jansen) Date: Fri, 24 Dec 1999 13:34:28 +0100 Subject: [Python-Dev] Fixed Decimal types In-Reply-To: Message by =?iso-8859-1?q?Andy=20Robinson?= , Fri, 24 Dec 1999 04:22:29 -0800 (PST) , <19991224122229.23506.qmail@web606.mail.yahoo.com> Message-ID: <19991224123428.5BA9F370CF2@snelboot.oratrix.nl> > (1) I think there is great mileage in combining the > fixed-decimal concept with Martin Fowler's Quantity > pattern, so that a variable could be defined as not > just two decimal places but also (say) "GBP" or "USD", > and it would be an error to add the two. Same applies > for adding metres, kilograms and other quantities. > There has also been discussion that the 'type' of a > quantity should determine what math should apply. Isn't this something that is ideally suited for implementation in a Python module, based on a core implementation of fixed decimal numbers? 
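A minimal sketch of the module-level approach Jack is asking about, assuming a fixed-decimal core is available. Here the `decimal` module of later Python releases stands in for that core (it did not exist when this thread was written), and the `Quantity` class, its `unit` tag and the `places` parameter are hypothetical names chosen for illustration, not anything proposed in the thread:

```python
from decimal import Decimal


class Quantity:
    """Hypothetical fixed-decimal amount tagged with a unit such as "GBP"."""

    def __init__(self, amount, unit, places=2):
        self.unit = unit
        self.places = places
        # Round the amount to a fixed number of decimal places.
        quantum = Decimal(10) ** -places
        self.amount = Decimal(str(amount)).quantize(quantum)

    def __add__(self, other):
        if not isinstance(other, Quantity):
            return NotImplemented
        # Adding GBP to USD (or metres to kilograms) is an error.
        if other.unit != self.unit:
            raise TypeError("cannot add %s to %s" % (other.unit, self.unit))
        return Quantity(self.amount + other.amount, self.unit, self.places)

    def __repr__(self):
        return "Quantity(%s, %r)" % (self.amount, self.unit)
```

The unit check lives entirely in Python code on top of the numeric type, which is the split Jack suggests: the core supplies the arithmetic, a module supplies the Quantity semantics.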
-- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From gstein@lyra.org Fri Dec 24 20:05:22 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 24 Dec 1999 12:05:22 -0800 (PST) Subject: [Python-Dev] Fixed Decimal types In-Reply-To: <19991224123428.5BA9F370CF2@snelboot.oratrix.nl> Message-ID: On Fri, 24 Dec 1999, Jack Jansen wrote: > > (1) I think there is great mileage in combining the > > fixed-decimal concept with Martin Fowler's Quantity > > pattern, so that a variable could be defined as not > > just two decimal places but also (say) "GBP" or "USD", > > and it would be an error to add the two. Same applies > > for adding metres, kilograms and other quantities. > > There has also been discussion that the 'type' of a > > quantity should determine what math should apply. > > Isn't this something that is ideally suited for implementation in a Python > module, based on a core implementation of fixed decimal numbers? I'd agree with Jack here. The "simple" change of a scale for the Long values is nice. Starting to lump in features like this begins to get a little messier... Happy Holidays, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Fri Dec 24 20:13:50 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 24 Dec 1999 12:13:50 -0800 (PST) Subject: [Python-Dev] str(1L) -> '1' ? In-Reply-To: <38635309.2AEFF18D@lemburg.com> Message-ID: On Fri, 24 Dec 1999, M.-A. Lemburg wrote: > Guido van Rossum wrote: > > [Jim F] > > > In November there was an interesting discussion on comp.lang.python > > > about the meaning of __str__ and __repr__. One tidbit that came out > > > of this discussion was that __str__ for longs should drop the trailing > > > 'L'. Was there a decision on this? I'd really like this to happen. > > > > Yes, I'd like it to happen. 
I'd also like repr() of a float to return > > the full precision (using the "%.17g" sprintf format). > > While we're at it: how about adding a PyLong_AsString() API > to the C interface ? I currently use PyObject_Str() in mxODBC > and then slice off the 'L' -- not very elegant. A PyLong_AsString() > API would much better suit the task. Fred just checked in a change yesterday. PyObject_Str() on a Long no longer includes the 'L'. You're going to need to update your code :-) [ I've got some here and there to fix, too, with the idiom: if type(v) is type(1L): return str(v)[:-1] ] Cheers, -g -- Greg Stein, http://www.lyra.org/ From mal@lemburg.com Sun Dec 26 22:29:28 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 26 Dec 1999 23:29:28 +0100 Subject: [Python-Dev] str(1L) -> '1' ? References: Message-ID: <386696C8.6EBBF428@lemburg.com> Greg Stein wrote: > > On Fri, 24 Dec 1999, M.-A. Lemburg wrote: > > While we're at it: how about adding a PyLong_AsString() API > > to the C interface ? I currently use PyObject_Str() in mxODBC > > and then slice off the 'L' -- not very elegant. A PyLong_AsString() > > API would much better suit the task. > > Fred just checked in a change yesterday. PyObject_Str() on a Long no > longer includes the 'L'. Ah, ok... scanning the patches: they don't provide an externed C interface... I would like to have such a beast if possible (basically, the new long_format() as PyLong_AsString()). > You're going to need to update your code :-) > [ I've got some here and there to fix, too, with the idiom: > if type(v) is type(1L): return str(v)[:-1] > ] Your above example will effectively divide the long value by 10 which will probably break things in very subtle ways... hmm, this change ought to be made *very* visible to people upgrading to 1.6, IMHO. I'll fix mxODBC to only truncate the string value iff the 'L' is present. 
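The defensive fix described above (truncate only if the 'L' is actually there) might look like the following sketch. It works on the string form, so it behaves the same whether or not str() of a long still appends the suffix; the helper name `strip_long_suffix` is made up for this illustration and is not part of mxODBC or the Python API:

```python
def strip_long_suffix(text):
    """Drop one trailing 'L' if present; otherwise return text unchanged."""
    # Blindly using text[:-1] would chop the last digit once str() stops
    # appending 'L', silently dividing the value by 10.
    if text.endswith("L"):
        return text[:-1]
    return text
```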
-- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 5 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From andy@robanal.demon.co.uk Mon Dec 27 10:43:17 1999 From: andy@robanal.demon.co.uk (Andy Robinson) Date: Mon, 27 Dec 1999 10:43:17 GMT Subject: [Python-Dev] Fixed Decimal types In-Reply-To: References: Message-ID: <38674259.5377973@post.demon.co.uk> On Fri, 24 Dec 1999 12:05:22 -0800 (PST), you wrote: >On Fri, 24 Dec 1999, Jack Jansen wrote: >> > (1) I think there is great mileage in combining the >> > fixed-decimal concept with Martin Fowler's Quantity >> > pattern, so that a variable could be defined as not >> > just two decimal places but also (say) "GBP" or "USD", >> > and it would be an error to add the two. Same applies >> > for adding metres, kilograms and other quantities.=20 >> > There has also been discussion that the 'type' of a >> > quantity should determine what math should apply. >>=20 >> Isn't this something that is ideally suited for implementation in a = Python=20 >> module, based on a core implementation of fixed decimal numbers? > >I'd agree with Jack here. > Me too - I thought I said that in point 2, but in retrospect I didn't say it clearly enough :-) - Andy From gstein@lyra.org Mon Dec 27 11:31:29 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 27 Dec 1999 03:31:29 -0800 (PST) Subject: [Python-Dev] str(1L) -> '1' ? In-Reply-To: <386696C8.6EBBF428@lemburg.com> Message-ID: On Sun, 26 Dec 1999, M.-A. Lemburg wrote: > Greg Stein wrote: > > On Fri, 24 Dec 1999, M.-A. Lemburg wrote: > > > While we're at it: how about adding a PyLong_AsString() API > > > to the C interface ? I currently use PyObject_Str() in mxODBC > > > and then slice off the 'L' -- not very elegant. A PyLong_AsString() > > > API would much better suit the task. > > > > Fred just checked in a change yesterday. PyObject_Str() on a Long no > > longer includes the 'L'. > > Ah, ok... 
scanning the patches: they don't provide an externed > C interface... I would like to have such a beast if possible > (basically, the new long_format() as PyLong_AsString()). What's wrong with PyObject_Str()? I don't see a need for Yet Another Entry Point. > > You're going to need to update your code :-) > > [ I've got some here and there to fix, too, with the idiom: > > if type(v) is type(1L): return str(v)[:-1] > > ] > > Your above example will effectively divide the long value by 10 > which will probably break things in very subtle ways... hmm, this Yah :-( Not a lot of fun, but I think for the best. > change ought to be made *very* visible to people upgrading to > 1.6, IMHO. Yes. Cheers, -g -- Greg Stein, http://www.lyra.org/ From mal@lemburg.com Mon Dec 27 12:51:36 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 27 Dec 1999 13:51:36 +0100 Subject: [Python-Dev] str(1L) -> '1' ? References: Message-ID: <386760D8.E897FADF@lemburg.com> Greg Stein wrote: > > On Sun, 26 Dec 1999, M.-A. Lemburg wrote: > > Greg Stein wrote: > > > On Fri, 24 Dec 1999, M.-A. Lemburg wrote: > > > > While we're at it: how about adding a PyLong_AsString() API > > > > to the C interface ? I currently use PyObject_Str() in mxODBC > > > > and then slice off the 'L' -- not very elegant. A PyLong_AsString() > > > > API would much better suit the task. > > > > > > Fred just checked in a change yesterday. PyObject_Str() on a Long no > > > longer includes the 'L'. > > > > Ah, ok... scanning the patches: they don't provide an externed > > C interface... I would like to have such a beast if possible > > (basically, the new long_format() as PyLong_AsString()). > > What's wrong with PyObject_Str()? I don't see a need for Yet Another Entry > Point. What's wrong with a rich C API :-) ? The long_format function would be very useful for programs interacting with other software at C level. 
Making it external would give the programmer the ability to pass long string representations in any base to other programs, which is very useful for e.g. database interaction or crypto software. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 4 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From bkc@murkworks.com Mon Dec 27 22:04:25 1999 From: bkc@murkworks.com (Brad Clements) Date: Mon, 27 Dec 1999 17:04:25 -0500 Subject: [Python-Dev] Re: [PSA MEMBERS] Re: Please test new dynamic load behavior In-Reply-To: References: <38620B04.7CC64485@trema.com> Message-ID: <199912272204.RAA26173@anvil.murkworks.com> On 23 Dec 99, at 10:26, Greg Stein wrote: > > > I reorganized Python's dynamic load/import code over the past few days. > > > Gudio provided some feedback, I did some more mods, and now it is checked > > > into CVS. The new loading behavior has been tested on Linux, IRIX, and > > > Solaris (and probably Windows by now). FYI, I downloaded the import stuff from CVS and used it in my port of Python to NetWare. Good timing, as I was just tackling dynamic loading on NetWare when I saw your message. The new scheme is much better, and works for me. Though I do need to add some special "un-import" code similar to what BEOS does. Brad Clements, bkc@murkworks.com (315)268-1000 http://www.murkworks.com (315)268-9812 Fax netmeeting: ils://ils.murkworks.com AOL-IM: BKClements From skip@mojam.com (Skip Montanaro) Tue Dec 28 21:41:33 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 28 Dec 1999 15:41:33 -0600 Subject: [Python-Dev] Better text processing support in py2k? Message-ID: <199912282141.PAA31426@dolphin.mojam.com> It just occurred to me as I was replying to a request on the main list, that Python's text handling capabilities could be a bit better than they are. 
This will probably not come as a revelation to many of you, but I finally put it together with the standard argument against beefing things up One fix would be to add regular expressions to the language core and have special syntax for them, as Perl has done. However, I don't like this solution because Python is a general-purpose language, and regular expressions are used for the single application domain of text processing. For other application domains, regular expressions may be of no interest, and you might want to remove them to save memory and code size. and the observation that Python does support some builtin objects and syntax that are fairly specific to some much more restricted application domains than text processing. I stole the above quote from Andrew Kuchling's Python Warts page, which I also happened to read earlier today. What AMK says makes perfect sense until you examine some of the other things that are in the language, like the Ellipsis object and complex numbers. If I recall correctly both were added as a result of the NumPy package development. I have nothing against ellipses or complex numbers. They are fine first class objects that should remain in the language. But I have never used either one in my day-to-day work. On the other hand, I read files and manipulate them with regular expressions all the time. I rather suspect that more people use Python for some sort of text processing than any other single application domain. Python should be good at it. While I don't want to turn Python into Perl, I would like to see it do a better job of what most people probably use the language for. Here is a very short list of things I think need attention: 1. When using something like the simple file i/o idiom for line in f.readlines(): dofunstuff(line) the programmer should not have to care how big the file is. It should just work in a reasonably efficient manner without gobbling up all of memory. 
I realize this may require some change to the syntax of the common idiom. 2. The re module needs to be sped up, if not to catch up with Perl, then to catch up with the deprecated regex module. Depending how far people want to go with things, adding some language syntax to support regular expressions might be in order. I don't see that as compelling as adding complex numbers however. Another possibility, now that Barry Warsaw has opened the floodgates, is to add regular expression methods to strings. 3. I've not yet used it, but I am told the pattern matching in Marc-Andre Lemburg's mxTextTools (http://starship.python.net/crew/lemburg/) is both powerful and efficient (though it certainly appears complex). Perhaps it deserves consideration for incorporation into the core Python distribution. I'm sure other people will come up with other suggestions. Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From akuchlin@mems-exchange.org Tue Dec 28 22:00:11 1999 From: akuchlin@mems-exchange.org (Andrew M. Kuchling) Date: Tue, 28 Dec 1999 17:00:11 -0500 (EST) Subject: [Python-Dev] Better text processing support in py2k? In-Reply-To: <199912282141.PAA31426@dolphin.mojam.com> References: <199912282141.PAA31426@dolphin.mojam.com> Message-ID: <14441.13035.802146.730160@amarok.cnri.reston.va.us> Skip Montanaro writes: >What AMK says makes perfect sense until you examine some of the other things >that are in the language, like the Ellipsis object and complex numbers. If >I recall correctly both were added as a result of the NumPy package >development. True, but note that you can compile Python with WITHOUT_COMPLEX defined to remove complex numbers. > 1. When using something like the simple file i/o idiom > for line in f.readlines(): > dofunstuff(line) > the programmer should not have to care how big the file is. What about 'for line in fileinput.input()', which already exists? (Hmmm... 
if you have an already open file object, I don't think you can pass it
to fileinput.input(); maybe that should be fixed.)

On a vaguely related note, since there are many things like parser
generators and XML stuff and mxTextTools, I've been speculating about
a text processing topic guide.  If you know of Python packages related
to text processing, please send me a private e-mail with a link.

--
A.M. Kuchling                   http://starship.python.net/crew/amk/
Constraints often boost creativity.
    -- Jim Hugunin, 11 Feb 1999

From skip@mojam.com (Skip Montanaro) Tue Dec 28 22:26:53 1999
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Tue, 28 Dec 1999 16:26:53 -0600 (CST)
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <14441.13035.802146.730160@amarok.cnri.reston.va.us>
References: <199912282141.PAA31426@dolphin.mojam.com>
 <14441.13035.802146.730160@amarok.cnri.reston.va.us>
Message-ID: <14441.14637.682862.999776@dolphin.mojam.com>

    Andrew> True, but note that you can compile Python with WITHOUT_COMPLEX
    Andrew> defined to remove complex numbers.

That's true, but that wasn't my point.  I'm not arguing for or against space
efficiency, just that the rather timeworn argument about not doing
anything special to support text processing because Python is a general
purpose language is a red herring.

    >> 1. When using something like the simple file i/o idiom
    >>        for line in f.readlines():
    >>            dofunstuff(line)
    >>    the programmer should not have to care how big the file is.

    Andrew> What about 'for line in fileinput.input()', which already
    Andrew> exists?  (Hmmm... if you have an already open file object, I
    Andrew> don't think you can pass it to fileinput.input(); maybe that
    Andrew> should be fixed.)

Well, a couple reasons jump to mind:

    1. fileinput.FileInput isn't particularly efficient.  At its heart, its
       __getitem__ method makes a simple readline() call instead of
       buffering some amount of readlines(sizehint) bytes.  This can be
       fixed, but I'm not sure what would happen to its semantics.

    2. As you pointed out, it's not all that general.

My point, not at all well stated, is that the programmer shouldn't have to
worry (much?) about the conditions under which he does file i/o.  Right now,
if I know the file is small(ish), I can do

    for line in f.readlines():
        dofunstuff(line)

but I have to know that the file won't be big, because readlines() will
behave badly (perhaps even generate a MemoryError exception) if the file is
large.  In that case, I have to fall back to the safer (and slower)

    line = f.readline()
    while line:
        dofunstuff(line)
        line = f.readline()

or the more efficient, but more cumbersome

    lines = f.readlines(sizehint)
    while lines:
        for line in lines:
            dofunstuff(line)
        lines = f.readlines(sizehint)

That's three separate idioms the programmer has to be aware of when writing
code to read a text file based upon the perceived need for speed, memory
usage and desired clarity:

    fast/memory-intensive/clear
    slow/memory-conserving/not-as-clear
    fast/memory-conserving/fairly-muddy

Any particular reason that the readline method can't return an iterator that
supports __getitem__ and buffers input?  (Again, remember this is for py2k,
so the potential breakage such a change might cause is a consideration, but
not a showstopper.)

    Andrew> On a vaguely related note, since there are many things like
    Andrew> parser generators and XML stuff and mxTextTools, I've been
    Andrew> speculating about a text processing topic guide.  If you know of
    Andrew> Python packages related to text processing, please send me a
    Andrew> private e-mail with a link.

This sounds like a good idea to me.

Skip Montanaro | http://www.mojam.com/
skip@mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...
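The "fast/memory-conserving" idiom from the message above can be hidden behind a single loop. What follows is only a sketch under stated assumptions: it relies on generator functions, a feature of later Python releases that was not available when this was written, and the name `iter_lines` and the default `sizehint` value are invented for the example:

```python
def iter_lines(f, sizehint=64 * 1024):
    """Yield lines from an open file object, reading in bounded batches.

    Each call to f.readlines(sizehint) returns roughly sizehint bytes'
    worth of complete lines, so memory use stays bounded for big files.
    """
    while True:
        lines = f.readlines(sizehint)
        if not lines:
            break  # end of file
        for line in lines:
            yield line
```

With such a wrapper the caller writes `for line in iter_lines(f): dofunstuff(line)` regardless of file size, which is the single idiom the message asks for.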
From andy@robanal.demon.co.uk Wed Dec 29 08:34:43 1999 From: andy@robanal.demon.co.uk (=?iso-8859-1?q?Andy=20Robinson?=) Date: Wed, 29 Dec 1999 00:34:43 -0800 (PST) Subject: [Python-Dev] Better text processing support in py2k? Message-ID: <19991229083443.27817.qmail@web6005.mail.yahoo.com> --- Skip Montanaro wrote: > fast/memory-intensive/clear > slow/memory-conserving/not-as-clear > fast/memory-conserving/fairly-muddy > > Any particular reason that the readline method can't > return an iterator that > supports __getitem__ and buffers input? (Again, > remember this is for py2k, > so the potential breakage such a change might cause > is a consideration, but > not a showstopper.) Why not generalize fileinput to do buffering instead? More generally, Java has the notion of 'stackable streams' - e.g. construct a 'BufferedFile' around a 'File', maybe construct a 'Line-oriented file' around that etc. Each one takes a file-like object as an argument to the constructor. Things you might want to do: - buffering - international encoding conversions - line delimiters other than CR/LF/CRLF - read/write Python objects (i.e. use pickle/marshal) - easy interfaces to parsers This took me a couple of hours to get used to (and at the time I thought 'Yuk!' when I saw first saw four nested constructors), but gives you very precise control and a lot of versatility when handling files. It's an idiom Python does not use much but maybe it should. I'd argue that maybe some enhancements to fileinput.py - adding some streams to provide building blocks for these operations - would get us the power you want and a lot more versatility besides. ===== Andy Robinson Robinson Analytics Ltd. ------------------ My opinions are the official policy of Robinson Analytics Ltd. They just vary from day to day. __________________________________________________ Do You Yahoo!? Talk to your friends online with Yahoo! Messenger. 
http://messenger.yahoo.com From mal@lemburg.com Wed Dec 29 16:55:21 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 29 Dec 1999 17:55:21 +0100 Subject: [Python-Dev] Better text processing support in py2k? References: <19991229083443.27817.qmail@web6005.mail.yahoo.com> Message-ID: <386A3CF9.8AF0EA60@lemburg.com> Andy Robinson wrote: > > --- Skip Montanaro wrote: > > fast/memory-intensive/clear > > slow/memory-conserving/not-as-clear > > fast/memory-conserving/fairly-muddy > > > > Any particular reason that the readline method can't > > return an iterator that > > supports __getitem__ and buffers input? (Again, > > remember this is for py2k, > > so the potential breakage such a change might cause > > is a consideration, but > > not a showstopper.) > > Why not generalize fileinput to do buffering instead? > > More generally, Java has the notion of 'stackable > streams' - e.g. construct a 'BufferedFile' around a > 'File', maybe construct a 'Line-oriented file' around > that etc. Each one takes a file-like object as an > argument to the constructor. Things you might want to > do: > - buffering > - international encoding conversions > - line delimiters other than CR/LF/CRLF > - read/write Python objects (i.e. use pickle/marshal) > - easy interfaces to parsers If all goes well we'll have something like this in Python 1.6 at least for the encoding/decoding part file reading and writing. You basically take a file object and then wrap some StreamCodecs around it to get the functionality you need. Very simple and very intuitive. > This took me a couple of hours to get used to (and at > the time I thought 'Yuk!' when I saw first saw four > nested constructors), but gives you very precise > control and a lot of versatility when handling files. > It's an idiom Python does not use much but maybe it > should. 
> > I'd argue that maybe some enhancements to fileinput.py > - adding some streams to provide building blocks for > these operations - would get us the power you want and > a lot more versatility besides. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: Get ready to party ! Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From bckfnn@pipmail.dknet.dk Wed Dec 29 18:51:52 1999 From: bckfnn@pipmail.dknet.dk (Finn Bock) Date: Wed, 29 Dec 1999 18:51:52 GMT Subject: [Python-Dev] zipfile.py In-Reply-To: <3857B97E.3684224F@interet.com> References: <3857B97E.3684224F@interet.com> Message-ID: <386a582d.6762574@pipmail.dknet.dk> James C. Ahlstrom wrote: > ftp://ftp.interet.com/pub/pylib.html I feel that it smell a bit too much like a tool and too little like an general programming api. - It can only add disk files. The ability to write data to a zip entry through a file-like object or from a string would make it more like an API, IMHO - Some kind of access to the TOC entry fields (date, size, compressed size etc) also seems like a nice feature. - The data for an entry must be available in memory. Could be a problem for huge files, but most like not in practical use. I admit that I am fond of the api from java.util.zip.ZipFile and java.util.zip.ZipOutputStream. Regards, Finn Bock From tim_one@email.msn.com Thu Dec 30 06:08:58 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 30 Dec 1999 01:08:58 -0500 Subject: [Python-Dev] Better text processing support in py2k? In-Reply-To: <199912282141.PAA31426@dolphin.mojam.com> Message-ID: <000001bf528c$5cbdb9a0$a02d153f@tim> [Skip Montanaro, wants nicer text facilities] > ... > I rather suspect that more people use Python for some sort of > text processing than any other single application domain. Hmm. You're probably right, but I'm an exception. > Python should be good at it. 
And I guess I'm an exception mostly *because* Perl is better at easy text crunching and Icon is better at hard text-crunching -- that is, I use the right tool for the job . > While I don't want to turn Python into Perl, I would like to see > it do a better job of what most people probably use the language > for. Here is a very short list of things I think need attention: > > 1. [*A* clear way to do memory- and time-efficient textfile > input] I agree, but unsure how to fix it. The best way to write this now is # f is some open file object. while 1: lines = f.readlines(BUFSIZE) if not lines: break for line in lines: process(line) and it's not something anyone figures out on their own -- or enjoys typing or explaining afterwards. Perl gets its line-at-a-time speed by peeking and poking C FILE structs directly in compiler- and platform-specific ways -- ways that vendors *should* have done in their own fgets implementations, but almost never do. I have no idea whether it works well with Perl's nascent notions of threading, but in the absence of that "the system" doesn't know Perl is cheating (i.e., as far as libc+friends are concerned, Perl *is* reading one line at a time -- even mixing in C-level ungetc calls works (well, sometimes <0.1 wink -- they don't always peek and poke enough fields>)). The Python QIO extension module is much easier to port but less compatible (it doesn't use stdio, so QIO-opened files don't play well with others) and slower (although that's likely repairable -- he's got two passes over the buffer where one hairier pass should suffice). > 2. The re module needs to be sped up, if not to catch up with > Perl, then to catch up with the deprecated regex module. The irony here is that the re engine is very often unboundedly faster than the regex engine -- provided you're chewing over large strings. Some tests /F ran showed that the length-independent *overhead* of invoking re is about 10x higher than for regex. 
Presumably the bulk of that is due to re.py, i.e. that you get to the re engine via going thru Python layers on your way in and out, while regex was pure C. In any case, /F is working on a new engine (for Unicode), and I believe he has this all well in hand. > Depending how far people want to go with things, adding some > language syntax to support regular expressions might be in order. > ... > 3. I've not yet used it, but I am told the pattern matching in > Marc-Andre Lemburg's mxTextTools > (http://starship.python.net/crew/lemburg/) > is both powerful and efficient (though it certainly appears > complex). Perhaps it deserves consideration for > incorporation into the core Python distribution. It's not complex, it's complicated -- and *that's* what makes it un-Pythonic . Tony Ibbs has written a friendly wrapper around mxTextTools that suppresses much of the non-essential complication. OTOH, if you go into this with a regexp mindset, it will run much slower than a real regexp package, because the bulk of the latter is devoted to doing optimization; mxTextTools is WYSIWYG (it screams if you code to its strengths, but crawls if you e.g. try to implement naive backtracking). You should go to the REBOL site and look at the description of REBOL's PARSE verb in the FAQ ... mumble, mumble ... at http://www.rebol.com/faq.html#11550948 Here's an example pulled from that page (this is a REBOL code fragment): digit: charset "0123456789" expr: [term ["+" | "-"] expr | term] term: [factor ["*" | "/"] term | factor] factor: [primary "**" factor | primary] primary: [value | "(" expr ")"] value: [digit value | digit] parse "1 + 2 ** 9" expr There hasn't been a pattern scheme this clean, convenient or powerful since SNOBOL4. It exploits REBOL's Forth-like (lack of!) 
syntax, and Smalltalk-like penchant for passing around thunks (anonymous closures -- "[...]" in REBOL builds a lexically-scoped entity called "a block", which can be treated as code (executed) or data (manipulated like a Python list) at will). Now the example doesn't show this, but you can freely mix computations into the middle of the patterns; only *some* of the words in the blocks have special meaning to PARSE. The fragment above is already way beyond what can be accomplished with regexps, but that's just the start of it. Perl too is slamming in more & more ways to get user code to interact with its regexp engine. So REBOL has a *very* nice approach to this; I believe it's unreasonably clumsy to mimic in Python primarily because of forward references (note e.g. that the block attached to "expr" above refers to "term" before the latter has been bound -- but the stuff inside [...] is just a closure so that doesn't matter -- it only matters that term gets bound before expr is *executed*). I hit a similar snag years ago when trying to mimic SNOBOL4's approach in Python. Perl's endless abuse of regexps is making that language more absurd by the month. The other major approach to mixing patterns with computation is due to Icon, another language where a regexp mindset is fatal. On a whim, I whipped up the attached, which illustrates a bit of the Icon approach in Pythonic terms (but without language support for generators, the *heart* of it can't really be captured). Here's an example of how this could be used to implement (the simplest form of) string.split: def mysplit(str): s = Searcher(str) white = CharSet(" \t\n") result = [] s.many(white) # consume initial whitespace while s.notmany(white): # consume non-whitespace result.append(s.get_match()) s.many(white) return result >>> mysplit(" \t Hey, that's\tpretty\n\n neat! 
") ['Hey,', "that's", 'pretty', 'neat!'] >>> The primary thing to note is that there's no seam between analyzing the string and doing computation on the partial results -- "the program is the pattern". This is what Icon does to perfection, Perl is moving toward, and REBOL is arriving at from a different direction. It's The Future <0.9 wink>. Without generators it's difficult to work backtracking into the Searcher class, but, as above, in my experience the backtracking feature of regexps is rarely *needed*! For example, at various points "split" wants to suck up all the whitespace characters, and that's *it* -- the backtracking possibility in the regexp \s+ is often a bug just waiting for unexpected *context* to trigger it. A hairy regexp is pure hell; but what simpler regexps can do don't require all that funky regexp machinery. BTW, the mxTextTools engine could be used to get blazing implementations of the primary Searcher methods (it excels at simple analysis). OTOH, making lots of calls to analyze short strings is slow. The only clean solutions to that are Perl's and Icon's (build everyting into one language so the compiler can optimize stuff away), and REBOL's (make no distinction between code and data, so that code can be analyzed & optimized at runtime -- and build the entire implementation around making closures and calls supernaturally fast). the-less-you-use-regexps-the-less-you-miss-'em-ly y'rs - tim class CharSet: def __init__(self, seq): self.seq = seq d = {} for ch in seq: d[ch] = 1 self.haskey = d.has_key def __call__(self, ch): return self.haskey(ch) def __add__(self, other): if isinstance(other, CharSet): other = other.seq return CharSet(self.seq + other) def _normalize_index(i, n): assert n >= 0 if i >= 0: return min(i, n) elif n == 0: return 0 # want smallest q s.t. 
    #   i + q*n >= 0
    # <-> q*n >= -i
    # <-> q >= -i/n
    # so q = ceiling(-i/n) = -floor(i/n)
    return i - (i/n)*n

class Searcher:
    def __init__(self, str, lo=0, hi=None):
        """Create object to search in str[lo:hi].

        lo defaults to 0.
        hi defaults to len(str).
        len(str) is repeatedly added to negative lo or hi until
        reaching a number >= 0.
        If lo > hi, a uselessly empty slice will be searched.
        The search cursor is initialized to lo.
        """

        self.s = str
        self.lo = _normalize_index(lo, len(str))
        if hi is None:
            self.hi = len(str)
        else:
            self.hi = _normalize_index(hi, len(str))
        if self.lo > self.hi:
            self.hi = self.lo
        self.i = self.lo
        self.lastmatch = None, None

    def any(self, charset, consume=1):
        """Try to match single character in charset.

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i = self.i
        if i < self.hi and charset(self.s[i]):
            if consume:
                self.__consume(i+1)
            return 1
        return 0

    def notany(self, charset, consume=1):
        """Try to match single character not in charset.

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i = self.i
        if i < self.hi and not charset(self.s[i]):
            if consume:
                self.__consume(i+1)
            return 1
        return 0

    def many(self, charset, consume=1):
        """Try to match one or more characters in charset.

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i, n, s = self.i, self.hi, self.s
        j = i
        while j < n and charset(s[j]):
            j = j+1
        if i < j:
            if consume:
                self.__consume(j)
            return 1
        return 0

    def notmany(self, charset, consume=1):
        """Try to match one or more characters not in charset.

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i, n, s = self.i, self.hi, self.s
        j = i
        while j < n and not charset(s[j]):
            j = j+1
        if i < j:
            if consume:
                self.__consume(j)
            return 1
        return 0

    def match(self, str, consume=1):
        """Try to match string "str".

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i = self.i
        j = i + len(str)
        if self.s[i:j] == str:
            if consume:
                self.__consume(j)
            return 1
        return 0

    def get_str(self):
        """Return subject string."""
        return self.s

    def get_lo(self):
        """Return low slice bound."""
        return self.lo

    def get_hi(self):
        """Return high slice bound."""
        return self.hi

    def get_pos(self):
        """Return current value of search cursor."""
        return self.i

    def get_match_indices(self):
        """Return slice indices of last "consumed" match."""
        return self.lastmatch

    def get_match(self):
        """Return last "consumed" matching substring."""
        i, j = self.lastmatch
        if i is None:
            # raise, don't return: the caller must not mistake the
            # exception object for a matched substring
            raise ValueError("no match to return!")
        return self.s[i:j]

    def set_pos(self, pos, consume=1):
        """Set search cursor to new value.

        No return value.
        If optional arg "consume" is true, the last match is set to the
        slice between pos and the current cursor position.
        """

        p = _normalize_index(pos, len(self.s))
        if not self.lo <= p <= self.hi:
            raise ValueError("pos out of bounds: " + `pos`)
        if consume:
            self.__consume(p)
        else:
            self.i = p

    def move_pos(self, incr, consume=1):
        """Move the cursor by incr characters.

        No return value.
        If the new value is outside the slice bounds, it's clipped.
        If optional arg "consume" is true, the last match is set to the
        slice between the old and new cursor positions.
        """

        newi = self.i + incr
        if newi < self.lo:
            newi = self.lo
        elif newi > self.hi:
            newi = self.hi
        if consume:
            self.__consume(newi)
        else:
            self.i = newi

    def __consume(self, newi):
        i, j = self.i, newi
        if i > j:
            i, j = j, i
        self.lastmatch = i, j
        self.i = newi

From tim_one@email.msn.com Thu Dec 30 06:09:14 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 30 Dec 1999 01:09:14 -0500
Subject: [Python-Dev] Fixed-decimal types
In-Reply-To: <199912231944.OAA23337@eric.cnri.reston.va.us>
Message-ID: <000201bf528c$657c3080$a02d153f@tim>

[Guido]
> ...
> Not arguing for this interpretation, just indicating that doing
> fixed precision arithmetic right is hard.
It's not so much hard as it is arbitrary. The floating-point world is standardized now, but the fixed-point world remains a mish-mash of incompatible legacy schemes carried across generations of products for no reason other than product-specific compatibility. So despite that fixed-point has a specialty audience, whatever rules Python chooses will leave it incompatible with much of that audience's (mixed!) expectations. If fixed-point is needed, and my FixedPoint.py isn't good enough (all other fixed point pkgs I've seen for Python were braindead), then it should be implemented such that developers can control both rounding and precision propagation. I'll attach suitable kernels; they haven't been tested but any bugs discovered will be trivial to fix (there are no difficulties here, but typos are likely); the kernels supply the bulk of what's required, whether implemented in Python or C; various packages can wrap them to supply whatever policies they like; see FixedPoint.py for exact string<->FixedPoint and exact float->FixedPoint conversions; and that's the end of my involvement in fixed-point . Python should certainly *not* add a "scale factor" to its current long implementation; fixed-point should be a distinct type, as scale-factor fiddling is clumsy and pervasive (long arithmetic is challenging enough to get correct and quick without this obfuscating distraction; and by leaving scale factors out of it, it's much easier to plug in alternative bigint implementations (like GMP)). One other point: some people are going to want BCD (binary-coded decimal), which suffers the same mish-mash of legacy policies, but with a different data representation. The point is that many commercial applications spend much more time doing I/O conversions than arithmetic, and BCD accepts slow arithmetic (in the absence of special HW support) in return for fast scaling & I/O conversion. 
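
To make the BCD trade-off above concrete, here is a minimal sketch in
present-day Python 3 (an assumption; it is not the 1.5-era dialect used
elsewhere in this thread). The names to_bcd, from_bcd and scale_by_ten are
hypothetical, and the layout (unsigned, two digits per byte, high nibble
first, padded with a leading zero to a whole number of bytes) is chosen
for illustration only, not taken from any particular legacy format. It
shows why I/O conversion and scaling by ten are cheap in BCD: each nibble
already is one decimal digit, so neither needs a division or a multiply.

```python
DIGITS = "0123456789"

def to_bcd(digits):
    """Pack a string of decimal digits into bytes, two digits per byte."""
    if not digits or any(ch not in DIGITS for ch in digits):
        raise ValueError("expected a non-empty string of decimal digits")
    if len(digits) % 2:
        digits = "0" + digits          # pad to a whole number of bytes
    # Pair up the digits: high nibble first, low nibble second.
    return bytes((DIGITS.index(hi) << 4) | DIGITS.index(lo)
                 for hi, lo in zip(digits[0::2], digits[1::2]))

def from_bcd(packed):
    """Unpack BCD bytes into a digit string by splitting nibbles."""
    out = []
    for byte in packed:
        hi, lo = byte >> 4, byte & 0x0F
        if hi > 9 or lo > 9:
            raise ValueError("not valid BCD")
        out.append(DIGITS[hi])
        out.append(DIGITS[lo])
    return "".join(out)

def scale_by_ten(packed):
    """Multiply by 10 by appending a zero digit; no arithmetic involved."""
    return to_bcd(from_bcd(packed) + "0")
```

A binary long, by contrast, has to be repeatedly divided by ten to
produce the same digit string, which is the I/O cost BCD is designed to
avoid; the price is that add and multiply must work digit by digit.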
Forgetting the database-heads for a moment, decimal *floating*-point is
what calculators do, so that's what "real people" are most comfortable
with.  The IEEE-854 std (IEEE-754's younger and friendlier brother)
specifies that completely.  Add a means to boost "global" precision (a la
REXX), and it's a powerful tool even for experts (benefits approximating
those of unbounded rational arithmetic but with bounded &
user-controllable expense).

can-never-have-too-many-numeric-types-but-always-have-
    too-few-literal-notations-ly y'rs  - tim

# Kernels for fixed-point decimal arithmetic.
# _add, _sub, _mul, _div all have arglist
#     n1, p1, n2, p2, p, round=DEFAULT_ROUND
# n1 and n2 are longs; p1, p2 and p ints >= 0.
# The inputs are exactly n1/10**p1 and n2/10**p2.
#
# The return value is the integer n such that n/10**p is the best
# approximation to the infinite-precision result.  In other words, p1
# and p2 are the input precisions and p is the desired output
# precision, where precision is the # of digits *after* the decimal
# point.
#
# What "best approximation" means is determined by the round function.
# In many cases rounding isn't required, but when it is
#     round(top, bot)
# is returned.  top and bot are longs, with bot > 0 guaranteed.  The
# infinite-precision result is top/bot.  round must return an integer
# (long) approximation to top/bot, using whichever rounding discipline
# you want.  By default, IEEE round-to-nearest/even is used; see the
# _roundXXX functions for examples of suitable rounding functions.
#
# Note:  The only code here that knows we're working in decimal is
# function _tento; simply change the "10L" in that to do fixed-point
# arithmetic in some other base.
#
# Example:
#
# >>> r7 = _div(1L, 0, 7L, 0, 20)   # 1/7
# >>> r7
# 14285714285714285714L
# >>> r5 = _div(1L, 0, 5L, 0, 20)   # 1/5
# >>> r5
# 20000000000000000000L
# >>> sum = _add(r7, 20, r5, 20, 20)  # 1/7 + 1/5 = 12/35
# >>> sum
# 34285714285714285714L
# >>> _mul(sum, 20, 35L, 0, 20)
# 1199999999999999999990L
# >>> _mul(sum, 20, 35L, 0, 18)
# 12000000000000000000L
# >>> _mul(sum, 20, 35L, 0, 0)
# 12L
# >>>

###################################################################
# Sample rounding functions.
###################################################################

# Round to minus infinity.

def _roundminf(top, bot):
    assert bot > 0
    return top / bot

# Round to plus infinity.

def _roundpinf(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    if r:
        q = q + 1
    return q

# IEEE nearest/even rounding (closest integer; in case of tie closest
# even integer).

def _roundne(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    c = cmp(r << 1, bot)
    # c < 0 <-> r < bot/2, etc
    if c > 0 or (c == 0 and (q & 1) == 1):
        q = q + 1
    return q

# "Add a half and chop" rounding (remainder < 1/2 toward 0; remainder
# >= half away from 0).

def _roundhalf(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    c = cmp(r << 1, bot)
    # c < 0 <-> r < bot/2, etc
    if c > 0 or (c == 0 and q >= 0):
        q = q + 1
    return q

# Round toward 0 (throw away remainder).

def _roundchop(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    if r and q < 0:
        q = q + 1
    return q

###################################################################
# Kernels for + - * /.
###################################################################

DEFAULT_ROUND = _roundne

def _add(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    # (n1/10**p1 + n2/10**p2) * 10**p ==
    # (n1*10**(max-p1) + n2*10**(max-p2))/10**max * 10**p
    max = p1  # until proven otherwise
    if p1 < p2:
        n1 = n1 * _tento(p2 - p1)
        max = p2
    elif p2 < p1:
        n2 = n2 * _tento(p1 - p2)
    n3 = n1 + n2
    p3 = p - max
    if p3 > 0:
        n3 = n3 * _tento(p3)
    elif p3 < 0:
        n3 = round(n3, _tento(-p3))
    return n3

def _sub(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    return _add(n1, p1, -n2, p2, p, round)

def _mul(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    # (n1/10**p1 * n2/10**p2) * 10**p ==
    # (n1*n2)/10**(p1+p2) * 10**p
    n3 = n1 * n2
    p3 = p - p1 - p2
    if p3 > 0:
        n3 = n3 * _tento(p3)
    elif p3 < 0:
        n3 = round(n3, _tento(-p3))
    return n3

def _div(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    if n2 == 0:
        raise ZeroDivisionError("scaled integer")
    # (n1/10**p1 / n2/10**p2) * 10**p ==
    # (n1/n2) * 10**(p2-p1+p)
    p3 = p2 - p1 + p
    if p3 > 0:
        n1 = n1 * _tento(p3)
    elif p3 < 0:
        n2 = n2 * _tento(-p3)
    if n2 < 0:
        n1 = -n1
        n2 = -n2
    return round(n1, n2)

def _tento(i, _cache={}):
    assert i >= 0
    try:
        return _cache[i]
    except KeyError:
        answer = _cache[i] = 10L ** i
        return answer

From fredrik@pythonware.com Thu Dec 30 11:05:45 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 30 Dec 1999 12:05:45 +0100
Subject: [Python-Dev] Better text processing support in py2k?
References: <000001bf528c$5cbdb9a0$a02d153f@tim>
Message-ID: <012d01bf52b5$d6a3cb00$f29b12c2@secret.pythonware.com>

Tim Peters is back from his vacation:
> > While I don't want to turn Python into Perl, I would like to see
> > it do a better job of what most people probably use the language
> > for.  Here is a very short list of things I think need attention:
> >
> >   1. [*A* clear way to do memory- and time-efficient textfile
> >      input]
>
> I agree, but unsure how to fix it.  The best way to write this now is
>
>     # f is some open file object.
>     while 1:
>         lines = f.readlines(BUFSIZE)
>         if not lines:
>             break
>         for line in lines:
>             process(line)
>
> and it's not something anyone figures out on their own -- or enjoys typing
> or explaining afterwards.
>
> Perl gets its line-at-a-time speed by peeking and poking C FILE structs
> directly in compiler- and platform-specific ways -- ways that vendors
> *should* have done in their own fgets implementations, but almost never do.
> I have no idea whether it works well with Perl's nascent notions of
> threading, but in the absence of that "the system" doesn't know Perl is
> cheating (i.e., as far as libc+friends are concerned, Perl *is* reading one
> line at a time -- even mixing in C-level ungetc calls works (well, sometimes
> <0.1 wink -- they don't always peek and poke enough fields>)).
>
> The Python QIO extension module is much easier to port but less compatible
> (it doesn't use stdio, so QIO-opened files don't play well with others) and
> slower (although that's likely repairable -- he's got two passes over the
> buffer where one hairier pass should suffice).

we have something called SIO which uses memory mapping
where possible, and just a more aggressive read-ahead for
other cases.  on a windows box, a traditional while/readline
loop runs 3-5 times faster than before.  with SRE instead of
re, a while/readline/match loop runs up to 10 times faster
than before.

note that this is without *any* changes to the Python
source code...

> >   2. The re module needs to be sped up, if not to catch up with
> >      Perl, then to catch up with the deprecated regex module.
>
> The irony here is that the re engine is very often unboundedly faster than
> the regex engine -- provided you're chewing over large strings.
> Some tests /F ran showed that the length-independent *overhead* of
> invoking re is about 10x higher than for regex.  Presumably the bulk of
> that is due to re.py, i.e. that you get to the re engine via going thru
> Python layers on your way in and out, while regex was pure C.

I've attached some old benchmarks.  I think the current
code base is a bit faster, but you get the idea.

> In any case, /F is working on a new engine (for Unicode), and I believe he
> has this all well in hand.

with a little luck, the new module will replace both pcre
and regex...

not to mention that it's fairly easy to write your own
front-end to the matching engine -- the expression parser
and the compiler are both written in good old python.

$ python sre_bench.py
           0      5     50    250   1000   5000  25000
-----  -----  -----  -----  -----  -----  -----  -----
search for Python|Perl in Perl ->
sre8   0.007  0.008  0.010  0.010  0.020  0.073  0.349
sre16  0.007  0.007  0.008  0.010  0.020  0.075  0.353
re     0.097  0.097  0.101  0.103  0.118  0.175  0.480
regex  0.007  0.007  0.009  0.020  0.059  0.271  1.320
search for (Python|Perl) in Perl ->
sre8   0.007  0.007  0.007  0.010  0.020  0.074  0.344
sre16  0.007  0.007  0.008  0.010  0.020  0.074  0.347
re     0.110  0.104  0.111  0.115  0.125  0.184  0.559
regex  0.006  0.006  0.009  0.019  0.057  0.285  1.432
search for Python in Python ->
sre8   0.007  0.007  0.007  0.011  0.021  0.072  0.387
sre16  0.007  0.007  0.008  0.010  0.022  0.082  0.365
re     0.107  0.097  0.105  0.102  0.118  0.175  0.511
regex  0.009  0.008  0.010  0.018  0.036  0.139  0.708
search for .*Python in Python ->
sre8   0.008  0.007  0.008  0.011  0.021  0.079  0.379
sre16  0.008  0.008  0.008  0.011  0.022  0.075  0.402
re     0.102  0.108  0.119  0.183  0.400  1.545  7.284
regex  0.013  0.019  0.072  0.318  1.231  8.035 45.366
search for .*Python.* in Python ->
sre8   0.008  0.008  0.008  0.011  0.021  0.080  0.383
sre16  0.008  0.008  0.008  0.011  0.021  0.079  0.395
re     0.103  0.108  0.119  0.184  0.418  1.685  8.378
regex  0.013  0.020  0.073  0.326  1.264  9.961 46.511
search for .*(Python) in Python ->
sre8   0.007  0.008  0.008  0.011  0.021  0.077  0.378
sre16  0.007  0.008  0.008  0.011  0.021  0.077  0.444
re     0.108  0.107  0.134  0.240  0.637  2.765 13.395
regex  0.026  0.112  3.820 87.322 (skipped)
search for .*P.*y.*t.*h.*o.*n.* in Python ->
sre8   0.010  0.010  0.014  0.031  0.093  0.419  2.212
sre16  0.010  0.011  0.014  0.030  0.093  0.419  2.292
re     0.112  0.121  0.195  0.521  1.747  8.298 40.877
regex  0.026  0.048  0.248  1.148  4.550 24.720    ...

(searching for patterns in padded strings; sre8 is the sre engine
compiled for 8-bit characters, sre16 is the same engine compiled
for 16-bit characters)

From mal@lemburg.com Thu Dec 30 11:52:50 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 30 Dec 1999 12:52:50 +0100
Subject: [Python-Dev] Better text processing support in py2k?
References: <000001bf528c$5cbdb9a0$a02d153f@tim>
Message-ID: <386B4792.A551022A@lemburg.com>

Tim Peters wrote:
>
> [Skip Montanaro, wants nicer text facilities]
> > While I don't want to turn Python into Perl, I would like to see
> > it do a better job of what most people probably use the language
> > for.  Here is a very short list of things I think need attention:
> >
> >   1. [*A* clear way to do memory- and time-efficient textfile
> >      input]
>
> ...
>
> The Python QIO extension module is much easier to port but less compatible
> (it doesn't use stdio, so QIO-opened files don't play well with others) and
> slower (although that's likely repairable -- he's got two passes over the
> buffer where one hairier pass should suffice).

What is QIO ?

> > Depending how far people want to go with things, adding some
> > language syntax to support regular expressions might be in order.
> > ...
> >   3. I've not yet used it, but I am told the pattern matching in
> >      Marc-Andre Lemburg's mxTextTools
> >      (http://starship.python.net/crew/lemburg/)
> >      is both powerful and efficient (though it certainly appears
> >      complex).  Perhaps it deserves consideration for
> >      incorporation into the core Python distribution.
>
> It's not complex, it's complicated -- and *that's* what makes it
> un-Pythonic.  Tony Ibbs has written a friendly wrapper around mxTextTools
> that suppresses much of the non-essential complication.  OTOH, if you go
> into this with a regexp mindset, it will run much slower than a real regexp
> package, because the bulk of the latter is devoted to doing optimization;
> mxTextTools is WYSIWYG (it screams if you code to its strengths, but crawls
> if you e.g. try to implement naive backtracking).

All true. mxTextTools provides the tools, not the magic. But
this is also its strength: you can optimize the hell out of
your particular parsing requirement without having to think
about how the RE optimizer works.

> You should go to the REBOL site and look at the description of REBOL's PARSE
> verb in the FAQ ... mumble, mumble ... at
>
>     http://www.rebol.com/faq.html#11550948
>
> Here's an example pulled from that page (this is a REBOL code fragment):
>
>     digit: charset "0123456789"
>     expr: [term ["+" | "-"] expr | term]
>     term: [factor ["*" | "/"] term | factor]
>     factor: [primary "**" factor | primary]
>     primary: [value | "(" expr ")"]
>     value: [digit value | digit]
>
>     parse "1 + 2 ** 9" expr
>
> There hasn't been a pattern scheme this clean, convenient or powerful since
> SNOBOL4.  It exploits REBOL's Forth-like (lack of!) syntax, and
> Smalltalk-like penchant for passing around thunks (anonymous closures --
> "[...]" in REBOL builds a lexically-scoped entity called "a block", which
> can be treated as code (executed) or data (manipulated like a Python list)
> at will).

Looks nice indeed, but how does executable code fit into
that definition ?

(mxTextTools allows you to write your own parsing elements
in Python, BTW; it should be possible to use those mechanisms
to achieve a similar integration.)

> ...
>
> BTW, the mxTextTools engine could be used to get blazing implementations of
> the primary Searcher methods (it excels at simple analysis).
> OTOH, making lots of calls to analyze short strings is slow.

That's why mxTextTools converts these search idioms into byte
codes which it executes at C level. Some future version will
even "precompile" the tuple input and then omit the type checks
during the search... that should give another noticeable speedup.
Note that recursion etc. can be done at C level too -- Python
function calls are not needed.

> The only clean solutions to
> that are Perl's and Icon's (build everything into one language so the
> compiler can optimize stuff away), and REBOL's (make no distinction between
> code and data, so that code can be analyzed & optimized at runtime -- and
> build the entire implementation around making closures and calls
> supernaturally fast).

Just for kicks, here is the mysplit() function using mxTextTools:

    from mx.TextTools import *

    table = (
        # Match all whitespace
        (None,AllInSet,whitespace_set,+1),
        # Match and tag all non-whitespace
        ('text',AllInSet + AppendMatch,nonwhitespace_set,+1),
        # Loop until EOF
        (None,EOF,Here,-2),
        )

    def mysplit(text):

        return tag(text,table)[1]

The timings:
mysplit: 5.84 sec.
string.split: 3.62 sec.

Note that you can customize the above to split text at any
character set you like, not just whitespace... without
compiling or writing C code. The function mx.TextTools.setsplit()
provides this functionality as a pure C function.

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: Get ready to party !
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/

From jim@interet.com Thu Dec 30 14:21:36 1999
From: jim@interet.com (James C. Ahlstrom)
Date: Thu, 30 Dec 1999 09:21:36 -0500
Subject: [Python-Dev] zipfile.py
References: <3857B97E.3684224F@interet.com> <386a582d.6762574@pipmail.dknet.dk>
Message-ID: <386B6A70.3C9A0042@interet.com>

Finn Bock wrote:
>
> James C. Ahlstrom wrote:
>
> > ftp://ftp.interet.com/pub/pylib.html
>
> I feel that it smells a bit too much like a tool and too little like a
> general programming api.

It was meant to be an API except for writepy(), which is
clearly a tool.

> - It can only add disk files. The ability to write data to a zip entry through
>   a file-like object or from a string would make it more like an API, IMHO

I could add a method
   writestr(self, string, year, month, day, hour, minute, second, ...)
There are a lot of fields required which usually come from the file.

> - Some kind of access to the TOC entry fields (date, size, compressed
>   size etc) also seems like a nice feature.

This access is provided directly by self.TOC, and the fields are
documented.

> - The data for an entry must be available in memory. Could be a problem
>   for huge files, but most likely not in practical use.

I agree, but adding loops will make it slower.  What do others
think?

> I admit that I am fond of the api from java.util.zip.ZipFile and
> java.util.zip.ZipOutputStream.

I don't know this API.  If writestr() is not sufficient, what
API would you like?

JimA

From bckfnn@pipmail.dknet.dk Thu Dec 30 19:14:14 1999
From: bckfnn@pipmail.dknet.dk (Finn Bock)
Date: Thu, 30 Dec 1999 19:14:14 GMT
Subject: [Python-Dev] zipfile.py
In-Reply-To: <386B6A70.3C9A0042@interet.com>
References: <3857B97E.3684224F@interet.com> <386a582d.6762574@pipmail.dknet.dk> <386B6A70.3C9A0042@interet.com>
Message-ID: <386baec9.2867733@pipmail.dknet.dk>

[I wrote]
> - It can only add disk files. The ability to write data to a zip entry through
>   a file-like object or from a string would make it more like an API, IMHO

[JimA wrote]
>I could add a method
>   writestr(self, string, year, month, day, hour, minute, second, ...)
>There are a lot of fields required which usually come from the file.

Something like that seems fine to me.

[I wrote]
> - Some kind of access to the TOC entry fields (date, size, compressed
>   size etc) also seems like a nice feature.

[JimA answers]
>This access is provided directly by self.TOC, and the fields are
>documented.

Good enough. My bad, I was looking for getter methods. (me being a java
dude)

[I wrote]
> I admit that I am fond of the api from java.util.zip.ZipFile and
> java.util.zip.ZipOutputStream.

[JimA asks]
>I don't know this API.  If writestr() is not sufficient, what
>API would you like?

This is only meant as a source for inspiration, certainly not as a request
for change. writestr would answer my complaint nicely.

Below, only one ZipEntry can be actively read or written to at a time. All
the small details of performance and implementation complexity are ignored.

class ZipFile:
    def getEntry(self, name):
        ...
        self.activeentry = ZipEntry(name)
        return self.activeentry

class ZipEntry:
    # enough methods and fields to fake file-ness to casual users like me.
    def write(self, str): ...
    def writelines(self, list): ...
    def read(self, size=None): ...
    def readlines(self, sizehint=-1): ...
    def seek(self, offset): ...
    def flush(self): ...
    def close(self): ...
    def getSize(self): ...
    def getCompressedSize(self): ...
    def getFlags(self): ...

regards,
finn

From tim_one@email.msn.com Fri Dec 31 03:35:18 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 30 Dec 1999 22:35:18 -0500
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <386B4792.A551022A@lemburg.com>
Message-ID: <000001bf5340$0fb20300$e12d153f@tim>

[M.-A. Lemburg]
> What is QIO ?

See DejaNews (I don't save URLs).  "Quick" line-oriented text input adapted
from INN.  Someone rewrote that as a Python extension module.

>> http://www.rebol.com/faq.html#11550948

> Looks nice indeed, but how does executable code fit into
> that definition ?

See the URL above I didn't save.  PARSE's "pattern" argument is a
block.  Blocks can be (& often are) nested.  Whether any given block is code
or data is all the same to REBOL, so passing nested code blocks in PARSE's
pattern argument is easy.
Because blocks are lexically scoped, assignments (etc) inside a block are (well, can be) visible to its context; etc. It's a very Lispish approach. REBOL is essentially Scheme under the covers, but with syntax much more like Forth's (whitespace-separated strings of arbitrary non-whitespace characters, with few pre-assigned meanings or restrictions -- in fact, it's impossible for a compiler to determine where a REBOL function call begins or ends! can't be known until runtime). > (mxTextTools allows you to write your own parsing elements > in Python, BTW; it should be possible to use those mechanisms > to achieve a similar intergration.) It can't capture the flavor -- although I don't know that it needs to . There's no distinction between "the pattern language" and "the computational language" in REBOL or Icon, and it's hard to explain what a maddening distinction that can be once you've lived without it. mxTextTools embedding would feel more like Icon, where the matching engine is fully exposed to the programmer (REBOL hides it, allowing only "approved" interactions). >> OTOH, making lots of calls to analyze short strings is slow. > That's why mxTextTools converts these search idioms into byte > codes which it executes at C level. Some future version will > even "precompile" the tuple input and then omit the type checks > during the search...that should give another noticeable speedup. > Note that recursion etc. can be done at C level too -- Python > function calls are not needed. That's also the curse of having distinct languages; e.g., Python already had recursion, but you needed to reimplement it in a different way with different syntax and different rules in your pattern language. In Icon etc, there's no difference between a recursive pattern and a recursive function, except in *what* it computes. The machinery is all the same, and both more powerful and easier to learn because of that. > ... 
> Just for kicks, here is the mysplit() function using mxTextTools: > > from mx.TextTools import * > > table = ( > # Match all whitespace > (None,AllInSet,whitespace_set,+1), > # Match and tag all non-whitespace > ('text',AllInSet + AppendMatch,nonwhitespace_set,+1), > # Loop until EOF > (None,EOF,Here,-2), > ) > > def mysplit(text): > > return tag(text,table)[1] > > The timings: > mysplit: 5.84 sec. > string.split: 3.62 sec. > > Note that you can customize the above to split text at any > character set you like, not just whitespace... without > compiling or writing C code. That's equally true of the example I posted . Now what if I wanted to stop splitting right after I find a keyword, recognized as such because it's a key in some passed-in dictionary? In my example, I make an obvious local code change, from while s.notmany(white): # consume non-whitespace result.append(s.get_match()) s.many(white) to while s.notmany(white): # consume non-whitespace word = s.get_match() result.append(word) if dictionary.has_key(word): break s.many(white) What does it do to your example? Or what if the target string isn't "a string" (the code I posted only assumes the "str" object responds to indexing and slicing -- any buffer object is fine -- so my example doesn't change at all)? Or what if you need to pass the tokens on as they're found, pipeline style? Etc. This is why I do complex string processing in Icon <0.9 wink>. OTOH, at what it does well, mxTextTools runs quicker than Icon. Its biggest problem has always been that e.g. nobody knows what the hell (None,EOF,Here,-2), *means* at first glance -- or third . an-extreme-on-the-transparency-vs-speed-curve-ly y'rs - tim From mal@lemburg.com Fri Dec 31 11:18:57 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 31 Dec 1999 12:18:57 +0100 Subject: [Python-Dev] Better text processing support in py2k? References: <000001bf5340$0fb20300$e12d153f@tim> Message-ID: <386C9121.E9D9DC01@lemburg.com> Tim Peters wrote: > > [M.-A. 
Lemburg] > > What is QIO ? > > See DejaNews (I don't save URLs). "Quick" line-oriented text input adapted > from INN. Someone rewrote that as a Python extension module. Ok, thanks. > >> http://www.rebol.com/faq.html#11550948 > > > Looks nice indeed, but how does executable code fit into > > that definition ? > > See the URL above I didn't save . PARSE's "pattern" argument is a > block. Blocks can be (& often are) nested. Whether any given block is code > or data is all the same to REBOL, so passing nested code blocks in PARSE's > pattern argument is easy. Because blocks are lexically scoped, assignments > (etc) inside a block are (well, can be) visible to its context; etc. It's a > very Lispish approach. REBOL is essentially Scheme under the covers, but > with syntax much more like Forth's (whitespace-separated strings of > arbitrary non-whitespace characters, with few pre-assigned meanings or > restrictions -- in fact, it's impossible for a compiler to determine where a > REBOL function call begins or ends! can't be known until runtime). If I understand the concept correctly, I think Python could do pretty much the same thing. The bummer is of course the need for new keywords and byte codes (although these could be split out into a separate text scanning engine). Using Python function calls would slow down things to an extent that would render the added functionality useless, well IMHO anyways ;-) > > (mxTextTools allows you to write your own parsing elements > > in Python, BTW; it should be possible to use those mechanisms > > to achieve a similar intergration.) > > It can't capture the flavor -- although I don't know that it needs to > . There's no distinction between "the pattern language" and "the > computational language" in REBOL or Icon, and it's hard to explain what a > maddening distinction that can be once you've lived without it. 
> mxTextTools embedding would feel more like Icon, where the matching engine
> is fully exposed to the programmer (REBOL hides it, allowing only
> "approved" interactions).

Of course it's hard for a Turing Machine to capture the flavor
of any high level language :-) When you're programming the
mxTextTools Tagging Engine directly it feels like writing
assembler... but things are moving in the right direction: Tony
Ibbs has a nice meta-language and M.C. Fletcher his SimpleParse
to cover up these insufficiencies.

> >> OTOH, making lots of calls to analyze short strings is slow.
>
> > That's why mxTextTools converts these search idioms into byte
> > codes which it executes at C level. Some future version will
> > even "precompile" the tuple input and then omit the type checks
> > during the search...that should give another noticeable speedup.
> > Note that recursion etc. can be done at C level too -- Python
> > function calls are not needed.
>
> That's also the curse of having distinct languages; e.g., Python already had
> recursion, but you needed to reimplement it in a different way with
> different syntax and different rules in your pattern language.  In Icon etc,
> there's no difference between a recursive pattern and a recursive function,
> except in *what* it computes.  The machinery is all the same, and both more
> powerful and easier to learn because of that.

Agreed.

> > ...
> > Just for kicks, here is the mysplit() function using mxTextTools:
> >
> >     from mx.TextTools import *
> >
> >     table = (
> >         # Match all whitespace
> >         (None,AllInSet,whitespace_set,+1),
> >         # Match and tag all non-whitespace
> >         ('text',AllInSet + AppendMatch,nonwhitespace_set,+1),
> >         # Loop until EOF
> >         (None,EOF,Here,-2),
> >         )
> >
> >     def mysplit(text):
> >
> >         return tag(text,table)[1]
> >
> > The timings:
> > mysplit: 5.84 sec.
> > string.split: 3.62 sec.
> >
> > Note that you can customize the above to split text at any
> > character set you like, not just whitespace... without
> > compiling or writing C code.
>
> That's equally true of the example I posted.  Now what if I wanted to
> stop splitting right after I find a keyword, recognized as such because it's
> a key in some passed-in dictionary?  In my example, I make an obvious local
> code change, from
>
>     while s.notmany(white):  # consume non-whitespace
>         result.append(s.get_match())
>         s.many(white)
>
> to
>
>     while s.notmany(white):  # consume non-whitespace
>         word = s.get_match()
>         result.append(word)
>         if dictionary.has_key(word):
>             break
>         s.many(white)
>
> What does it do to your example?

You'd replace the 'text' tagobj with a callable object and
write AllInSet + CallTag as command. The Tagging Engine will
then call the object with arguments (taglist,text,l,r,subtags)
and let it decide what to do. In your example it would check
the dictionary and raise an exception in case a keyword is found
to stop any further scanning. If it's not a keyword, it would
simply append the found string to the taglist and return None.

Here's the code:

    from mx.TextTools import *
    import exceptions

    stoplist = {'abc':1, 'def':1}

    class KeywordFound(exceptions.StandardError):
        def __init__(self, taglist):
            self.taglist = taglist

    def callable(taglist,text,l,r,subtags):
        taglist.append(text[l:r])
        if stoplist.has_key(text[l:r]):
            raise KeywordFound(taglist)

    table = (
        # Match all whitespace
        (None,AllInSet,whitespace_set,+1),
        # Match and tag all non-whitespace
        (callable,AllInSet + CallTag,nonwhitespace_set,+1),
        # Loop until EOF
        (None,EOF,Here,-2),
        )

    def mysplitex(text):
        try:
            return tag(text,table)[1]
        except KeywordFound,data:
            return data.taglist

> Or what if the target string isn't "a
> string" (the code I posted only assumes the "str" object responds to
> indexing and slicing -- any buffer object is fine -- so my example doesn't
> change at all)?

The current version only handles string objects, but I am already
beginning to convert all the APIs in mxTextTools to "s#" or "t#" style
(can't decide which to use... "s#" is great for processing raw data,
while "t#" more closely refers to text processing).

> Or what if you need to pass the tokens on as they're found,
> pipeline style? Etc. This is why I do complex string processing in Icon
> <0.9 wink>.

You can have all that extra magic via callable tag objects or callable
matching functions. It's not exactly nice to write, but I'm sure that
a meta-language could do the conversions for you.

> OTOH, at what it does well, mxTextTools runs quicker than Icon. Its biggest
> problem has always been that e.g. nobody knows what the hell
>
>     (None,EOF,Here,-2),
>
> *means* at first glance -- or third.

The structure of those tag tables is very simple:

    (tagobject, command, argument[, jump offset in case of failure
                                  [, jump offset in case of success]])

Please remember that this is byte code, not some higher level
abstraction. The design is very much inverted from what you'd usually
do: design a nice language and then try to find a suitable set of byte
codes to make it work as intended.

Anyway, I'll keep focussing on the speed aspect of mxTextTools; others
can focus on abstractions, so that eventually everybody will be
happy :-)

Happy New Year,
--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: Get ready to party !
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From tim_one@email.msn.com Fri Dec 31 22:53:49 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Fri, 31 Dec 1999 17:53:49 -0500
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <012d01bf52b5$d6a3cb00$f29b12c2@secret.pythonware.com>
Message-ID: <000701bf53e1$e7119760$472d153f@tim>

[Fredrik Lundh, whose very nice eMatter book is on sale until the end of
 the 20th century (as real people think of it), although the eMatter
 distribution scheme has lots of problems
    [just an editorial note from a bot who has to-- for unknown
     reasons Fatbrain "is working on" --delete the Fatbrain registry
     tree and reregister the book almost every time he tries to open
     it]
]
> we have something called SIO which uses memory mapping
> where possible, and just a more aggressive read-ahead for
> other cases. on a windows box, a traditional while/readline
> loop runs 3-5 times faster than before. with SRE instead of
> re, a while/readline/match loop runs up to 10 times faster
> than before.
>
> note that this is without *any* changes to the Python
> source code...

If so, there's potential for significantly more speed. Python does its
line-at-a-time input with a character-at-a-time macro-in-a-loop, the same
way naive vendors (read "almost all vendors") implement fgets. It's
replacing that inner loop with direct peeking into the FILE buffer that gets
Perl its dramatic speed -- despite that Perl has fancier input functionality
(the oft-requested automagical "input record separator").

So it sounds like the Perl trick is orthogonal to SIO's tricks; Perl isn't
doing mmaps or read-aheads or anything else fancy under the covers -- it
only optimizes the inner loop!

> ...
> with a little luck, the new module will replace both pcre
> and regex...

If something more tangible than luck would help to make this come true, feel
free to mention it.

> not to mention that it's fairly easy to write your own front-
> end to the matching engine -- the expression parser and the
> compiler are both written in good old python.

Ah, good news / bad news.
Perl refugees aren't accustomed to "precompiling" regexp objects, so
write code that will cause regexps to get recompiled over & over. Even
if you cache the results under the covers, the overhead of the Python
call to the regexp compiler will likely take as long as the engine
takes to search. Personally, in such cases, I think they should learn
how to use the language <0.5 wink>.

From tim_one@email.msn.com Fri Dec 31 22:53:56 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Fri, 31 Dec 1999 17:53:56 -0500
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <386C9121.E9D9DC01@lemburg.com>
Message-ID: <000901bf53e1$eb4248c0$472d153f@tim>

>> This is why I do complex string processing in Icon <0.9 wink>.

[MAL]
> You can have all that extra magic via callable tag objects
> or callable matching functions. It's not exactly nice to
> write, but I'm sure that a meta-language could do the
> conversions for you.

That wasn't my point: I do it in Icon because it *is* "exactly nice to
write", and doesn't require any yet-another meta-language. It's all
straightforward, in a way that separate schemes pasted together can
never be (simply because they *are* "separate schemes pasted
together").

The point of my Python examples wasn't that they could do something
mxTextTools can't do, but that they were *Python* examples: every
variation I mentioned (or that you're likely to think of) was easy to
handle for any Python programmer because the "control flow" and "data
type" etc aspects could be handled exactly the way they always are in
*non* pattern-matching Python code too, rather than recoded in
pattern-scheme-specific different ways (e.g., where I had a vanilla
"if/break", you set up a special exception to tickle the matching
engine).

I'm not attacking mxTextTools, so don't feel compelled to defend it --
people using regexps in those examples are dead in the water.
mxTextTools is very good at what it does; if we have a real
disagreement, it's probably that I'm less optimistic about the
prospects for higher-level wrappers (e.g., MikeF's SimpleParse is much
slower than "a real" BNF parsing system (ARBNFPS), in part because he
isn't doing all the optimizations ARBNFPS does, but also in part
because ARBNFPS uses an underlying engine more optimized to its
specific task than mxTextTools' more-general engine *can* be). So I
don't see mxTextTools as being the answer to everything -- and if you
hadn't written it, you would agree with that on first glance.

> Anyway, I'll keep focussing on the speed aspect of mxTextTools;
> others can focus on abstractions, so that eventually everybody
> will be happy :-)

You and I will be, anyway.

From guido at CNRI.Reston.VA.US Wed Dec 1 18:32:08 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 12:32:08 -0500
Subject: [Python-Dev] Another 1.6 wish
In-Reply-To: Your message of "Fri, 19 Nov 1999 14:59:11 CST."
             <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com>
References: <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com>
Message-ID: <199912011732.MAA10419@eric.cnri.reston.va.us>

> My first Python-Dev post. :-)

Welcome!

> >We had some discussion a while back about enabling thread support by
> >default, if the underlying OS supports it obviously.

I agree with this. MacOS seems to be the only OS without threads these
days.

> What's the consensus about Python microthreads -- a likely candidate
> for incorporation in 1.6 (or later)?

What are microthreads? If you think about threads implemented in the
Python VM instead of in the OS, forget it.

> Also, we have a couple minor convenience functions for Python in an
> MSDEV environment, an exposure of OutputDebugString for writing to
> the DevStudio log window and a means of tripping DevStudio C/C++ layer
> breakpoints from Python code (currently experimental).
> The msvcrt module seems like a likely candidate for these, would these be
> welcome additions?

Sure -- send patches.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From petrilli at amber.org Wed Dec 1 18:39:00 1999
From: petrilli at amber.org (Christopher Petrilli)
Date: Wed, 1 Dec 1999 12:39:00 -0500
Subject: [Python-Dev] Another 1.6 wish
In-Reply-To: <199912011732.MAA10419@eric.cnri.reston.va.us>; from guido@CNRI.Reston.VA.US on Wed, Dec 01, 1999 at 12:32:08PM -0500
References: <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com> <199912011732.MAA10419@eric.cnri.reston.va.us>
Message-ID: <19991201123900.A7419@trump.amber.org>

Guido van Rossum [guido at CNRI.Reston.VA.US] wrote:
> > >We had some discussion a while back about enabling thread support by
> > >default, if the underlying OS supports it obviously.
>
> I agree with this. MacOS seems to be the only OS without threads
> these days.

I believe the new GUISI package has pthread-API compatible threads
implemented, which talk to the underlying ThreadManager. With MacOSX
being impending before 1.6 (i.e. early 2000), I'd say this is a good
way to go. Threads are VERY useful for a lot of problem domains.

Chris
--
| Christopher Petrilli
| petrilli at amber.org

From guido at CNRI.Reston.VA.US Wed Dec 1 18:54:53 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 12:54:53 -0500
Subject: [Python-Dev] Another 1.6 wish
In-Reply-To: Your message of "Wed, 01 Dec 1999 12:39:00 EST."
             <19991201123900.A7419@trump.amber.org>
References: <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com> <199912011732.MAA10419@eric.cnri.reston.va.us>
            <19991201123900.A7419@trump.amber.org>
Message-ID: <199912011754.MAA10465@eric.cnri.reston.va.us>

> > I agree with this. MacOS seems to be the only OS without threads
> > these days.
>
> I believe the new GUISI package has pthread-API compatible threads
> implemented, which talk to the underlying ThreadManager.
> With MacOSX being impending before 1.6 (i.e. early 2000), I'd say this is a good
> way to go. Threads are VERY useful for a lot of problem domains.

What's GUISI? The son of GUSI?

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at CNRI.Reston.VA.US Wed Dec 1 18:55:19 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 12:55:19 -0500
Subject: [Python-Dev] Another 1.6 wish
In-Reply-To: Your message of "Wed, 01 Dec 1999 12:32:08 EST."
             <199912011732.MAA10419@eric.cnri.reston.va.us>
References: <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com>
            <199912011732.MAA10419@eric.cnri.reston.va.us>
Message-ID: <199912011755.MAA10476@eric.cnri.reston.va.us>

> > Also, we have a couple minor convenience functions for Python in an
> > MSDEV environment, an exposure of OutputDebugString for writing to
> > the DevStudio log window and a means of tripping DevStudio C/C++ layer
> > breakpoints from Python code (currently experimental). The msvcrt
> > module seems like a likely candidate for these, would these be
> > welcome additions?
>
> Sure -- send patches.

I hadn't seen Mark Hammond's response -- I take it back.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at CNRI.Reston.VA.US Wed Dec 1 19:15:26 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 13:15:26 -0500
Subject: [Python-Dev] Another 1.6 wish
In-Reply-To: Your message of "Sat, 20 Nov 1999 11:04:28 +1100."
             <005f01bf32ea$d0b82b90$0501a8c0@bobcat>
References: <005f01bf32ea$d0b82b90$0501a8c0@bobcat>
Message-ID: <199912011815.NAA10506@eric.cnri.reston.va.us>

> This is really a pointer to the fact that some or all of the win32api
> should be moved into the core - registry access is the thing people
> most want, but there are plenty of other useful things that people
> regularly use...
>
> Guido objects to the coding style, but hopefully that won't be a big
> issue.
> IMO, the coding style isn't "bad" - it is just more an "MS"
> flavour than a "Python" flavour - presumably people reading the code
> will have some experience with Windows, so it won't look completely
> foreign to them. The good thing about taking it "as-is" is that it
> has been fairly well bashed on over a few years, so is really quite
> stable. The final "coding style" issue is that there are no "doc
> strings" - all documentation is embedded in C comments, and extracted
> using a tool called "autoduck" (similar to "autodoc"). However, I'm
> sure we can arrange something there, too.

That's a good summary of the status quo. I would appreciate it if
win32all could become part of the core. However the coding style
issues need to be addressed (I also believe that it needs to be
compiled in C++ mode).

One concern that Mark doesn't mention is that there are some safety
issues -- you can abuse some of the calls to cause segfaults, whether
intentional or by mistake, and that's not a good thing.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at CNRI.Reston.VA.US Wed Dec 1 19:55:40 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 13:55:40 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Your message of "Wed, 24 Nov 1999 09:43:57 EST."
             <383BF9AD.E183FB98@interet.com>
References: <383BF9AD.E183FB98@interet.com>
Message-ID: <199912011855.NAA10662@eric.cnri.reston.va.us>

> I would like to argue that on Windows, import of dynamic libraries is
> broken. If a file something.pyd is imported, then sys.path is searched
> to find the module. If a file something.dll is imported, the same thing
> happens. But Windows defines its own search order for *.dll files which
> Python ignores. I would suggest that this is wrong for files named
> *.dll,
> but OK for files named *.pyd.

I think you misunderstand some of the issues.

Python cannot import every .dll file.
Only .dll files that conform to the convention for Python extension
modules can be imported. (The convention is that it must export an
init function.)

On most other platforms, shared libraries must have a specific
extension (e.g. .so on most Unix). Python allows you to drop such a
file into any directory where it looks for modules, and it will then
direct the dynamic load support to load that specific file.

This seems logical -- Python extensions must live in directories that
Python searches (Python must do its own search because the search
order is significant).

On Windows, Python uses the same strategy. The only modification is
that it is allowed to give the file a different extension, namely
.pyd, to indicate that this really is a Python extension and not a
regular DLL. This was mostly introduced because it is apparently
common to have an existing DLL "foo.dll" and write a Python wrapper
for it that is also called "foo". Clearly, two files foo.dll are too
confusing, so we let you name the wrapper foo.pyd. But because the
file format is essentially that of a DLL, we don't *require* this
renaming; some ways of creating DLLs in the first place may make it
difficult to do.

> A SysAdmin should be able to install and maintain *.dll as she has
> been trained to do. This makes maintaining Python installations
> simpler and more un-surprising.

I don't see that a SysAdmin needs to do much DLL management. This is
up to installer scripts. Anyway how hard can it be for a SysAdmin to
leave DLLs in specific directories alone?

> I have no solution to the backward compatibility problem. But the
> code is only a couple lines. A LoadLibrary() call does its own
> path searching.

But at what point should this LoadLibrary() call be called? The
import statement contains no clue that a DLL is requested -- the
sys.path search reveals that.

I claim that there is nothing wrong with the current strategy.
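
The claim that the search order is significant can be illustrated with a
minimal sketch. This is not Python's actual import code: find_module,
SUFFIXES, and the suffix precedence inside one directory are assumptions
made for illustration, and the exists parameter is there only so the
sketch can run against a made-up file listing.

```python
import os

# Assumed precedence of file types inside a single directory (illustrative only).
SUFFIXES = (".pyd", ".dll", ".py")

def find_module(name, search_path, exists=os.path.exists):
    """Return the first file for `name` found while walking search_path in order.

    Hypothetical simplification: directories are tried in order, so an
    earlier directory always wins over a later one, whatever the file type.
    """
    for directory in search_path:
        for suffix in SUFFIXES:
            candidate = os.path.join(directory, name + suffix)
            if exists(candidate):
                return candidate
    return None

# A made-up layout: foo.py lives in directory A, foo.dll lives in directory B.
files = {os.path.join("A", "foo.py"), os.path.join("B", "foo.dll")}
print(find_module("foo", ["A", "B"], exists=files.__contains__))
print(find_module("foo", ["B", "A"], exists=files.__contains__))
```

With the path ordered A, B the .py file is found first; with the order
reversed the .dll is found first. A search that handed every DLL lookup
to the operating system could not express both outcomes.
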

--Guido van Rossum (home page: http://www.python.org/~guido/)

From bwarsaw at cnri.reston.va.us Wed Dec 1 20:01:12 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Wed, 1 Dec 1999 14:01:12 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS log messages with diffs
References: <199911161700.MAA02716@eric.cnri.reston.va.us>
            <14389.31511.706588.20840@anthem.cnri.reston.va.us>
Message-ID: <14405.28792.184298.298597@anthem.cnri.reston.va.us>

>>>>> "BAW" == Barry A Warsaw writes:

    BAW> There was a suggestion to start augmenting the checkin emails
    BAW> to include the diffs of the checkin. This would let you keep
    BAW> a current snapshot of the tree without having to do a direct
    BAW> `cvs update'.

The voting has stopped, with the "yeah" vote slightly ahead of the
"nay" vote. We'll go with context diffs, and we'll be implementing
Greg Stein's approach with the xml-checkins list: truncating diffs to
H number of lines at the top and T number of lines at the bottom, so
as not to overwhelm incoming email.

I'll try to get this going sometime today (no promises). You'll
likely see a number of tests coming through python-checkins in the
meantime. I'll send a message out when it's done.

-Barry

From da at ski.org Wed Dec 1 20:34:56 1999
From: da at ski.org (David Ascher)
Date: Wed, 1 Dec 1999 11:34:56 -0800 (Pacific Standard Time)
Subject: [Python-Dev] Re: [C++-SIG] Python calling C++ issues
In-Reply-To: <14405.25141.297349.76968@gargle.gargle.HOWL>
Message-ID:

On Wed, 1 Dec 1999, Geoffrey Furnish wrote:

[...]

> Well, like I said above, I haven't analyzed your posts for technical
> details, so I can't say whether you made avoidable mistakes. But I
> definitely do agree with you that it is roughly 100 times harder than
> it needs to be, to use Python from C++. The charter of this sig is to
> fix that, by developing the additional software that would allow
> Python's compiled interface to be exploited from C++ "with ease".
>
> The first and most basic issue, is compiling Python so it initializes
> C++ global objects correctly. There is a patch on the sig's www site
> to help with that.

Any opinions from this esteemed body re: integrating said patch in the
main tree?

--david

From jim at interet.com Wed Dec 1 20:47:14 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Wed, 01 Dec 1999 14:47:14 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <383BF9AD.E183FB98@interet.com> <199912011855.NAA10662@eric.cnri.reston.va.us>
Message-ID: <38457B42.85552AC@interet.com>

Guido van Rossum wrote:
>
> > I would like to argue that on Windows, import of dynamic libraries is
> > broken. If a file something.pyd is imported, then sys.path is searched
> > to find the module. If a file something.dll is imported, the same thing
> > happens. But Windows defines its own search order for *.dll files which
> > Python ignores. I would suggest that this is wrong for files named
> > *.dll,
> > but OK for files named *.pyd.
>
> I think you misunderstand some of the issues.
>
> Python cannot import every .dll file. Only .dll files that conform to
> the convention for Python extension modules can be imported. (The
> convention is that it must export an init function.)

Of course I meant that the test is LoadLibrary(module) followed by
GetProcAddress(h, "init" + module). Both must succeed.

> This seems logical -- Python extensions must live in directories that
> Python searches (Python must do its own search because the search
> order is significant).

The PYTHONPATH search path is what I am trying to get away
from. If I eliminate PYTHONPATH I still can not use the
Windows DLL search path (which is superior) because DLLs
are searched on PYTHONPATH too; thus my post. I don't believe
it is important for Python module.dll to be located on PYTHONPATH.

> > A SysAdmin should be able to install and maintain *.dll as she has
> > been trained to do.
> > This makes maintaining Python installations
> > simpler and more un-surprising.
>
> I don't see that a SysAdmin needs to do much DLL management. This is
> up to installer scripts. Anyway how hard can it be for a SysAdmin to
> leave DLLs in specific directories alone?

The problem is maintaining PYTHONPATH plus having DLL's on a
non-standard search path. Yes, PythonDev[:] and professional
SysAdmins can do it. But it is not as simple as it could be.
Someone has to write the install scripts. And what if something
doesn't work? Think of Python being used as a teaching language
for the 8th grade. Think of the 8th grade teacher trying to get
all this right. The only thing that works is simplicity.

> But at what point should this LoadLibrary() call be called? The
> import statement contains no clue that a DLL is requested -- the
> sys.path search reveals that.

Just after built-in and frozen modules.

> I claim that there is nothing wrong with the current strategy.

Thank you for thoughtfully considering and commenting at length
on this issue. Let's ignore it for the moment. The other problems
with PYTHONPATH are more pressing. But if those issues are solved,
this one will stick out.

JimA

From da at ski.org Wed Dec 1 20:59:44 1999
From: da at ski.org (David Ascher)
Date: Wed, 1 Dec 1999 11:59:44 -0800 (Pacific Standard Time)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <38457B42.85552AC@interet.com>
Message-ID:

On Wed, 1 Dec 1999, James C. Ahlstrom wrote:

> > This seems logical -- Python extensions must live in directories that
> > Python searches (Python must do its own search because the search
> > order is significant).
>
> The PYTHONPATH search path is what I am trying to get away
> from. If I eliminate PYTHONPATH I still can not use the
> Windows DLL search path (which is superior) because DLLs
> are searched on PYTHONPATH too; thus my post. I don't believe
> it is important for Python module.dll to be located on PYTHONPATH.

Why is the DLL search path superior?

In my experience, the DLL search path (PATH for short) is problematic
because it requires either using the System control panel or modifying
autoexec.bat, both of which can have massive systemic effects completely
unrelated to Python if a mistake is made during the modification.

On UNIX, the equivalent to Windows' PATH is typically LD_LIBRARY_PATH,
although I think there are significant variations in how that works across
platforms. Most beginning unix users have no idea how to modify their
LD_LIBRARY_PATH, as they typically don't understand the configuration
mechanisms on Unix (system vs. user-specific, login vs. shell-specific,
different shell configuration languages, etc.).

I know it's not what you had in mind, but have you tried doing something
like:

    import sys, os, string
    sys.path.extend(string.split(os.environ['PATH'], ';'))

--david

From gmcm at hypernet.com Wed Dec 1 21:19:13 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Wed, 1 Dec 1999 15:19:13 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To:
References: <38457B42.85552AC@interet.com>
Message-ID: <1268042932-41354568@hypernet.com>

David Ascher wrote:

> On Wed, 1 Dec 1999, James C. Ahlstrom wrote:
>
> > > This seems logical -- Python extensions must live in
> > > directories that Python searches (Python must do its own
> > > search because the search order is significant).
> >
> > The PYTHONPATH search path is what I am trying to get away
> > from. If I eliminate PYTHONPATH I still can not use the
> > Windows DLL search path (which is superior) because DLLs are
> > searched on PYTHONPATH too; thus my post. I don't believe it
> > is important for Python module.dll to be located on PYTHONPATH.
>
> Why is the DLL search path superior?
>
> In my experience, the DLL search path (PATH for short)

Make that:

    [ os.path.dirname(sys.executable),
      os.getcwd(),
      win32api.GetSystemDirectory(),
      os.path.join(win32api.GetSystemDirectory(), '../SYSTEM'),
      win32api.GetWindowsDirectory()
    ] + string.split(os.environ['PATH'], ';')

> is
> problematic because it requires either using the System control
> panel or modifying autoexec.bat, both of which can have massive
> systemic effects completely unrelated to Python if a mistake is
> made during the modification.

Hear, hear!

[snip]

- Gordon

From jim at interet.com Wed Dec 1 21:36:04 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Wed, 01 Dec 1999 15:36:04 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References:
Message-ID: <384586B4.48905B32@interet.com>

David Ascher wrote:
> Why is the DLL search path superior?
>
> In my experience, the DLL search path (PATH for short) is problematic
> because it requires either using the System control panel or modifying
> autoexec.bat, both of which can have massive systemic effects completely
> unrelated to Python if a mistake is made during the modification.

I agree that altering PATH is problematic. So is altering PYTHONPATH
and for exactly the same reason. That is why I think PYTHONPATH is
a bad idea.

The reason the DLL search path is superior is that it is not just
PATH. It defines a path which includes the install directory
of the application plus the system directories, and this path
is discovered at runtime. So it is not necessary to set
a global PYTHONPATH, nor make registry entries, nor do anything
at all. It Just Works.

The Windows DLL search path is:

1) The directory of the executable program.
   That means you can just throw all your DLL's in with
   the *.exe's, and it all Just Works.

2) The current directory. Also useful.

3) The Windows system directory (call GetSystemDirectory() to get this).

4) The Windows directory (call GetWindowsDirectory() to get this).
   These two directories are used for system files. Think of
   /sbin, /bin. Windows apps usually throw some of their DLL's
   here, especially if they are of general interest.

5) The directories in PATH.
   This is relatively useless, and AFAIK it is seldom used in
   a real installation. It is a left-over from DOS. That is also
   why it appears last.

> On UNIX, the equivalent to Windows' PATH is typically LD_LIBRARY_PATH,
> although I think there are significant variations in how that works across
> platforms. Most beginning unix users have no idea how to modify their
> LD_LIBRARY_PATH, as they typically don't understand the configuration
> mechanisms on Unix (system vs. user-specific, login vs. shell-specific,
> different shell configuration languages, etc.).

I agree.

>
> I know it's not what you had in mind, but have you tried doing something
> like:
>
> import sys, os, string
> sys.path.extend(string.split(os.environ['PATH'], ';'))

Adding PATH (or anything else) to PYTHONPATH is making it worse. Have
you tried "import sys; print sys.path" on Windows? It is junk.

JimA

From jim at interet.com Wed Dec 1 21:44:00 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Wed, 01 Dec 1999 15:44:00 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <38457B42.85552AC@interet.com> <1268042932-41354568@hypernet.com>
Message-ID: <38458890.BCB36FE2@interet.com>

Gordon McMillan wrote:
> Make that:
> [ os.path.dirname(sys.executable),
>   os.getcwd(),
>   win32api.GetSystemDirectory(),
>   os.path.join(win32api.GetSystemDirectory(), '../SYSTEM'),
>   win32api.GetWindowsDirectory()
> ] + string.split(os.environ['PATH'], ';')

Very nice! "../SYSTEM" needed on NT I guess.

JimA

From fredrik at pythonware.com Wed Dec 1 21:56:16 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 1 Dec 1999 21:56:16 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <384586B4.48905B32@interet.com>
Message-ID: <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com>

James C.
Ahlstrom wrote:
> Adding PATH (or anything else) to PYTHONPATH is making it worse. Have
> you tried "import sys; print sys.path" on Windows? It is junk.

not on my machine.

it would help if you stopped assuming that every-
one have the same problems as you have. we've
distributed several python apps on windows, and
frankly, I don't understand what you're talking
about.

From jim at interet.com Wed Dec 1 22:26:37 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Wed, 01 Dec 1999 16:26:37 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com>
Message-ID: <3845928D.C0462322@interet.com>

Fredrik Lundh wrote:
> > you tried "import sys; print sys.path" on Windows? It is junk.
>
> not on my machine.

On my Windows machine I get:

['', '.', 'N:/prd/winlease/vest', '.\\DLLs', '.\\lib',
 '.\\lib\\plat-win', '.\\lib\\lib-tk', 'f:\\bin']

PYTHONPATH is N:/prd/winlease/vest.
os.path.dirname(sys.executable) is F:/bin.
The others are junk. What do you get? Did
you change sys.path from the default?

> it would help if you stopped assuming that every-
> one have the same problems as you have. we've
> distributed several python apps on windows, and
> frankly, I don't understand what you're talking
> about.

We distribute our app by freezing all *.py files into
a DLL, and we don't set PYTHONPATH on the target machine.
The files are located with the executable file and are found
there. This works fine and we don't have a problem with it.

It would help me a lot if you could describe how you
distribute your app. Do you set PYTHONPATH on the target
machine?

JimA

From da at ski.org Wed Dec 1 22:41:31 1999
From: da at ski.org (David Ascher)
Date: Wed, 1 Dec 1999 13:41:31 -0800 (Pacific Standard Time)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <384586B4.48905B32@interet.com>
Message-ID:

On Wed, 1 Dec 1999, James C.
Ahlstrom wrote:

> > In my experience, the DLL search path (PATH for short) is problematic
> > because it requires either using the System control panel or modifying
> > autoexec.bat, both of which can have massive systemic effects completely
> > unrelated to Python if a mistake is made during the modification.
>
> I agree that altering PATH is problematic. So is altering PYTHONPATH
> and for exactly the same reason. That is why I think PYTHONPATH is
> a bad idea.

I see. Thanks for the explanation. I didn't know the complete story of
the "Windows DLL search path". BTW, I think a huge difference b/w
PYTHONPATH and PATH is the system-wide nature of PATH, vs. the
Python-restricted nature of PYTHONPATH.

--david

From mhammond at skippinet.com.au Wed Dec 1 23:29:38 1999
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Thu, 2 Dec 1999 09:29:38 +1100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To:
Message-ID: <009c01bf3c4b$8f119090$0501a8c0@bobcat>

> I see. Thanks for the explanation. I didn't know the
> complete story of
> the "Windows DLL search path". BTW, I think a huge difference b/w
> PYTHONPATH and PATH is the system-wide nature of PATH, vs. the
> Python-restricted nature of PYTHONPATH.

And more to the point - and the critical distinction - is that
PYTHONPATH is actually specific to the Python _app_, not just Python
on the machine. Sure - the standard Python installation puts a
"default" PYTHONPATH suitable for general purpose development - but
any distributed application _can_ define their own PYTHONPATH that is
independent of any other Python systems or applications. People have
been doing this for years, including MS :-)

Sorry Jim, but count this as another vote against it - which isn't to
argue that the current system is perfect, simply (IMO) better than the
Windows path and DLL search order.

Mark.
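
As a rough sketch of an application defining its own module search path
rather than depending on a machine-wide setting: the function name
app_search_path and the default subdirectory names are assumptions made
for illustration, not part of any installer discussed in this thread.

```python
import os
import sys

def app_search_path(executable, extra_dirs=("lib", "DLLs")):
    """Build a module search path relative to the application's own directory.

    Hypothetical helper: the application's directory comes first, followed
    by the named subdirectories inside it.
    """
    home = os.path.dirname(os.path.abspath(executable))
    return [home] + [os.path.join(home, d) for d in extra_dirs]

# An application could do this once at startup, before its other imports:
#     sys.path[:0] = app_search_path(sys.executable)
print(app_search_path(sys.executable))
```

Because the list is derived from the location of the executable at
runtime, two applications installed in different directories get
different search paths without touching any global environment variable.
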

From guido at CNRI.Reston.VA.US Thu Dec 2 00:00:21 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 18:00:21 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Your message of "Wed, 01 Dec 1999 16:26:37 EST."
             <3845928D.C0462322@interet.com>
References: <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com>
            <3845928D.C0462322@interet.com>
Message-ID: <199912012300.SAA10861@eric.cnri.reston.va.us>

> Fredrik Lundh wrote:
> > > you tried "import sys; print sys.path" on Windows? It is junk.
> >
> > not on my machine.
>
> On my Windows machine I get:
>
> ['', '.', 'N:/prd/winlease/vest', '.\\DLLs', '.\\lib',
> '.\\lib\\plat-win', '.\\lib\\lib-tk', 'f:\\bin']
>
> PYTHONPATH is N:/prd/winlease/vest.
> os.path.dirname(sys.executable) is F:/bin.
> The others are junk. What do you get? Did
> you change sys.path from the default?

You must not have used the standard Python installer; if you had used
it you wouldn't have had this problem (and perhaps we wouldn't have
had this discussion).

The problem is that you apparently have installed python.exe in
f:\bin. "Modern" Python versions execute some code at startup that
comes up with a suitable value for sys.path; the Windows version of
this code is in PC/getpathp.c -- I recommend that you study it.

This code tries to find the Python install directory by looking for a
"landmark" file relative to the executable path, and then adds a bunch
of directory entries to the path relative to the install directory.
If it fails, it defaults to "." for the install directory. The
entries '.\\DLLs', '.\\lib', '.\\lib\\plat-win', '.\\lib\\lib-tk' are
all a result of this failing.

As long as this works, there is no need for the user (or anyone) to
ever set the PYTHONPATH variable -- that variable is only needed to
add directories in front of sys.path for stuff that getpathp.c doesn't
know about (e.g. PIL, Numeric, etc.).
With packagized versions of those modules, even that won't be necessary, because the packages will be dropped in the Python install directory (typically C:\Program Files\Python). I believe that most of your desire to get rid of PYTHONPATH comes from your insistence to bypass the default installer. There's probably a way to install your app in such a way that the getpathp.c algorithm actually succeeds? There's also a separate env variable, PYTHONHOME, which overrides the Python install directory; if getpathp.c sees that it is set, it will bypass the search relative to the executable's path. I take blame for not documenting all this well enough. However I wish you stopped criticizing the design -- I think the design is quite solid. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at CNRI.Reston.VA.US Thu Dec 2 00:09:43 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 01 Dec 1999 18:09:43 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: Your message of "Wed, 01 Dec 1999 14:47:14 EST." <38457B42.85552AC@interet.com> References: <383BF9AD.E183FB98@interet.com> <199912011855.NAA10662@eric.cnri.reston.va.us> <38457B42.85552AC@interet.com> Message-ID: <199912012309.SAA10873@eric.cnri.reston.va.us> > > This seems logical -- Python extensions must live in directories that > > Python searches (Python must do its own search because the search > > order is significant). > > The PYTHONPATH search path is what I am trying to get away > from. If I eliminate PYTHONPATH I still can not use the > Windows DLL search path (which is superior) because DLLs > are searched on PYTHONPATH too; thus my post. I don't believe > it is important for Python module.dll to be located on PYTHONPATH. But I do. First of all, I'm not sure whether you're talking here about sys.path or PYTHONPATH. As I explained in a previous post, you should normally not have to set PYTHONPATH at all. Let's assume you really meant sys.path. 
Let's assume sys.path is [A, B]. Let's assume there's a foo.py and a
foo.dll. If foo.py lives in A and foo.dll lives in B, then import foo
should load foo.py. If it's the other way around, it should load
foo.dll.

If we were to use the default DLL search path, there's no way that we
can get this behavior: either you have to look for a DLL first, which
means there's no way for foo.py to override foo.dll, or you have to
look for a DLL last, and then there's no way for a foo.dll to override
foo.py.

It is desirable that both overrides are possible: we want to be able
to have foo.dll override foo.py, because perhaps foo.py should only be
used when for some reason foo.dll can't be loaded (say foo.py does the
same thing only slower); but we also want to be able to have foo.py
override foo.dll (by simply placing it in a directory that's earlier
on the path) e.g. in a situation where the dll version does something
undesirable and we want to create a safe substitute. (Deleting files
is not always an option.)

> The problem is maintaining PYTHONPATH plus having DLL's on a
> non-standard search path.

I've commented already that PYTHONPATH maintenance is probably a red
herring due to your non-standard install. I'm not sure what the
problem is with having a DLL on a non-std path?

> Yes, PythonDev[:] and professional
> SysAdmins can do it. But it is not as simple as it could be.
> Someone has to write the install scripts.

The distutil-sig (a.k.a. Greg Ward :-) is taking care of this as we
speak.

> And what if something
> doesn't work? Think of Python being used as a teaching language
> for the 8th grade. Think of the 8th grade teacher trying to get
> all this right. The only thing that works is simplicity.

We will provide an installer that Just Works [tm].

> > But at what point should this LoadLibrary() call be called? The
> > import statement contains no clue that a DLL is requested -- the
> > sys.path search reveals that.
>
> Just after built-in and frozen modules.
See my long comment above.

> > I claim that there is nothing wrong with the current strategy.
>
> Thank you for thoughtfully considering and commenting at length
> on this issue. Let's ignore it for the moment. The other
> problems with PYTHONPATH are more pressing. But if those
> issues are solved, this one will stick out.

And those other issues should be resolved in a different way than what
you have been proposing. See other post.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at CNRI.Reston.VA.US Thu Dec 2 00:11:28 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 18:11:28 -0500
Subject: [Python-Dev] Re: [C++-SIG] Python calling C++ issues
In-Reply-To: Your message of "Wed, 01 Dec 1999 11:34:56 PST."
References:
Message-ID: <199912012311.SAA10888@eric.cnri.reston.va.us>

> > The first and most basic issue, is compiling Python so it initializes
> > C++ global objects correctly. There is a patch on the sig's www site
> > to help with that.
>
> Any opinions from this esteemed body re: integrating said patch in the
> main tree?

I presume you meant me :-)

I'll give it a try tonight.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From jeremy at cnri.reston.va.us Thu Dec 2 00:24:06 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Wed, 1 Dec 1999 18:24:06 -0500 (EST)
Subject: [Python-Dev] copies of python-dev messages between 11/24 and 12/01
Message-ID: <14405.44566.832799.96438@goon.cnri.reston.va.us>

It looks like there has been some mail glitch that resulted in no
digests being sent between 11/26 and 12/01 and no messages being
archived between 11/24 and 12/01. Does anyone keep a personal archive
that has those messages? I'd like to read them.
Jeremy

From guido at CNRI.Reston.VA.US Thu Dec 2 00:28:14 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 18:28:14 -0500
Subject: [Python-Dev] copies of python-dev messages between 11/24 and 12/01
In-Reply-To: Your message of "Wed, 01 Dec 1999 18:24:06 EST." <14405.44566.832799.96438@goon.cnri.reston.va.us>
References: <14405.44566.832799.96438@goon.cnri.reston.va.us>
Message-ID: <199912012328.SAA12879@eric.cnri.reston.va.us>

> It looks like there has been some mail glitch that resulted in no
> digests being sent between 11/26 and 12/01 and no messages being
> archived between 11/24 and 12/01. Does anyone keep a personal archive
> that has those messages? I'd like to read them.

I do :-)

I'll provide Jeremy with an archive.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From bwarsaw at cnri.reston.va.us Thu Dec 2 05:24:03 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Wed, 1 Dec 1999 23:24:03 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS log messages with diffs
References: <199911161700.MAA02716@eric.cnri.reston.va.us> <14389.31511.706588.20840@anthem.cnri.reston.va.us>
Message-ID: <14405.62563.345566.500106@anthem.cnri.reston.va.us>

Okay folks, I think I've got the diff thing working now. The trick
(for you CVS heads) was that you can't do a `cvs diff' while you're
executing a loginfo script. Lock contention (repeat after me: "I Love
CVS!").

Anyway, let's see how you all like it. Note that based on a
suggestion by Greg Stein, seconded by GvR, I do not send out the
entire diff of every file (which could potentially be huge). I send
out 20 lines from the head of the diff and 20 lines from the tail, and
suppress everything in between. Those numbers can be easily tweaked,
and I'm not sure what the ideal is. Let's see what the emails look
like when stuff starts getting checked in.
Enjoy, -Barry From jack at oratrix.nl Thu Dec 2 12:00:45 1999 From: jack at oratrix.nl (Jack Jansen) Date: Thu, 02 Dec 1999 12:00:45 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: Message by Guido van Rossum , Wed, 01 Dec 1999 18:09:43 -0500 , <199912012309.SAA10873@eric.cnri.reston.va.us> Message-ID: <19991202110045.96F33370CF2@snelboot.oratrix.nl> On the Mac I've introduced "magic cookies" into sys.path, which allow you to do interesting searches (like searching for a DLL or PYC-resource in the application itself) at known places in the import process. There isn't a cookie for "search along the standard MacOS dll search path" (which is somewhat similar to the Windows dll search path) because I haven't seen a reason for it, but there's nothing to stop it. And if you'd insert that cookie it would be perfectly clear (at least, it should be) that only dll modules will be found in that step, not .py modules. Actually I'm so happy with the magic cookie scheme that I've advocated at various times in the past that something similar also be used for determining where builtin modules and frozen modules appear in sys.path... -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From guido at CNRI.Reston.VA.US Thu Dec 2 12:59:34 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 02 Dec 1999 06:59:34 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: Your message of "Thu, 02 Dec 1999 12:00:45 +0100." 
<19991202110045.96F33370CF2@snelboot.oratrix.nl> References: <19991202110045.96F33370CF2@snelboot.oratrix.nl> Message-ID: <199912021159.GAA13732@eric.cnri.reston.va.us> > On the Mac I've introduced "magic cookies" into sys.path, which > allow you to do interesting searches (like searching for a DLL or > PYC-resource in the application itself) at known places in the > import process. > There isn't a cookie for "search along the standard MacOS dll search > path" (which is somewhat similar to the Windows dll search path) > because I haven't seen a reason for it, but there's nothing to stop > it. And if you'd insert that cookie it would be perfectly clear (at > least, it should be) that only dll modules will be found in that > step, not .py modules. > Actually I'm so happy with the magic cookie scheme that I've > advocated at various times in the past that something similar also > be used for determining where builtin modules and frozen modules > appear in sys.path... I see the magic cookies as a poor man's (but more compatible!) version of a chain of importers as advocated by Greg Stein and other imputil fans. I like the idea, except that I think that the chain should be manipulatable more easily than the current imputil implementation. (I'll have more comments on Greg's comments later, when I've actually read them through.) --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein at lyra.org Thu Dec 2 13:09:40 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 2 Dec 1999 04:09:40 -0800 (PST) Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: <199912021159.GAA13732@eric.cnri.reston.va.us> Message-ID: On Thu, 2 Dec 1999, Guido van Rossum wrote: >... > I see the magic cookies as a poor man's (but more compatible!) version > of a chain of importers as advocated by Greg Stein and other imputil > fans. I like the idea, except that I think that the chain should be > manipulatable more easily than the current imputil implementation. 
> (I'll have more comments on Greg's comments later, when I've actually
> read them through.)

Anything in sys.path that is not a string pointing to a directory is
not very compatible. My current proposal keeps the existing semantics
for sys.path (the proposal adds functionality thru other mechanisms,
rather than changing/interfering with existing ones).

I look forward to your comments! I'll definitely provide new solutions
where you find problems :-)

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From fredrik at pythonware.com Thu Dec 2 13:53:03 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 2 Dec 1999 13:53:03 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <19991202110045.96F33370CF2@snelboot.oratrix.nl> <199912021159.GAA13732@eric.cnri.reston.va.us>
Message-ID: <005e01bf3cc4$2e338c50$f29b12c2@secret.pythonware.com>

Guido van Rossum wrote:
> > Actually I'm so happy with the magic cookie scheme that I've
> > advocated at various times in the past that something similar also
> > be used for determining where builtin modules and frozen modules
> > appear in sys.path...
>
> I see the magic cookies as a poor man's (but more compatible!) version
> of a chain of importers as advocated by Greg Stein and other imputil
> fans. I like the idea, except that I think that the chain should be
> manipulatable more easily than the current imputil implementation.

I know this has been asked before, but cannot recall
any of the arguments against it: how about replacing
Jack's magic cookies with importer objects?

(in other words, if a path item is a string, import as
usual. otherwise, ask the importer for a code object
or maybe better, a module object).
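
The path-item idea in the message above (strings are searched as directories, anything else is asked for the module) can be sketched in a few lines. This is only an illustration written in present-day Python, not code from imputil or from any proposal in this thread: the names DictImporter, get_module and import_from_path are invented for the example, and only plain .py files are handled on the string branch.

```python
import os
import tempfile
import types


class DictImporter:
    """Hypothetical importer object: serves module source from a dict."""

    def __init__(self, sources):
        self.sources = sources

    def get_module(self, name):
        source = self.sources.get(name)
        if source is None:
            return None  # not ours; let the next path item try
        module = types.ModuleType(name)
        exec(source, module.__dict__)
        return module


def import_from_path(name, path):
    """Walk `path` in order: strings are directories, other items are asked."""
    for item in path:
        if isinstance(item, str):
            # "import as usual" (reduced here to looking for name.py)
            filename = os.path.join(item, name + ".py")
            if os.path.isfile(filename):
                module = types.ModuleType(name)
                with open(filename) as f:
                    exec(f.read(), module.__dict__)
                return module
        else:
            # non-string path item: ask the importer object for a module
            module = item.get_module(name)
            if module is not None:
                return module
    raise ImportError(name)


# Small demonstration: one importer object and one real directory on the path.
with tempfile.TemporaryDirectory() as directory:
    with open(os.path.join(directory, "spam.py"), "w") as f:
        f.write("value = 1\n")
    search_path = [DictImporter({"eggs": "value = 2\n"}), directory]
    spam = import_from_path("spam", search_path)
    eggs = import_from_path("eggs", search_path)
```

Because the importer object is just another entry in the list, its position relative to the directories decides who wins when both could supply a module, which is the same ordering property discussed for foo.py and foo.dll earlier in the thread.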
From jack at oratrix.nl Thu Dec 2 14:23:31 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Thu, 02 Dec 1999 14:23:31 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Message by "Fredrik Lundh" , Thu, 2 Dec 1999 13:53:03 +0100 , <005e01bf3cc4$2e338c50$f29b12c2@secret.pythonware.com>
Message-ID: <19991202132331.E3F8D370CF2@snelboot.oratrix.nl>

> > I see the magic cookies as a poor man's (but more compatible!) version
> > of a chain of importers as advocated by Greg Stein and other imputil
> > fans. [...]
>
> I know this has been asked before, but cannot recall
> any of the arguments against it: how about replacing
> Jack's magic cookies with importer objects?

For the record: I definitely agree with both comments here. The only
thing that would need solving (but maybe it already is? Greg?) is the
external representation of an importer, as I'd definitely want to be
able to name them in PYTHONPATH (or the mac equivalent).

--
Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm

From jim at interet.com Thu Dec 2 15:19:31 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 02 Dec 1999 09:19:31 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <009c01bf3c4b$8f119090$0501a8c0@bobcat>
Message-ID: <38467FF3.D938EE4@interet.com>

Mark Hammond wrote:
> Sure - the standard Python installation puts a "default" PYTHONPATH
> suitable for general purpose development - but any distributed
> application _can_ define their own PYTHONPATH that is independent of
> any other Python systems or applications. People have been doing this
> for years, including MS :-)

How is this done?

> Sorry Jim, but count this as another vote against it - which isn't to
> argue that the current system is perfect, simply (IMO) better than the
> Windows path and DLL search order.

Sigh.....
JimA From jim at interet.com Thu Dec 2 16:49:10 1999 From: jim at interet.com (James C. Ahlstrom) Date: Thu, 02 Dec 1999 10:49:10 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us> Message-ID: <384694F6.E5D74221@interet.com> Guido van Rossum wrote: > You must not have used the standard Python installer; if you had used > it you wouldn't have had this problem (and perhaps we wouldn't have > had this discussion). Correct, I did not use the standard Python installer. I compiled Python from the source distribution. There are good reasons for this in my case. First, my real issue is how to DISTRIBUTE Python programs, not to get Python working on my own machine. We have 12 machines on a network. It is not acceptable to run a Python installation script on every one of them just to run a simple Python program. OK, I guess I could do 12, but what about a larger company? And we ship to hundreds of customers. I can distribute simple C or C++ programs without a hassle, why not Python? It is not acceptable to ask our customers to run a separate Python installer. We have our own Wise installer to install our software. Every commercial vendor has Wise, Install Shield or other installer in place. No commercial vendor is going to abandon Wise et al. and move to The Official Python Installer because it will not have the features of Wise (such as binary patches across the network), and because what it does won't be documented, and because it is Just Different. Second, I can not run ANY installer on my development machine, Python or otherwise. This is a general Windows problem not specific to Python. Right now our help system is broken on every office machine except the one where the help system installer was run (where we develop help). If I run a Python installer, it may Just Work here. 
So testing is fine, but when I distribute the program to customers where the install program has not been run it fails. The installer made registry entries, installed files, etc. And what did it do?? No one knows. And how do I install at a customer site if I don't have documentation on what the Help installer or Python installer did?? No one knows. Who fixes it if something goes wrong?? Hours on the phone to Help System customer support. Does it work on Windows 2000?? No one knows. > f:\bin. "Modern" Python versions execute some code at startup that > comes up with a suitable value for sys.path; the Windows version of > this code is in PC/getpathp.c -- I recommend that you study it. This > [ Highly useful discussion of startup...] Thank you, I will study this. > know about (e.g. PIL, Numeric, etc.). With packagized versions of > those modules, even that won't be necessary, because the packages will > be dropped in the Python install directory (typically C:\Program > Files\Python). Yes, this is essential. Packages must be easily installed. I was hoping for single file package archive files. > I believe that most of your desire to get rid of PYTHONPATH comes from > your insistence to bypass the default installer. Correct, I refuse to execute the default installer. And I am a patient person who loves Python, so I will read getpathp.c to see what is happening. But other commercial developers, students, teachers, SysAdmins etc. are not so patient. In the interest of promoting Python, there should be documentation on the official way to easily install Python programs. > There's probably a > way to install your app in such a way that the getpathp.c algorithm > actually succeeds? There's also a separate env variable, PYTHONHOME, Perhaps, and if there is it should be prominently documented in the How to Distribute Your App section of the manual. I am worried about supporting versioning, but I will think about it. > I take blame for not documenting all this well enough. 
However I wish > you stopped criticizing the design -- I think the design is quite > solid. Thank you for the explanation. I will study the design again. I always wondered what PYTHONHOME did. JimA From guido at CNRI.Reston.VA.US Thu Dec 2 17:03:09 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 02 Dec 1999 11:03:09 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: Your message of "Thu, 02 Dec 1999 10:49:10 EST." <384694F6.E5D74221@interet.com> References: <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us> <384694F6.E5D74221@interet.com> Message-ID: <199912021603.LAA14455@eric.cnri.reston.va.us> > Perhaps, and if there is it should be prominently documented in the > How to Distribute Your App section of the manual. I > am worried about supporting versioning, but I will think about it. Join the distutil-SIG, they are discussing just this. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at lemburg.com Thu Dec 2 16:48:40 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 02 Dec 1999 16:48:40 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <19991202110045.96F33370CF2@snelboot.oratrix.nl> <199912021159.GAA13732@eric.cnri.reston.va.us> <005e01bf3cc4$2e338c50$f29b12c2@secret.pythonware.com> Message-ID: <384694D8.DCA3D75E@lemburg.com> Fredrik Lundh wrote: > > Guido van Rossum wrote: > > > Actually I'm so happy with the magic cookie scheme that I've > > > advocated at various times in the past that something similar also > > > be used for determining where builtin modules and frozen modules > > > appear in sys.path... > > > > I see the magic cookies as a poor man's (but more compatible!) version > > of a chain of importers as advocated by Greg Stein and other imputil > > fans. 
I like the idea, except that I think that the chain should be > > manipulatable more easily than the current imputil implementation. > > I know this has been asked before, but cannot recall > any of the arguments against it: how about replacing > Jack's magic cookies with importer objects? > > (in other words, if a path item is a string, import as > usual. otherwise, ask the importer for a code object > or maybe better, a module object). Plus, for backward compatibility, make sure that str(importerobj) returns something which resembles a non-existing directory. Note that the builtin importer skips non-string entries in sys.path, so the above will only be needed for existing import hooks. Still, I would like to rephrase my 0.02EUR which I already posted twice... why not start to think about what these importers would do first ? If there are only a handful of wishes we could just add them to the builtin machinery and be done with it... -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 29 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido at CNRI.Reston.VA.US Thu Dec 2 17:28:28 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 02 Dec 1999 11:28:28 -0500 Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: Your message of "Fri, 19 Nov 1999 22:43:32 EST." <1269053086-27079185@hypernet.com> References: <1269053086-27079185@hypernet.com> Message-ID: <199912021628.LAA14506@eric.cnri.reston.va.us> > No success whatsoever in either direction across Samba. In > fact the mtime of my Linux home directory as seen from NT is > Jan 1, 1980. That's only the case for an NT mount point (something of the form \\host\name; I notice that os.stat() only believes it exists if you append a backslash: \\host\name\). For interior directories, at least with the Samba version that I'm using, os.stat() seems to give correct results. 
I think that this whole issue (that doing a stat on a directory to find out whether files in it were modified doesn't give usable results) is widely blown out of proportion. The only useful bit of info is that mtimes may have an up to 2 second granularity, and that anything as recent as 2 seconds should be considered as newer than the cache even if the cache is also less than 2 seconds. --Guido van Rossum (home page: http://www.python.org/~guido/) From jim at interet.com Thu Dec 2 17:28:50 1999 From: jim at interet.com (James C. Ahlstrom) Date: Thu, 02 Dec 1999 11:28:50 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <383BF9AD.E183FB98@interet.com> <199912011855.NAA10662@eric.cnri.reston.va.us> <38457B42.85552AC@interet.com> <199912012309.SAA10873@eric.cnri.reston.va.us> Message-ID: <38469E42.AF0A0D55@interet.com> Guido van Rossum wrote: > Let's assume sys.path is [A, B]. Let's assume there's a foo.py and a > foo.dll. If foo.py lives in A and foo.dll lives in B, then import foo > ... Thank you for the detailed discussion showing that sys.path is needed so a choice can be made whether to load foo.dll or foo.py. As you correctly point out, a separate search path defeats this behavior. But I don't think the usefulness of the feature compensates for its resultant complexity. Specifically, it will be hard to create this behavior in archive files. As I envision archive files (which of course is subject to change) they contain *.pyc files and not DLL's. The DLL's must be in a ./DLL directory since the OS can not load them from strings. So if every *.pyc is in an archive file, your only choice is whether to load all DLL's first or last. That is, archive.pyl is either before or after ./DLL. If a package (probably with lots of subdirectories) author depends on having a search path within a package which discriminates between pyc and DLL files with equal names, then that search path plus the existence of the DLL's must be recorded in the archive. 
This is much more complicated than just an archive with all *.pyc files entered in a dotted name space: foo foo.sub1 foo.sub2 foo.sub2.pkx I would question whether equally named foo.dll and foo.py is worth it. The alternative (which is IMHO more common) is to code the choice in Python in the module that cares about it. > > And what if something > > doesn't work? Think of Python being used as a teaching language > > for the 8th grade. Think of the 8th grade teacher trying to get > > all this right. The only thing that works is simplicity. > > We will provide an installer that Just Works [tm]. OK for this case. Not enough for Python program distribution. JimA From jim at interet.com Thu Dec 2 17:30:49 1999 From: jim at interet.com (James C. Ahlstrom) Date: Thu, 02 Dec 1999 11:30:49 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us> <384694F6.E5D74221@interet.com> <199912021603.LAA14455@eric.cnri.reston.va.us> Message-ID: <38469EB9.5EDB9617@interet.com> Guido van Rossum wrote: > > > Perhaps, and if there is it should be prominently documented in the > > How to Distribute Your App section of the manual. I > > am worried about supporting versioning, but I will think about it. > > Join the distutil-SIG, they are discussing just this. I already belong to the distutil-SIG and have seen no such discussion. Jim From guido at CNRI.Reston.VA.US Thu Dec 2 18:17:52 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 02 Dec 1999 12:17:52 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: Your message of "Thu, 02 Dec 1999 11:30:49 EST." 
<38469EB9.5EDB9617@interet.com> References: <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us> <384694F6.E5D74221@interet.com> <199912021603.LAA14455@eric.cnri.reston.va.us> <38469EB9.5EDB9617@interet.com> Message-ID: <199912021717.MAA14682@eric.cnri.reston.va.us> [Jim] > > > Perhaps, and if there is it should be prominently documented in the > > > How to Distribute Your App section of the manual. I > > > am worried about supporting versioning, but I will think about it. [me] > > Join the distutil-SIG, they are discussing just this. [Jim again] > I already belong to the distutil-SIG and have seen no such > discussion. Sorry, you're right (except for a brief exchange between you and Paul Dubois :-). But I think they should, it falls under their charter. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake at acm.org Thu Dec 2 18:30:02 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu, 2 Dec 1999 12:30:02 -0500 (EST) Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: <199912021717.MAA14682@eric.cnri.reston.va.us> References: <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us> <384694F6.E5D74221@interet.com> <199912021603.LAA14455@eric.cnri.reston.va.us> <38469EB9.5EDB9617@interet.com> <199912021717.MAA14682@eric.cnri.reston.va.us> Message-ID: <14406.44186.574647.651111@weyr.cnri.reston.va.us> Guido van Rossum writes: > Sorry, you're right (except for a brief exchange between you and Paul > Dubois :-). But I think they should, it falls under their charter. This was deliberatly postponed until after extension packages are supported and in place. I know Greg is interested in application installation as well as package installation. -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives

From gmcm at hypernet.com Thu Dec 2 18:53:03 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Thu, 2 Dec 1999 12:53:03 -0500
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912021628.LAA14506@eric.cnri.reston.va.us>
References: Your message of "Fri, 19 Nov 1999 22:43:32 EST." <1269053086-27079185@hypernet.com>
Message-ID: <1267965342-1446902@hypernet.com>

[Gordon]
> > No success whatsoever in either direction across Samba. In fact
> > the mtime of my Linux home directory as seen from NT is Jan 1,
> > 1980.

[Guido]
> That's only the case for an NT mount point (something of the form
> \\host\name; I notice that os.stat() only believes it exists if
> you append a backslash: \\host\name\). For interior directories,
> at least with the Samba version that I'm using, os.stat() seems
> to give correct results.

Correct (as I discovered not long after I posted). (I find that
from NT I have to stat some file _in_ the directory to get an
updated mtime from the stat _of_ the directory).

> I think that this whole issue (that doing a stat on a directory
> to find out whether files in it were modified doesn't give usable
> results) is widely blown out of proportion.

This has come up twice: re caching importers and dircache.py
(used only by dircmp). We've arrived at the fact that it _can_
be made to work on Windows boxes. NFS? Andrew (anyone still
use that)?

IOW, do we want to trust it? Do we want to document that it
might not be trustworthy in some situations? Make it
optional-for-wizards? Kill it? IOOW, what's the proper
proportion ;-)?

> The only useful bit of info is that mtimes may have an up to 2
> second granularity, and that anything as recent as 2 seconds
> should be considered as newer than the cache even if the cache is
> also less than 2 seconds.
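
The 2-second rule quoted at the end of the message above can be written down as a small freshness check. This is a minimal sketch under stated assumptions, not code from dircache.py or from any caching importer: the names MTIME_GRANULARITY, cache_is_usable and directory_cache_is_usable are hypothetical, and the only figure taken from the thread is the 2-second granularity.

```python
import os
import time

# Assumed worst-case mtime resolution, as quoted in the thread.
MTIME_GRANULARITY = 2.0  # seconds


def cache_is_usable(source_mtime, cache_mtime, now):
    """Return True only if the cached data can be trusted over the source.

    A source modified within the last MTIME_GRANULARITY seconds is always
    treated as newer than the cache, even if the cache looks just as recent,
    because the timestamps cannot distinguish the two.
    """
    if now - source_mtime <= MTIME_GRANULARITY:
        return False  # too recent to tell; treat the cache as stale
    return cache_mtime >= source_mtime


def directory_cache_is_usable(directory, cache_mtime):
    """Apply the same rule to a directory's mtime as reported by os.stat()."""
    return cache_is_usable(os.stat(directory).st_mtime, cache_mtime, time.time())
```

The check deliberately errs towards rescanning: a false "stale" answer costs one extra directory listing, while a false "fresh" answer would hide a newly written file.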
From guido at CNRI.Reston.VA.US Thu Dec 2 21:43:46 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 02 Dec 1999 15:43:46 -0500 Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: Your message of "Fri, 19 Nov 1999 05:29:50 PST." References: Message-ID: <199912022043.PAA15108@eric.cnri.reston.va.us> Here's the promised response to Greg's response to my wishlist. > On Thu, 18 Nov 1999, Guido van Rossum wrote: > > Gordon McMillan wrote: > >... > > > I think imputil's emulation of the builtin importer is more of a > > > demonstration than a serious implementation. As for speed, it > > > depends on the test. > > > > Agreed. I like some of imputil's features, but I think the API > > need to be redesigned. > > It what ways? It sounds like you've applied some thought. Do you have any > concrete ideas yet, or "just a feeling" :-) I'm working through some > changes from JimA right now, and would welcome other suggestions. I think > there may be some outstanding stuff from MAL, but I'm not sure (Marc?) I actually think that the way the PVM (Python VM) calls the importer ought to be changed. Assigning to __builtin__.__import__ is a crock. The API for __import__ is a crock. > >... > > So here's a challenge: redesign the import API from scratch. > > I would suggest starting with imputil and altering as necessary. I'll use > that viewpoint below. > > > Let me start with some requirements. > > > > Compatibility issues: > > --------------------- > > > > - the core API may be incompatible, as long as compatibility layers > > can be provided in pure Python > > Which APIs are you referring to? The "imp" module? The C functions? The > __import__ and reload builtins? > I'm guessing some of imp, the two builtins, and only one or two C > functions. All of those. > > - support for rexec functionality > > No problem. I can think of a number of ways to do this. Agreed, I think that imputil can do this. > > - support for freeze functionality > > No problem. 
A function in "imp" must be exposed to Python to support this > within the imputil framework. Agreed. It currently exports init_frozen() which is about the right functionality. > > - load .py/.pyc/.pyo files and shared libraries from files > > No problem. Again, a function is needed for platform-specific loading of > shared libraries. Is it useful to expose the platform differences? The current imp.load_dynamic() should suffice. > > - support for packages > > No problem. Demo's in current imputil. > > > - sys.path and sys.modules should still exist; sys.path might > > have a slightly different meaning > > I would suggest that both retain their *exact* meaning. We introduce > sys.importers -- a list of importers to check, in sequence. The first > importer on that list uses sys.path to look for and load modules. The > second importer loads builtins and frozen code (i.e. modules not on > sys.path). This is looking like the redesign I was looking for. (Note that imputil's current chaining is not good since it's impossible to remove or reorder importers, which I think is a required feature; an explicit list would solve this.) Actually, the order is the other way around, but by now you should know that. It makes sense to have separate ones for builtin and frozen modules -- these have nothing in common. There's another issue, which isn't directly addressed by imputil, although with clever use of inheritance it might be doable. I'd like more support for this however. Quite orthogonally to the issue of having separate importers, I might want to recognize new extensions. Take the example of the ILU folks. They want to be able to drop a file "foo.isl" in any directory on sys.path and have the ILU stubber automatically run if you try to import foo (the client stubs) or foo__skel (the server skeleton). This doesn't fit in the sys.importers strategy, because they want to be able to drop their .isl files in any directory along sys.path. 
(Or, more likely, they want to have control over where in sys.modules
the directory/directories with .isl files are placed.) This requires
an ugly modification to the _fs_import() function. (Which should have
been a method, by the way, to make overriding it in a subclass of
PathImporter easier!)

I've been thinking here along the lines of a strategy where the
standard importer (the one that walks sys.path) has a set of hooks
that define various things it could look for, e.g. .py files, .pyc
files, .so or .dll files. This list of hooks could be changed to
support looking for .isl files.

There's an old, subtle issue that could be solved through this as
well: whether or not a .pyc file without a .py file should be accepted
or not. Long ago (in Python 0.9.8) a .pyc file alone would never be
loaded. This was changed at the request of a small but vocal minority
of Python developers who wanted to distribute .pyc files without .py
files. It has occasionally caused frustration because sometimes
developers move .py files around but forget to remove the .pyc files,
and then the .pyc file is silently picked up if it occurs on sys.path
earlier than where the .py was moved to. Having a set of hooks for
various extensions would make it possible to have a default where lone
.pyc files are ignored, but where one can insert a .pyc importer in
the list of hooks that does the right thing here. (Of course, it may
be possible that this whole feature of lone .pyc files should be
replaced since the same need is easily taken care of by zip
importers.)

I also want to support (Jim A notwithstanding :-) a feature whereby
different things besides directories can live on sys.path, as long as
they are strings -- these could be added from the PYTHONPATH env
variable.
Every piece of code that I've ever seen that uses sys.path doesn't care if a directory named in sys.path doesn't exist -- it may try to stat various files in it, which also don't exist, and as far as it is concerned that is just an indication that the requested module doesn't live there. Again, we would have to dissect imputil to support various hooks that deal with different kind of entities in sys.path. The default hook list would consist of a single item that interprets the name as a directory name; other hooks could support zip files or URLs. Jack's "magic cookies" could also be supported nicely through such a mechanism. > Users can insert/append new importers or alter sys.path as before. > > sys.modules continues to record name:module mappings. Yes. Note that the interpretation of __file__ could be problematic. To what value do you set __file__ for a module loaded from a zip archive? > > - $PYTHONPATH and $PYTHONHOME should still be supported > > No problem. > > > (I wouldn't mind a splitting up of importdl.c into several > > platform-specific files, one of which is chosen by the configure > > script; but that's a bit of a separate issue.) > > Easy enough. The standard importer can select the appropriate > platform-specific module/function to perform the load. i.e. these can move > to Modules/ and be split into a module-per-platform. Again: what's the advantage of exposing the platform specificity? > > New features: > > ------------- > > > > - Integrated support for Greg Ward's distribution utilities (i.e. a > > module prepared by the distutil tools should install painlessly) > > I don't know the specific requirements/functionality that would be > required here (does Greg? :-), but I can't imagine any problem with this. Probably more support is required from the other end: once it's common for modules to be imported from zip files, the distutil code needs to support the creation and installation of such zip files. 
Also, there is a need for the install phase of distutil to communicate the location of the zip file to the Python installation. > > - Good support for prospective authors of "all-in-one" packaging tool > > authors like Gordon McMillan's win32 installer or /F's squish. (But > > I *don't* require backwards compatibility for existing tools.) > > Um. *No* problem. :-) :-) > > - Standard import from zip or jar files, in two ways: > > > > (1) an entry on sys.path can be a zip/jar file instead of a directory; > > its contents will be searched for modules or packages Note that this is what I mention above for distutil support. > While this could easily be done, I might argue against it. Old > apps/modules that process sys.path might get confused. Above I argued that this shouldn't be a problem. > If compatibility is not an issue, then "No problem." > > An alternative would be an Importer instance added to sys.importers that > is configured for a specific archive (in other words, don't add the zip > file to sys.path, add ZipImporter(file) to sys.importers). This would be harder for distutil: where does Python get the initial list of importers? > Another alternative is an Importer that looks at a "sys.py_archives" list. > Or an Importer that has a py_archives instance attribute. OK, but again distutil needs to be able to add to this list when it installs a package. (Note that package deinstallation should also be supported!) (Of course I don't require this to affect Python processes that are already running; but it should be possible to easily change the default search path for all newly started instances of a given Python installation.) > > (2) a file in a directory that's on sys.path can be a zip/jar file; > > its contents will be considered as a package (note that this is > > different from (1)!) > > No problem. This will slow things down, as a stat() for *.zip and/or *.jar > must be done, in addition to *.py, *.pyc, and *.pyo. 
Fine, this is where the caching comes in handy.

> > I don't particularly care about supporting all zip compression
> > schemes; if Java gets away with only supporting gzip compression
> > in jar files, so can we.
>
> I presume we would support whatever zlib gives us, and no more.

That's it. :-)

> > - Easy ways to subclass or augment the import mechanism along
> > different dimensions. For example, while none of the following
> > features should be part of the core implementation, it should be
> > easy to add any or all:
> >
> > - support for a new compression scheme to the zip importer
>
> Presuming ZipImporter is a class (derived from Importer), then this
> ability is wholly dependent upon the author of ZipImporter providing the
> hook.

Agreed. But since we're likely going to provide this as a standard
feature, we must ensure that it provides this hook.

> The Importer class is already designed for subclassing (and its interface
> is very narrow, which means delegation is also *very* easy; see
> imputil.FuncImporter).

But maybe it's *too* narrow; some of the hooks I suggest above seem to
require extra interfaces -- at least in some of the subclasses of the
Importer base class.

Note: I looked at the doc string for get_code() and I don't
understand what the difference is between the modname and fqname
arguments. If I write "import foo.bar", what are modname and
fqname? Why are both present? Also, while you claim that the API is
narrow, the multiple return values (also the different types for the
second item) make it complicated.

> > - support for a new archive format, e.g. tar
>
> A cakewalk. Gordon, JimA, and myself each have archive formats. :-)
>
> > - a hook to import from URLs or other data sources (e.g. a
> > "module server" imported in CORBA) (this needn't be supported
> > through $PYTHONPATH though)
>
> No problem at all.
>
> > - a hook that imports from compressed .py or .pyc/.pyo files
>
> No problem at all.
>
> > - a hook to auto-generate .py files from other filename
> > extensions (as currently implemented by ILU)
>
> No problem at all.

See above -- I think this should be more integrated with sys.path than
you are thinking of.

The more I think about it, the more I see that the problem is that for
you, the importer that uses sys.path is a final subclass of Importer
(i.e. it is itself not further subclassed). Several of the hooks I
want seem to require additional hooks in the PathImporter rather than
new importers.

> > - a cache for file locations in directories/archives, to improve
> > startup time
>
> No problem at all.
>
> > - a completely different source of imported modules, e.g. for an
> > embedded system or PalmOS (which has no traditional filesystem)
>
> No problem at all.
>
> In each of the above cases, the Importer.get_code() method just needs to
> grab the byte codes from the XYZ data source. That data source can be
> compressed, across a network, on-the-fly generated, or whatever. Each
> importer can certainly create a cache based on its concept of "location".
> In some cases, that would be a mapping from module name to filesystem
> path, or to a URL, or to a compiled-in, frozen module.

See above for sys.path integration remark.

> > - Note that different kinds of hooks should (ideally, and within
> > reason) properly combine, as follows: if I write a hook to recognize
> > .spam files and automatically translate them into .py files, and you
> > write a hook to support a new archive format, then if both hooks are
> > installed together, it should be possible to find a .spam file in an
> > archive and do the right thing, without any extra action. Right?
>
> Ack. Very, very difficult.

Actually, I take most of this back. Importers that deal with new
extension types often have to go through a file system to transform
their data to .py files, and this is just too complicated.
However it would be still nice if there was code sharing between the code that looks for .py and .pyc files in a zip archive and the code that does the same in a filesystem. Hm, maybe even that shouldn't be necessary, the zip file probably should contain only .pyc files... (Unrelated remark: I should really try to release the set of modules we've written here at CNRI to deal with zip files. Unfortunately zip files are hairy and so is our code.) > The imputil scheme combines the concept of locating/loading into one step. > There is only one "hook" in the imputil system. Its semantic is "map this > name to a code/module object and return it; if you don't have it, then > return None." That's fine. I actually don't recall where the find-then-load API came from, I think it may be an artefact of the original implementation strategy. It is currently used as follows: we try to see if there's a .pyc and then we try to see if there's a .py; if both exist we compare the timestamps etc. to choose which one. But that's still a red herring. > Your compositing example is based on the capabilities of the > find-then-load paradigm of the existing "ihooks.py". One module finds > something (foo.spam) and the other module loads it (by generating a .py). I still don't understand why ihooks.py had to be so complicated. I guess I just had much less of an understanding of the issues. (It was also partly a compromise with an alternative design by Ken Manheimer, who basically forced me to support packages, originally through ni.py.) > All is not lost, however. I can easily envision the get_code() hook as > allowing any kind of return type. If it isn't a code or module object, > then another hook is called to transform it. > [ actually, I'd design it similarly: a *series* of hooks would be called > until somebody transforms the foo.spam into a code/module object. ] OK. This could be a feature of a subclass of Importer. 
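
The "series of hooks" idea discussed above might look roughly like
this as an Importer subclass. This is a sketch in present-day Python,
not imputil code: TransformingImporter, DictImporter, get_raw(),
load(), the transformers list and the ("spam", source) payload format
are all made-up names for illustration, and the real get_code()
signature and return values are not reproduced here.

```python
import types


class TransformingImporter:
    """Hypothetical importer whose raw results pass through transform hooks."""

    def __init__(self):
        # Each hook takes an arbitrary object and returns a replacement,
        # or None if it does not know what to do with it.
        self.transformers = []

    def get_raw(self, fqname):
        """Return whatever the data source holds for fqname, or None."""
        raise NotImplementedError

    def load(self, fqname):
        result = self.get_raw(fqname)
        if result is None:
            return None  # "if you don't have it, then return None"
        for transform in self.transformers:
            # Stop as soon as somebody has produced a code/module object.
            if isinstance(result, (types.CodeType, types.ModuleType)):
                break
            converted = transform(result)
            if converted is not None:
                result = converted
        if not isinstance(result, (types.CodeType, types.ModuleType)):
            raise ImportError("no hook could transform %r" % (fqname,))
        return result


class DictImporter(TransformingImporter):
    """Toy data source: a dict mapping module names to raw payloads."""

    def __init__(self, table):
        super().__init__()
        self.table = table

    def get_raw(self, fqname):
        return self.table.get(fqname)


def spam_to_code(obj):
    """Pretend '.spam' payloads are ("spam", python_source) tuples."""
    if isinstance(obj, tuple) and len(obj) == 2 and obj[0] == "spam":
        return compile(obj[1], "<spam>", "exec")
    return None
```

With spam_to_code appended to an importer's transformers list,
load("foo") on a table containing {"foo": ("spam", "answer = 6 * 7")}
returns a code object that can be passed to exec(); without any hook
installed the same call raises ImportError.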
> The compositing would be limited only by the (Python-based) Importer
> classes. For example, my ZipImporter might expect to zip up .pyc files
> *only*. Obviously, you would want to alter this to support zipping any
> file, then use the suffix to determine what to do at unzip time.
>
> > - It should be possible to write hooks in C/C++ as well as Python
>
> Use FuncImporter to delegate to an extension module.

Maybe not so great, since it sounds like the C code can't benefit from
any of the infrastructure that imputil offers. I'm not sure about
this one though.

> This is one of the benefits of imputil's single/narrow interface.

Plus its vague specs? :-)

> > - Applications embedding Python may supply their own implementations,
> > default search path, etc., but don't have to if they want to piggyback
> > on an existing Python installation (even though the latter is
> > fraught with risk, it's cheaper and easier to understand).
>
> An application would have full control over the contents of sys.importers.
>
> For a restricted execution app, it might install an Importer that loads
> files from *one* directory only which is configured from a specific
> Win32 Registry entry. That importer could also refuse to load shared
> modules. The BuiltinImporter would still be present (although the app
> would certainly omit all but the necessary builtins from the build).
> Frozen modules could be excluded.

Actually there's little reason to exclude frozen modules or any
.py/.pyc modules -- by definition, bytecode can't be dangerous. It's
the builtins and extensions that need to be censored.

We currently do this by subclassing ihooks, where we mask the test for
builtins with a comparison to a predefined list of names.

> > Implementation:
> > ---------------
> >
> > - There must clearly be some code in C that can import certain
> > essential modules (to solve the chicken-or-egg problem), but I don't
> > mind if the majority of the implementation is written in Python.
> > Using Python makes it easy to subclass.
>
> I posited once before that the cost of import is mostly I/O rather than
> CPU, so using Python should not be an issue. MAL demonstrated that a good
> design for the Importer classes is also required. Based on this, I'm a
> *strong* advocate of moving as much as possible into Python (to get
> Python's ease-of-coding with little relative cost).

Agreed. However, how do you explain the slowdown (from 9 to 13
seconds I recall) though? Are you a lousy coder? :-)

> The (core) C code should be able to search a path for a module and import
> it. It does not require dynamic loading or packages. This will be used to
> import exceptions.py, then imputil.py, then site.py.

It does, however, need to import builtin modules. imputil currently
imports imp, sys, strop and __builtin__, struct and marshal; note that
struct can easily be a dynamic loadable module, and so could strop in
theory. (Note that strop will be unnecessary in 1.6 if you use string
methods.)

I don't think that this chicken-or-egg problem is particularly
problematic though.

> The platform-specific module that performs dynamic-loading must be a
> statically linked module (in Modules/ ... it doesn't have to be in the
> Python/ directory).

See earlier comments.

> site.py can complete the bootstrap by setting up sys.importers with the
> appropriate Importer instances (this is where an application can define
> its own policy). sys.path was initially set by the import.c bootstrap code
> (from the compiled-in path and environment variables).

I think that algorithm (currently in getpath.c / getpathp.c) might
also be moved to Python code -- imported frozen. Sadly, rebuilding
with a new version of a frozen module might be more complicated than
rebuilding with a new version of a C module, but writing and
maintaining this code in Python would be *sooooooo* much easier that I
think it's worth it.

> Note that imputil.py would not install any hooks when it is loaded.
That > is up to site.py. This implies the core C code will import a total of > three modules using its builtin system. After that, the imputil mechanism > would be importing everything (site.py would .install() an Importer which > then takes over the __import__ hook). (Three not counting the builtin modules.) > Further note that the "import" Python statement could be simplified to use > only the hook. However, this would require the core importer to inject > some module names into the imputil module's namespace (since it couldn't > use an import statement until a hook was installed). While this > simplification is "neat", it complicates the run-time system (the import > statement is broken until a hook is installed). Same chicken-or-egg. We can be pragmatic. For a developer, I'd like a bit of robustness (all this makes it rather hard to debug a broken imputil, and that's a fair amount of code!). > Therefore, the core C code must also support importing builtins. "sys" and > "imp" are needed by imputil to bootstrap. > > The core importer should not need to deal with dynamic-load modules. Same question. Since that all has to be coded in C anyway, why not? > To support frozen apps, the core importer would need to support loading > the three modules as frozen modules. I'd like to see a description of how someone like Jim A would build a single-file application using the new mechanism. This could completely replace freeze. (Freeze currently requires a C compiler; that's bad.) > The builtin/frozen importing would be exposed thru "imp" for use by > imputil for future imports. imputil would load and use the (builtin) > platform-specific module to do dynamic-load imports. Sure. > > - In order to support importing from zip/jar files using compression, > > we'd at least need the zlib extension module and hence libz itself, > > which may not be available everywhere. > > Yes. I don't see this as a requirement, though. We wouldn't start to use > these by default, would we? 
Or insist on zlib being present? I see this as > more along the lines of "we have provided a standardized Importer to do > this, *provided* you have zlib support." Agreed. Zlib support is easy to get, but there are probably platforms where it's not. (E.g. maybe the Mac? I suppose that on the Mac, there would be some importer classes to import from a resource fork.) > > - I suppose that the bootstrap is solved using a mechanism very > > similar to what freeze currently used (other solutions seem to be > > platform dependent). > > The bootstrap that I outlined above could be done in C code. The import > code would be stripped down dramatically because you'll drop package > support and dynamic loading. Not the dynamic loading. But yes the package support. > Alternatively, you could probably do the path-scanning in Python and > freeze that into the interpreter. Personally, I don't like this idea as it > would not buy you much at all (it would still need to return to C for > accessing a number of scanning functions and module importing funcs). > > > - I also want to still support importing *everything* from the > > filesystem, if only for development. (It's hard enough to deal with > > the fact that exceptions.py is needed during Py_Initialize(); > > I want to be able to hack on the import code written in Python > > without having to rebuild the executable all the time. > > My outline above does not freeze anything. Everything resides in the > filesystem. The C code merely needs a path-scanning loop and functions to > import .py*, builtin, and frozen types of modules. Good. Though I think there's also a need for freezing everything. And when we go the route of the zip archive, the zip archive handling code needs to be somewhere -- frozen seems to be a reasonable choice. > If somebody nukes their imputil.py or site.py, then they return to Python > 1.4 behavior where the core interpreter uses a path for importing (i.e. no > packages). 
They lose dynamically-loaded module support. But if the path guessing is also done by site.py (as I propose) the path will probably be wrong. A warning should be printed. > > Let's first complete the requirements gathering. Are these > > requirements reasonable? Will they make an implementation too > > complex? Am I missing anything? > > I'm not a fan of the compositing due to it requiring a change to semantics > that I believe are very useful and very clean. However, I outlined a > possible, clean solution to do that (a secondary set of hooks for > transforming get_code() return values). As you may see from my responses, I'm a big fan of having several different sets of hooks. I do withdraw the composition requirement though. > The requirements are otherwise reasonable to me, as I see that they can > all be readily solved (i.e. they aren't burdensome). > > While this email may be long, I do not believe the resulting system would > be complex. From the user-visible side of things, nothing would be > changed. sys.path is still present and operates as before. They *do* have > new functionality they can grow into, though (sys.importers). The > underlying C code is simplified, and the platform-specific dynamic-load > stuff can be distributed to distinct modules, as needed > (e.g. BeOS/dynloadmodule.c and PC/dynloadmodule.c). > > > Finally, to what extent does this impact the desire for dealing > > differently with the Python bytecode compiler (e.g. supporting > > optimizers written in Python)? And does it affect the desire to > > implement the read-eval-print loop (the >>> prompt) in Python? > > If the three startup files require byte-compilation, then you could have > some issues (i.e. the byte-compiler must be present). Another chicken-or-egg. No biggie. > Once you hit site.py, you have a "full" environment and can easily detect > and import a read-eval-print loop module (i.e. why return to Python? just > start things up right there). You mean "why return to C?" 
I agree. It would be cool if somehow IDLE and Pythonwin would also
be bootstrapped using the same mechanisms. (This would also solve the
question "which interactive environment am I using?" that some modules
and apps want to see answered because they need to do things
differently when run under IDLE, for example.)

> site.py can also install new optimizers as desired, a new Python-based
> parser or compiler, or whatever... If Python is built without a parser or
> compiler (I hope that's an option!), then the three startup modules would
> simply be frozen into the executable.

More power to hooks!

--Guido van Rossum (home page: http://www.python.org/~guido/)

From fdrake at acm.org Thu Dec 2 22:22:33 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 2 Dec 1999 16:22:33 -0500 (EST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912022043.PAA15108@eric.cnri.reston.va.us>
References: <199912022043.PAA15108@eric.cnri.reston.va.us>
Message-ID: <14406.58137.359127.921135@weyr.cnri.reston.va.us>

Guido van Rossum writes:
> variable. Every piece of code that I've ever seen that uses sys.path
> doesn't care if a directory named in sys.path doesn't exist -- it may
> try to stat various files in it, which also don't exist, and as far as

Not the case -- I know you've looked at some of my code in the KOE
that ensures only real directories are on the path, and each is only
there once (pathhack.py). Given that sys.path is often too long and
includes duplicate entries in a large system (often one entry with and
one without a trailing / for a given directory), it is useful to be
able to distinguish between things that should be interpretable as
paths and things that aren't.
It should not be hard to declare that "cookies" or whatever have
some special form, like "".

> (Unrelated remark: I should really try to release the set of modules
> we've written here at CNRI to deal with zip files. Unfortunately zip
> files are hairy and so is our code.)

It doesn't help that that code just plain stinks. I maintain that
no one here understands the whole of it.

-Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From jcw at equi4.com Thu Dec 2 22:41:46 1999
From: jcw at equi4.com (Jean-Claude Wippler)
Date: Thu, 02 Dec 1999 22:41:46 +0100
Subject: [Python-Dev] Import redesign [LONG]
References: <199912022043.PAA15108@eric.cnri.reston.va.us>
Message-ID: <3846E79A.446EAFD5@equi4.com>

Guido van Rossum wrote:
[...]
> Note that the interpretation of __file__ could be problematic. To
> what value do you set __file__ for a module loaded from a zip archive?

Makefiles use "archive(entry)" (this also supports nesting if needed).

[...]
> I'd like to see a description of how someone like Jim A would build a
> single-file application using the new mechanism. This could
> completely replace freeze. (Freeze currently requires a C compiler;
> that's bad.)
[...]

This may be off-topic, but has anyone considered what it would take to
load shared libs out of an archive?

One way is to extract on-the-fly to a temporary area. A refinement is
to leave extracted files there as cache, and perhaps even to extract to
a file with a name derived from its MD5 digest (this way multiple users
and even Python installations can share the cache).

Would it be useful to define a "standard" area?

-- Jean-Claude

From gmcm at hypernet.com Fri Dec 3 00:15:50 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Thu, 2 Dec 1999 18:15:50 -0500
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912022043.PAA15108@eric.cnri.reston.va.us>
References: Your message of "Fri, 19 Nov 1999 05:29:50 PST."
Message-ID: <1267945992-2611810@hypernet.com>

[Guido]
big snip

> Note that the interpretation of __file__ could be problematic.
> To what value do you set __file__ for a module loaded from a zip
> archive?

I just left it alone (i.e., as it was when I picked up the .pyc).
Turns out OK, because then when the end user files a bug report,
the developer can track it down.

> Note: I looked at the doc string for get_code() and I don't
> understand what the difference is between the modname and fqname
> arguments. If I write "import foo.bar", what are modname and
> fqname?

As I recall:
import foo.bar
 -> get_code(None, 'foo', 'foo') # returns foo
 -> get_code(, 'bar', 'foo.bar')

> Why are both present?

I think so the importer can choose between being tree structured
or flat.

> I'd like to see a description of how someone like Jim A would
> build a single-file application using the new mechanism. This
> could completely replace freeze. (Freeze currently requires a C
> compiler; that's bad.)

I have something working for Linux now. I froze exceptions.py.
I hacked getpath.c so prefix = exec_prefix = executable's
directory and the starting path is [prefix]. Although I did it
differently, you could regard imputil.py and archive.py as
frozen, too. (On Windows it's somewhat different, because the
result uses the stock python15.dll.)

This somewhat oversimplifies; and I haven't really thought out
all the ways people might try to use sym links. I'm inclined to
think the starting path should contain both the executable's
real directory and the sym link's directory.

> .... I do withdraw the composition
> requirement though.

Hooray!

- Gordon

From gstein at lyra.org Fri Dec 3 01:19:14 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 2 Dec 1999 16:19:14 -0800 (PST)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <384694D8.DCA3D75E@lemburg.com>
Message-ID:

On Thu, 2 Dec 1999, M.-A. Lemburg wrote:
>...
> Still, I would like to rephrase my 0.02EUR which I already
> posted twice... why not start to think about what these
> importers would do first ? If there are only a handful of
> wishes we could just add them to the builtin machinery and
> be done with it...
I'd rather see the builtin machinery move to Python, regardless of what system is used and/or what features are added. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein at lyra.org Fri Dec 3 04:19:40 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 2 Dec 1999 19:19:40 -0800 (PST) Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: <199912022043.PAA15108@eric.cnri.reston.va.us> Message-ID: On Thu, 2 Dec 1999, Guido van Rossum wrote: >... > Sometime, Greg Stein wrote: >... > > On Thu, 18 Nov 1999, Guido van Rossum wrote: >... > > > Agreed. I like some of imputil's features, but I think the API > > > need to be redesigned. > > > > It what ways? It sounds like you've applied some thought. Do you have any > > concrete ideas yet, or "just a feeling" :-) I'm working through some > > changes from JimA right now, and would welcome other suggestions. I think > > there may be some outstanding stuff from MAL, but I'm not sure (Marc?) > > I actually think that the way the PVM (Python VM) calls the importer > ought to be changed. Assigning to __builtin__.__import__ is a crock. > The API for __import__ is a crock. Something like sys.set_import_hook() ? The other alternative that I see would be to have the C code scan sys.importers, assuming each are callable objects, and call them with the appropriate params (e.g. module name). Of course, to move this scanning into Python would require something like sys.set_import_hook() unless Python looks for a hard-coded module and entrypoint. >... > > Which APIs are you referring to? The "imp" module? The C functions? The > > __import__ and reload builtins? > > > I'm guessing some of imp, the two builtins, and only one or two C > > functions. > > All of those. We can provide Python code to provide compatibility for "imp" and the two hooks. Nothing we can do to the C code, though. I'm not sure what the import API looks like from C, and whether they could all stay. A brief glance looks like most could stay. 
[ removing any would change Python's API version, which might be "okay" ] >... > > > - load .py/.pyc/.pyo files and shared libraries from files > > > > No problem. Again, a function is needed for platform-specific loading of > > shared libraries. > > Is it useful to expose the platform differences? The current > imp.load_dynamic() should suffice. This comes up several times throughout this message, and in some off-list mail Guido and I have exchanged. Namely, "should dynamic loading be part of the core, or performed via a module?" I would rather see it become a module, rather than inside the core (despite the fact that the module would have to be compiled into the interpreter). I believe this provides more flexibility for people looking to replace/augment/update/fix dynamic loading on various architectures. Rather than changing the core, a person can just drop in another module. The isolation between the core and modules is nicer, aesthetically, to me. The modules would also be exposing Just Another Importer Function, rather than a specialized API in the builtin imp module. Also note that it is easier to keep a module *out* of a Python-based application, than it is to yank functions out of the core of Python. Frozen apps, embedded apps, etc could easily leave out dynamic loading. Are there strict advantages? Not any that I can think of right now (beyond a bit of ease-of-use mentioned above). It just feels better to me. >... > > > - sys.path and sys.modules should still exist; sys.path might > > > have a slightly different meaning > > > > I would suggest that both retain their *exact* meaning. We introduce > > sys.importers -- a list of importers to check, in sequence. The first > > importer on that list uses sys.path to look for and load modules. The > > second importer loads builtins and frozen code (i.e. modules not on > > sys.path). > > This is looking like the redesign I was looking for. 
(Note that > imputil's current chaining is not good since it's impossible to remove > or reorder importers, which I think is a required feature; an explicit > list would solve this.) The chaining is an aspect of the current, singular import hook that Python uses. In the past, I've suggested the installation of a "manager" that maintains a list. sys.importers is similar in practice. Note that this Manager would be present with the sys.set_import_hook() scheme, while the Manager is implied if the core scans sys.importers. > Actually, the order is the other way around, but by now you should > know that. It makes sense to have separate ones for builtin and > frozen modules -- these have nothing in common. Yes, JimA pointed this out. The latest imputil has corrected this. I combined the builtin and frozen Importers because they were just so similar. I didn't want to iterate over two Importers when a single one sufficed quite well. *shrug* Could go either way, really. > There's another issue, which isn't directly addressed by imputil, > although with clever use of inheritance it might be doable. I'd like > more support for this however. Quite orthogonally to the issue of > having separate importers, I might want to recognize new extensions. Correct: while imputil doesn't address this, the standard/default Importer classes *definitely* can. >... > the directory/directories with .isl files are placed.) This requires > an ugly modification to the _fs_import() function. (Which should have > been a method, by the way, to make overriding it in a subclass of > PathImporter easier!) I yanked that code out of the DirectoryImporter so that the PathImporter could use it. I could see a reorg that creates a FileSystemImporter that defines the method, and the other two just subclass from that. > I've been thinking here along the lines of a strategy where the > standard importer (the one that walks sys.path) has a set of hooks > that define various things it could look for, e.g. 
.py files, .pyc > files, .so or .dll files. This list of hooks could be changed to > support looking for .isl files. Agreed. It should be easy to have a mapping of extension to handler. One issue: should there be an ordering to the extensions? Exercise for the reader to alter the data structures... > There's an old, subtle issue that could be solved through this as > well: whether or not a .pyc file without a .py file should be accepted > or not. Long ago (in Python 0.9.8) a .pyc file alone would never be > loaded. This was changed at the request of a small but vocal minority > of Python developers who wanted to distribute .pyc files without .py > files. It has occasionally caused frustration because sometimes > developers move .py files around but forget to remove the .pyc files, > and then the .pyc file is silently picked up if it occurs on sys.path > earlier than where the .py was moved to. I think, "too bad for them." :-) Having just a .pyc is a very nice feature. But how can you tell whether it was meant to be a plain .pyc or a mis-ordered one? To truly resolve that, you would need to scan the whole path, looking for a .py. However, maybe somebody put the .pyc there on purpose, to override the .py! --- begin slightly-off-topic --- Here is a neat little Bash script that allows you to use a .pyc as a CGI (to avoid parse overhead). Normally, you can't just drop a .pyc into the cgi-bin directory because the OS doesn't know how to execute it. Not a problem, I say... just append your .pyc to the following Bash script and execute! :-)
#!/bin/bash
exec - 3< $0 ; exec python -c 'import os,marshal ; f = os.fdopen(3, "rb") ; f.readline() ; f.readline() ; f.seek(8, 1) ; _c = marshal.load(f) ; del os, marshal, f ; exec _c' $@
(the script should be two lines; and no... you can't use readlines(2)) The above script will preserve stdin, stdout, and stderr. If the caller also uses 3< ...
well, that got overridden :-) The script doesn't work on Windows for two reasons, though: 1) Bash, 2) the "rb" mode followed by readline() Detailed info at the bottom of http://www.lyra.org/greg/python/ --- end of off-topic --- > Having a set of hooks for various extensions would make it possible to > have a default where lone .pyc files are ignored, but where one can > insert a .pyc importer in the list of hooks that does the right thing > here. (Of course, it may be possible that this whole feature of lone > .pyc files should be replaced since the same need is easily taken care > of by zip importers. Maybe. I'd still like to see plain .pyc files, but I know I can work around any change you might make here :-) (i.e. whatever you'd like to do... go for it) > I also want to support (Jim A notwithstanding :-) a feature whereby > different things besides directories can live on sys.path, as long as > they are strings -- these could be added from the PYTHONPATH env > variable. Every piece of code that I've ever seen that uses sys.path > doesn't care if a directory named in sys.path doesn't exist -- it may > try to stat various files in it, which also don't exist, and as far as > it is concerned that is just an indication that the requested module > doesn't live there. I'm not in favor of this, but it is more-than-doable. Again: your discretion... > Again, we would have to dissect imputil to support various hooks that > deal with different kind of entities in sys.path. The default hook > list would consist of a single item that interprets the name as a > directory name; other hooks could support zip files or URLs. Jack's > "magic cookies" could also be supported nicely through such a > mechanism. Specifically, the PathImporter would get "dissected" :-). No problem. > > Users can insert/append new importers or alter sys.path as before. > > > > sys.modules continues to record name:module mappings. > > Yes. > > Note that the interpretation of __file__ could be problematic. 
To > what value do you set __file__ for a module loaded from a zip archive? You don't (certainly in a way that is nice/compatible for modules that refer to it). This is why I don't like __file__ and __path__. They just don't make sense in archives or frozen code. Python code that relies on them will create problems when that code is placed into different packaging mechanisms. >... > > > (I wouldn't mind a splitting up of importdl.c into several > > > platform-specific files, one of which is chosen by the configure > > > script; but that's a bit of a separate issue.) > > > > Easy enough. The standard importer can select the appropriate > > platform-specific module/function to perform the load. i.e. these can move > > to Modules/ and be split into a module-per-platform. > > Again: what's the advantage of exposing the platform specificity? See above. >... > Probably more support is required from the other end: once it's common > for modules to be imported from zip files, the distutil code needs to > support the creation and installation of such zip files. Also, there > is a need for the install phase of distutil to communicate the > location of the zip file to the Python installation. I'm quite confident that something can be designed that would satisfy the needs here. Something akin to .pth files that a zip importer could read. >... > > > - Standard import from zip or jar files, in two ways: > > > > > > (1) an entry on sys.path can be a zip/jar file instead of a directory; > > > its contents will be searched for modules or packages > > Note that this is what I mention above for distutil support. > > > While this could easily be done, I might argue against it. Old > > apps/modules that process sys.path might get confused. > > Above I argued that this shouldn't be a problem. For most code, no, but as Fred mentioned (and I surmise), there are things out there assuming that sys.path contains strings which specify directories. 
Sure, we can do this (your discretion), but my feeling is to avoid it. > > If compatibility is not an issue, then "No problem." > > > > An alternative would be an Importer instance added to sys.importers that > > is configured for a specific archive (in other words, don't add the zip > > file to sys.path, add ZipImporter(file) to sys.importers). > > This would be harder for distutil: where does Python get the initial > list of importers? Default is just the two: BuiltinImporter and PathImporter. Adding ZipImporters (or anything else) at startup is TBD, but shouldn't pose a problem. >... > > > (2) a file in a directory that's on sys.path can be a zip/jar file; > > > its contents will be considered as a package (note that this is > > > different from (1)!) > > > > No problem. This will slow things down, as a stat() for *.zip and/or *.jar > > must be done, in addition to *.py, *.pyc, and *.pyo. > > Fine, this is where the caching comes in handy. IFF caching is enabled for the particular platform and installation. >... > > The Importer class is already designed for subclassing (and its interface > > is very narrow, which means delegation is also *very* easy; see > > imputil.FuncImporter). > > But maybe it's *too* narrow; some of the hooks I suggest above seem to > require extra interfaces -- at least in some of the subclasses of the > Importer base class. Correct -- the *subclasses*. I still maintain the imputil design of a single hook (get_code) is Right. I'll make a swipe at PathImporter in the next few weeks to add the capability for new extensions. > Note: I looked at the doc string for get_code() and I don't understand > what the difference is between the modname and fqname arguments. If I > write "import foo.bar", what are modname and fqname? Why are both > present? Also, while you claim that the API is narrow, the multiple > return values (also the different types for the second item) make it > complicated. Gordon detailed this in another note... 
Yes, the multiple return values make it a bit more complicated, but I can't think of any reasonable alternatives. A bit more doc should do the trick, I'd guess. >... > > > - a hook to auto-generate .py files from other filename > > > extensions (as currently implemented by ILU) > > > > No problem at all. > > See above -- I think this should be more integrated with sys.path than > you are thinking of. The more I think about it, the more I see that > the problem is that for you, the importer that uses sys.path is a > final subclass of Importer (i.e. it is itself not further subclassed). > Several of the hooks I want seem to require additional hooks in the > PathImporter rather than new importers. Correct -- I've currently designed/implemented PathImporter as "final". I don't foresee a problem turning it into something that can be hooked at run-time, or subclassed at code-time. A detailing of the features needed would be handy: * allow alternative file suffixes, with functions or subclasses to map the file into a code/module object. >... > > > - Note that different kinds of hooks should (ideally, and within > > > reason) properly combine, as follows: if I write a hook to recognize > > > .spam files and automatically translate them into .py files, and you > > > write a hook to support a new archive format, then if both hooks are > > > installed together, it should be possible to find a .spam file in an > > > archive and do the right thing, without any extra action. Right? > > > > Ack. Very, very difficult. > > Actually, I take most of this back. Importers that deal with new > extension types often have to go through a file system to transform > their data to .py files, and this is just too complicated. However it > would be still nice if there was code sharing between the code that > looks for .py and .pyc files in a zip archive and the code that does > the same in a filesystem.
Hm, maybe even that shouldn't be necessary, > the zip file probably should contain only .pyc files... Gordon replies to this... All of the archives that myself, Gordon, and JimA have been using only store .pyc files. I don't see much code sharing between the filesystem and archive import code. >... > > All is not lost, however. I can easily envision the get_code() hook as > > allowing any kind of return type. If it isn't a code or module object, > > then another hook is called to transform it. > > [ actually, I'd design it similarly: a *series* of hooks would be called > > until somebody transforms the foo.spam into a code/module object. ] > > OK. This could be a feature of a subclass of Importer. That would be my preference, rather than loading more into the Importer base class itself. >... > > > - It should be possible to write hooks in C/C++ as well as Python > > > > Use FuncImporter to delegate to an extension module. > > Maybe not so great, since it sounds like the C code can't benefit from > any of the infrastructure that imputil offers. I'm not sure about > this one though. There isn't any infrastructure that needs to be accessed. get_code() is the call-point, and there is no mechanism provided to the callee to call back into the imputil system. > > This is one of the benefits of imputil's single/narrow interface. > > Plus its vague specs? :-) Ouch. I thought I was actually doing quite a bit better than normal with that long doc-string on get_code :-( >... > > For a restricted execution app, it might install an Importer that loads > > files from *one* directory only which is configured from a specific > > Win32 Registry entry. That importer could also refuse to load shared > > modules. The BuiltinImporter would still be present (although the app > > would certainly omit all but the necessary builtins from the build). > > Frozen modules could be excluded. 
> > Actually there's little reason to exclude frozen modules or any > .py/.pyc modules -- by definition, bytecode can't be dangerous. It's > the builtins and extensions that need to be censored. > > We currently do this by subclassing ihooks, where we mask the test for > builtins with a comparison to a predefined list of names. True. My concern is an invader misusing one "type" of module for another. For example, let's say you've provided a selection of modules each exporting function FOO, and the user can configure which module to use. Can they do damage if some unrelated, frozen module also exports FOO? Minor issue, anyhow. All the functionality is there. >... > > I posited once before that the cost of import is mostly I/O rather than > > CPU, so using Python should not be an issue. MAL demonstrated that a good > > design for the Importer classes is also required. Based on this, I'm a > > *strong* advocate of moving as much as possible into Python (to get > > Python's ease-of-coding with little relative cost). > > Agreed. However, how do you explain the slowdown (from 9 to 13 > seconds I recall) though? Are you a lousy coder? :-) Heh :-) I have not spent *any* time working on optimization. Currently, each Importer in the chain redoes some work of the prior Importer. A bit of restructuring would split the common work out to a Manager, which then calls a method in the Importer (and passes all the computed work). Of course, a bit of profiling wouldn't hurt either. Some of the "imp" interfaces could possibly be refined to better support the BuiltinImporter or the dynamic load features. The question is still valid, though -- at the moment, I can't explain it because I haven't looked into it. > > The (core) C code should be able to search a path for a module and import > > it. It does not require dynamic loading or packages. This will be used to > > import exceptions.py, then imputil.py, then site.py. 
Note: after writing this, I realized there is really no need for the core to do the imputil import. site.py can easily do that. > It does, however, need to import builtin modules. imputil currently Correct. > imports imp, sys, strop and __builtin__, struct and marshal; note that > struct can easily be a dynamic loadable module, and so could strop in > theory. (Note that strop will be unnecessary in 1.6 if you use string > methods.) I knew about strop, but imputil would be harder to use today if it relied on the string methods. So... I've delayed that change. The struct module is used in a couple teeny cases, dealing with constructing a network-order, 4-byte, binary integer value. It would be easy enough to just do that with a bit of Python code instead. > I don't think that this chicken-or-egg problem is particularly > problematic though. Right. In my ideal world, the core couldn't do a dynamic load, so that would need to be considered within the bootstrap process. >... > > site.py can complete the bootstrap by setting up sys.importers with the > > appropriate Importer instances (this is where an application can define > > its own policy). sys.path was initially set by the import.c bootstrap code > > (from the compiled-in path and environment variables). > > I think that algorithm (currently in getpath.c / getpathp.c) might > also be moved to Python code -- imported frozen. Sadly, rebuilding > with a new version of a frozen module might be more complicated than > rebuilding with a new version of a C module, but writing and > maintaining this code in Python would be *sooooooo* much easier that I > think it's worth it. I think we can find a better way to freeze modules and to use them. Especially for the cases where we have specific "core" functions implemented in Python. (e.g. freezing parsers, compilers, and/or the read-eval loop) I don't foresee an issue that the build process becomes more complicated.
If we nuke "makesetup" in favor of a Python script, then we could create a stub Python executable which runs the build script which writes the Setup file and the getpath*.c file(s). > > Note that imputil.py would not install any hooks when it is loaded. That > > is up to site.py. This implies the core C code will import a total of > > three modules using its builtin system. After that, the imputil mechanism > > would be importing everything (site.py would .install() an Importer which > > then takes over the __import__ hook). > > (Three not counting the builtin modules.) Correct, although I'll modify my statement to "two plus the builtins". > > Further note that the "import" Python statement could be simplified to use > > only the hook. However, this would require the core importer to inject > > some module names into the imputil module's namespace (since it couldn't > > use an import statement until a hook was installed). While this > > simplification is "neat", it complicates the run-time system (the import > > statement is broken until a hook is installed). > > Same chicken-or-egg. We can be pragmatic. > > For a developer, I'd like a bit of robustness (all this makes it > rather hard to debug a broken imputil, and that's a fair amount of > code!). True. I threw that out as an alternative, and then presented the counter argument :-) >... > > Therefore, the core C code must also support importing builtins. "sys" and > > "imp" are needed by imputil to bootstrap. > > > > The core importer should not need to deal with dynamic-load modules. > > Same question. Since that all has to be coded in C anyway, why not? It simplifies the core's import code to not deal with that stuff at all. > > To support frozen apps, the core importer would need to support loading > > the three modules as frozen modules. > > I'd like to see a description of how someone like Jim A would build a > single-file application using the new mechanism. This could > completely replace freeze. 
(Freeze currently requires a C compiler; > that's bad.) The portable mechanism for freezing will always need a compiler. Platform specific mechanisms (e.g. append to the .EXE, or use the linker to create a new ELF section) can optimize the freeze process in different ways. I don't have a design in my head for the freeze issues -- I've been considering that the mechanism would remain about the same. However, I can easily see that different platforms may want to use different freeze processes... hmm... >... > > Yes. I don't see this as a requirement, though. We wouldn't start to use > > these by default, would we? Or insist on zlib being present? I see this as > > more along the lines of "we have provided a standardized Importer to do > > this, *provided* you have zlib support." > > Agreed. Zlib support is easy to get, but there are probably platforms > where it's not. (E.g. maybe the Mac? I suppose that on the Mac, > there would be some importer classes to import from a resource fork.) Exactly. And importer classes to load from a Win32 resources (modifying a .EXE's resources post-link is cleaner than the append solution) >... > > My outline above does not freeze anything. Everything resides in the > > filesystem. The C code merely needs a path-scanning loop and functions to > > import .py*, builtin, and frozen types of modules. > > Good. Though I think there's also a need for freezing everything. > And when we go the route of the zip archive, the zip archive handling > code needs to be somewhere -- frozen seems to be a reasonable choice. Sure. > > If somebody nukes their imputil.py or site.py, then they return to Python > > 1.4 behavior where the core interpreter uses a path for importing (i.e. no > > packages). They lose dynamically-loaded module support. > > But if the path guessing is also done by site.py (as I propose) the > path will probably be wrong. A warning should be printed. All right. Doesn't Python already print a warning if it can't find site.py? 
> > > Let's first complete the requirements gathering. Are these > > > requirements reasonable? Will they make an implementation too > > > complex? Am I missing anything? > > > > I'm not a fan of the compositing due to it requiring a change to semantics > > that I believe are very useful and very clean. However, I outlined a > > possible, clean solution to do that (a secondary set of hooks for > > transforming get_code() return values). > > As you may see from my responses, I'm a big fan of having several > different sets of hooks. Yes. However, I've only recognized one so far. Propose more... I'm confident we can update the PathImporter design to accommodate (and retain the underlying imputil paradigm). > I do withdraw the composition requirement > though. :-) >... > > Once you hit site.py, you have a "full" environment and can easily detect > > and import a read-eval-print loop module (i.e. why return to Python? just > > start things up right there). > > You mean "why return to C?" I agree. It would be cool if somehow Heh. Yah, that's what I meant :-) > IDLE and Pythonwin would also be bootstrapped using the same > mechanisms. (This would also solve the question "which interactive > environment am I using?" that some modules and apps want to see > answered because they need to do things differently when run under > IDLE, for example.) Haven't thought on this. Should be doable, I'd think. > > site.py can also install new optimizers as desired, a new Python-based > > parser or compiler, or whatever... If Python is built without a parser or > > compiler (I hope that's an option!), then the three startup modules would > > simply be frozen into the executable. > > More power to hooks! :-) You betcha! I believe my next order of business: * update PathImporter with the file-extension hook * dynload C code reorg, per the other email * create new-model site.py and trash import.c * review freeze mechanisms and process * design mechanism for frozen core functionality (eg.
getpath*.c) (coding and building design) * shift core functions to Python, using above design I'll just plow ahead, but also recognize that any/all may change. ie. I'll build examples/finals/prototypes and Guido can pick/choose/reimplement/etc as needed. I'm out next week, but should start on the above items by the end of the month (will probably do another mod_dav release in there somewhere). Cheers, -g -- Greg Stein, http://www.lyra.org/ From fredrik at pythonware.com Fri Dec 3 11:10:10 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 3 Dec 1999 11:10:10 +0100 Subject: [Python-Dev] Import redesign [LONG] References: <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com> Message-ID: <023601bf3d78$0ec3dc30$f29b12c2@secret.pythonware.com> Jean-Claude Wippler wrote: > This may be off-topic, but has anyone considered what it would take to > load shared libs out of an archive? well, we do that in a number of applications. (lazy installers are really cool... if you've installed works, you've seen some weird stuff -- for example, when the application starts the first time, it's loading everything from inside the installer. the rest of the installation is done from within the application itself, using archives in the installation executable) I think things like this are better left for the application designers, though... From mal at lemburg.com Fri Dec 3 11:03:31 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 03 Dec 1999 11:03:31 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: Message-ID: <38479573.B2CFDD2B@lemburg.com> Greg Stein wrote: > > On Thu, 2 Dec 1999, M.-A. Lemburg wrote: > >... > > Still, I would like to rephrase my 0.02EUR which I already > > posted twice... why not start to think about what these > > importers would do first ? If there are only a handful of > > wishes we could just add them to the builtin machinery and > > be done with it... 
> > I'd rather see the builtin machinery move to Python, regardless of what > system is used and/or what features are added. In the long run that's probably the right direction, but right now we are only talking a very small set of additional features, which can easily be added to the existing code without too much fuzz. Plus it won't slow things down, which is important since Python startup time is already an issue all by itself. The imputil.py approach of doing (a whole bunch of) recursive Python function calls to all kinds of importers will not speed this up, I'm afraid. An on-disk lookup table would speed this up, but it would also break the current logic in imputil.py, which puts importer independence above all. -- IMHO, we should retreat to a more centralized interface, one which more resembles a manager rather than the agent interface implemented in imputil.py. Add-ons can then register themselves to say "hey, I can handle pyz-archives" or "I know how to import .so modules" or "I provide a search function which you can call to have me scan my module container (directory, web-site, archive)". The manager would take care of what to call and in which order, plus delegate requests to add-ons which implement the needed logic, e.g. add-ons for signature checking, unzipping archives, file system lookup tables, etc. It could also trace its actions and then keep an on-disk knowledge base for what it did in the past to find certain modules under certain conditions. Anyway, all this is extra magic for some future version of Python.
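A minimal sketch of the manager idea described above, written in present-day Python and under stated assumptions: the names ImportManager, register, import_module and make_dict_finder are hypothetical and appear nowhere in imputil or in this thread, and the "module containers" are plain in-memory dicts rather than directories or archives. Add-ons register a finder, the manager decides the order, asks each finder in turn, and keeps a trace of which one answered.

```python
import types


class ImportManager:
    """Hypothetical central manager; add-ons register, the manager picks the order."""

    def __init__(self):
        self._handlers = []  # list of (priority, label, finder)
        self.trace = []      # (module name, label of the handler that found it)

    def register(self, label, finder, priority=0):
        # Lower priority numbers are consulted first; the sort is stable.
        self._handlers.append((priority, label, finder))
        self._handlers.sort(key=lambda entry: entry[0])

    def import_module(self, fqname):
        # Ask each registered add-on in turn; the first non-None answer wins.
        for _priority, label, finder in self._handlers:
            module = finder(fqname)
            if module is not None:
                self.trace.append((fqname, label))
                return module
        raise ImportError("no registered handler could import %r" % fqname)


def make_dict_finder(sources):
    """Return a finder that builds modules from a dict of name -> source text."""
    def finder(fqname):
        source = sources.get(fqname)
        if source is None:
            return None  # "not mine", so the manager moves on to the next add-on
        module = types.ModuleType(fqname)
        exec(compile(source, "<%s>" % fqname, "exec"), module.__dict__)
        return module
    return finder


manager = ImportManager()
manager.register("archive", make_dict_finder({"spam": "VALUE = 1", "eggs": "VALUE = 3"}), priority=10)
manager.register("override", make_dict_finder({"spam": "VALUE = 2"}), priority=0)

spam = manager.import_module("spam")  # answered by "override", consulted first
eggs = manager.import_module("eggs")  # "override" declines, "archive" answers
print(spam.VALUE, eggs.VALUE, manager.trace)
```

The sketch deliberately stays outside the real import machinery: it only illustrates registration, ordering and tracing, not signature checking or an on-disk lookup table.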
-- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 28 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido at CNRI.Reston.VA.US Fri Dec 3 14:45:07 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 03 Dec 1999 08:45:07 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: Your message of "Fri, 03 Dec 1999 11:03:31 +0100." <38479573.B2CFDD2B@lemburg.com> References: <38479573.B2CFDD2B@lemburg.com> Message-ID: <199912031345.IAA16376@eric.cnri.reston.va.us> [Greg] > > I'd rather see the builtin machinery move to Python, regardless of what > > system is used and/or what features are added. [Marc] > In the long run that's probably the right direction, but right now > we are only talking a very small set of additional features, > which can easily be added to the existing code without too much > fuzz. I disagree. We should do the redesign right rather than tweaking the existing code. > Plus it won't slow things down, which is important since > Python startup time is already an issue all by itself. The > imputil.py approach of doing (a whole bunch of) recursive Python > function calls to all kinds of importers will not speed this up, > I'm afraid. An on-disk lookup table would speed this up, but > it would also break the current logic in imputil.py, which > puts importer independence above all. I don't care about the current logic in imputil. It's only a prototype! > IMHO, we should retreat to a more centralized interface, > one which more resembles a manager rather than the agent > interface implemented in imputil.py. Add-ons can then > register themselves to say "hey, I can handle pyz-archives" > or "I know how to import .so modules" or "I provide a > search function which you can call to have me scan > my module container (directory, web-site, archive)". This makes sense.
> The manager would take care of what to call and in which > order, plus delegate requests to add-ons which implement > the needed logic, e.g. add-ons for signature checking, unzipping > archives, file system lookup tables, etc. > > It could also trace its actions and then keep an on-disk > knowledge base for what it did in the past to find certain > modules under certain conditions. > > Anyway, all this is extra magic for some future version of > Python. I would say the manager API design and a basic set of specific handlers should go into 1.6. --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Fri Dec 3 15:14:00 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 3 Dec 1999 15:14:00 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us> Message-ID: <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> MAL wrote: > > IMHO, we should retreat to a more centralized interface, > > one which more resembles a manager rather than the agent > > interface implemented in imputil.py. Add-ons can then > > register themselves to say "hey, I can handle pyz-archives" > > or "I know how to import .so modules" or "I provide a > > search function which you can call to have me scan > > my module container (directory, web-site, archive)". but why? in my small-minded view of how python works, an importer carries out a very simple task: given a name, check if you have a module with that name, and install it. if you cannot, fail (in which case python asks the next importer along the path). why do you have to complicate things beyond that? why not just let Python provide a few base classes and mixins for people who want to create custom importers, and be done with it? rationale, please. From jim at interet.com Fri Dec 3 15:34:40 1999 From: jim at interet.com (James C. 
Ahlstrom) Date: Fri, 03 Dec 1999 09:34:40 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> Message-ID: <3847D500.53833D06@interet.com> "M.-A. Lemburg" wrote: > > Greg Stein wrote: > > I'd rather see the builtin machinery move to Python, regardless of what > > system is used and/or what features are added. > > In the long run that's probably the right direction, but right now > we are only talking a very small set of additional features, > which can easily be added to the existing code without too much > fuzz. I volunteer to write a Python archive in either Python or C. In fact I currently have prototypes for both. But I have to agree with Greg here. I think a Python importer is the way to go. The C code is 300 lines mostly in import.c and parallel to existing code. The Python archive is about 100 lines and is prettier, easy to read, alter and re-use (obviously). > Plus it won't slow things down, which is important since > Python startup time is already an issue all by itself. The I think archive files should be able to be fast, and should help, not hurt, startup time. Provided that the use of sys.path is curtailed, os.readdir() is not needed, and the specifications are not complicated. Although archive files are my special concern, I realize that imputil is not just about archives. JimA From guido at CNRI.Reston.VA.US Fri Dec 3 15:39:25 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 03 Dec 1999 09:39:25 -0500 Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: Your message of "Thu, 02 Dec 1999 19:19:40 PST." References: Message-ID: <199912031439.JAA16524@eric.cnri.reston.va.us> Greg, Great response. I think we know where we each stand. Please go ahead with a new design. (That's trust, not carte blanche.) 
Just one thought: the more I think about it, the less I like sys.importers: functionality which is implemented through sys.importers must necessarily be placed either in front of all of sys.path or after it. While this is helpful for "canned" apps that want *everything* to be imported from a fixed archive, I think that for regular Python installations sys.path should remain the point of attack. In particular, installing a new package (e.g. PIL) should affect sys.path, regardless of the way of delivery of the modules (shared libs, .py files, .pyc files, or a zip archive). I'm not too worried about code that inspects sys.path and expects certain invariants; that code is most likely interfering with the import mechanism so should be revisited anyway. On the lone .pyc issue: I'd like to see this disappear when using the filesystem, I see no use for it there if we support .pyc files in zip archives. --Guido van Rossum (home page: http://www.python.org/~guido/) From jim at interet.com Fri Dec 3 15:44:54 1999 From: jim at interet.com (James C. Ahlstrom) Date: Fri, 03 Dec 1999 09:44:54 -0500 Subject: [Python-Dev] Import redesign [LONG] References: <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com> Message-ID: <3847D766.1E5FFAF3@interet.com> Jean-Claude Wippler wrote: > > Guido van Rossum wrote: > > [...] > > Note that the interpretation of __file__ could be problematic. To > > what value do you set __file__ for a module loaded from a zip archive? > > Makefiles use "archive(entry)" (this also supports nesting if needed). I discovered the hard way this entry is not optional. I just used the archive file name for __file__. > This may be off-topic, but has anyone considered what it would take to > load shared libs out of an archive? One way is to extract on-the-fly to > a temporary area. 
A refinement is to leave extracted files there as > cache, and perhaps even to extract to a file with a name derived from > its MD5 digest (this way multiple users and even Python installations > can share the cache). Would it be useful to define a "standard" area? IMHO putting shared libs in an archive is a bad idea because the OS can not use them there. They must be extracted as you say. But then storage is wasted by using space in the archive and the external file. Deleting them after use wastes time. Better to leave them out of the archive and provide for them in the installer. IMHO the archive is a basic simple feature, and people make installers on top of that. Archives shouldn't try to do it all. JimA From mal at lemburg.com Fri Dec 3 15:14:09 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 03 Dec 1999 15:14:09 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us> Message-ID: <3847D030.2C936E24@lemburg.com> Guido van Rossum wrote: > > [Greg] > > > I'd rather see the builtin machinery move to Python, regardless of what > > > system is used and/or what features are added. > > [Marc] > > In the long run that's probably the right direction, but right now > > we are only talking a very small set of additional features, > > which can easily be added to the existing code without too much > > fuzz. > > I disagree. We should do the redesign right rather than tweaking the > existing code. Ok, then... > > IMHO, we should retreat to a more centralized interface, > > one which more resembles a manager rather than the agent > > interface implemented in imputil.py. Add-ons can then > > register themselves to say "hey, I can handle pyz-archives" > > or "I know how to import .so modules" or "I provide a > > search function which you can call to have me scan > > my module container (directory, web-site, archive)". > > This makes sense.
> > > The manager would take care of what to call and in which > > order, plus delegate requests to add-ons which implement > > the needed logic, e.g. add-ons for signature checking, unzipping > > archives, file system lookup tables, etc. > > > > It could also trace its actions and then keep an on-disk > > knowledge base for what it did in the past to find certain > > modules under certain conditions. > > > > Anyway, all this is extra magic for some future version of > > Python. > > I would say the manager API design and a basic set of specific > handlers should go into 1.6. BTW, is there a timeline for the 1.6 release ? I mean which things will have to be in 1.6 ? Some recent topics as hints: 1. Unicode 2. Import Manager API + default handlers 3. Python style coercion at C type level 4. Rich comparisons 5. __doc__ string extraction tool -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 28 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal at lemburg.com Fri Dec 3 15:24:04 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 03 Dec 1999 15:24:04 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> Message-ID: <3847D284.8CBF2A9C@lemburg.com> Fredrik Lundh wrote: > > MAL wrote: > > > IMHO, we should retreat to a more centralized interface, > > > one which more resembles a manager rather than the agent > > > interface implemented in imputil.py. Add-ons can then > > > register themselves to say "hey, I can handle pyz-archives" > > > or "I know how to import .so modules" or "I provide a > > > search function which you can call to have me scan > > > my module container (directory, web-site, archive)". > > but why? 
in my small-minded view of how python > works, an importer carries out a very simple task: > > given a name, check if you have a > module with that name, and install > it. if you cannot, fail (in which case > python asks the next importer along > the path). > > why do you have to complicate things beyond that? > why not just let Python provide a few base classes > and mixins for people who want to create custom > importers, and be done with it? Because importing in Python has become *much* more complicated over time. There are requests for new features which touch subjects such as storage mechanisms, lookups, signatures (for trusted code), lazy imports, etc. A chain of simple minded importers won't work together too well, duplicate work and downgrade performance considerably due to the many recursive function calls. Also, centralized caching strategies are hard to implement across import handlers. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 28 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From jeremy at cnri.reston.va.us Fri Dec 3 17:47:54 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Fri, 3 Dec 1999 11:47:54 -0500 (EST) Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: <14406.58137.359127.921135@weyr.cnri.reston.va.us> References: <199912022043.PAA15108@eric.cnri.reston.va.us> <14406.58137.359127.921135@weyr.cnri.reston.va.us> Message-ID: <14407.62522.360386.757519@goon.cnri.reston.va.us> >>>>> "FLD" == Fred L Drake, writes: >> (Unrelated remark: I should really try to release the set of >> modules we've written here at CNRI to deal with zip files. >> Unfortunately zip files are hairy and so is our code.) FLD> It doesn't help that that code just plain stinks. I maintain FLD> that no one here understands the whole of it. I'm all for improving the code and getting it out. 
The real problem is that interfaces have been glommed on for every new use of a Zip file. (You want to read one off a socket and extract files before you've got the whole thing? No problem! Add a new class.) We need to figure out the common patterns for using the archives and write a new set of interfaces to support that. Jeremy From guido at CNRI.Reston.VA.US Fri Dec 3 18:12:07 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 03 Dec 1999 12:12:07 -0500 Subject: [Python-Dev] What to do with our Zip code? In-Reply-To: Your message of "Fri, 03 Dec 1999 11:47:54 EST." <14407.62522.360386.757519@goon.cnri.reston.va.us> References: <199912022043.PAA15108@eric.cnri.reston.va.us> <14406.58137.359127.921135@weyr.cnri.reston.va.us> <14407.62522.360386.757519@goon.cnri.reston.va.us> Message-ID: <199912031712.MAA17061@eric.cnri.reston.va.us> [Jeremy, on our Zip code] > I'm all for improving the code and getting it out. The real problem > is that interfaces have been glommed on for every new use of a Zip > file. (You want to read one off a socket and extract files before > you've got the whole thing? No problem! Add a new class.) We need to > figure out the common patterns for using the archives and write a new > set of interfaces to support that. If we gave you the code we currently have, would someone else in this forum be willing to redesign it? Eventually it would become part of the Python distribution. --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Sat Dec 4 10:54:30 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat, 4 Dec 1999 10:54:30 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> Message-ID: <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> M.-A. 
Lemburg wrote: > > given a name, check if you have a > > module with that name, and install > > it. if you cannot, fail (in which case > > python asks the next importer along > > the path). > > > > why do you have to complicate things beyond that? > > why not just let Python provide a few base classes > > and mixins for people who want to create custom > > importers, and be done with it? > > Because importing in Python has become *much* more > complicated over time. There are requests for new > features which touch subjects such as storage mechanisms, > lookups, signatures (for trusted code), lazy imports, etc. sorry, I still don't understand it. our applications already use different storage mechanisms, databases, signatures, lazy importing, version handling, etc, etc. now, if *we* have managed to build all that on top of an old version of imputil.py, how come it's not sufficient for the rest of you? > A chain of simple minded importers won't work together > too well why? it sure works for us... > duplicate work avoiding duplicate work is what object oriented design is all about. and last time I checked, Python had excellent support for that. > and downgrade performance considerably due to the > many recursive function calls now that's what I call premature optimization. and this scares the hell out of me: if the rest of the python-dev crowd don't seriously believe that Python is (or can be made) fast enough to implement things like this, why the heck are you using Python at all? am I the only one here who doesn't believe in osterhout's talk about "the great system vs. scripting language divide"? 
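The importer contract described in the message above (given a name, return a module if you have one; otherwise fail and let the next importer along the path try) can be sketched roughly as follows. This is only an illustration of that contract, not imputil's actual API: the names DictImporter, find_module and import_from_chain are hypothetical, and the sketch is written in present-day Python rather than the 1.5.2 of the thread.

```python
import types


class DictImporter:
    """Hypothetical importer serving modules from a dict of source strings."""

    def __init__(self, sources):
        self.sources = sources  # maps module name -> source text

    def find_module(self, name):
        source = self.sources.get(name)
        if source is None:
            return None  # "if you cannot, fail": the next importer gets a turn
        module = types.ModuleType(name)
        exec(source, module.__dict__)  # run the source in the new module's namespace
        return module


def import_from_chain(name, importers):
    """Ask each importer in order; the first one that knows the name wins."""
    for importer in importers:
        module = importer.find_module(name)
        if module is not None:
            return module
    raise ImportError("no importer could find %r" % name)


# Two importers in a chain; both know "spam", only the second knows "eggs".
chain = [
    DictImporter({"spam": "VALUE = 1"}),
    DictImporter({"spam": "VALUE = 2", "eggs": "VALUE = 3"}),
]
spam = import_from_chain("spam", chain)  # served by the first importer
eggs = import_from_chain("eggs", chain)  # falls through to the second
```

Each importer here is fully independent of the others, which is the property being argued about in the thread: it keeps the contract small, but nothing in the chain shares work (directory scans, caches) between importers.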
From fredrik at pythonware.com Sat Dec 4 10:54:42 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat, 4 Dec 1999 10:54:42 +0100 Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT] References: <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com> <3847D766.1E5FFAF3@interet.com> Message-ID: <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com> James C. Ahlstrom wrote: > IMHO putting shared libs in an archive is a bad idea because the OS > can not use them there. They must be extracted as you say. But then > storage is wasted by using space in the archive and the external file. > Deleting them after use wastes time. Better to leave them out of the > archive and provide for them in the installer. IMHO the > archive is a basic simple feature, and people make installers on top > of that. Archives shouldn't try to do it all. have you tried it? if not, why do you think you should be allowed to forbid others from doing it? in "the inmates are running the asylum", alan cooper points out that the *major* reason people all over the world love web applications are that there are no bloody installers. and here you are advocating that we all should be forced to use installers, when python makes it trivial to write self-installing apps. double-argh! (on the other hand, why do I complain? all pythonworks customers is going to be able to do all this anyway...). frankly, this "design by committee" (or is it "design by people who've never even been close to implementing something because they thought it was too hard, and thus think they're qualified to argue against those of us who didn't even realize that it was a hard problem"?) trend I've been seeing in all kinds of python forums makes me sooooo sad. the more of this I see (dist- utils-sig, doc-sig, here, c.l.python), the sadder I get, and the more I sympathise with John Skaller who's defining his own python-like universe... 
if someone needs me, I'll be down in the pub having a beer with the mad scientist, the shiny eff-bot, and mr. nitpicker. if we're not there, you'll find us in the lab, working on new string matching facilities for 1.6, SOAP [1], tkinter replacements for the masses, and whatever else we can come up with... see you! 1) http://www.newsalert.com/bin/story?StoryId=Coenz0bWbu0znmdKXqq From gstein at lyra.org Sat Dec 4 11:42:27 1999 From: gstein at lyra.org (Greg Stein) Date: Sat, 4 Dec 1999 02:42:27 -0800 (PST) Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order) In-Reply-To: <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> Message-ID: On Sat, 4 Dec 1999, Fredrik Lundh wrote: > M.-A. Lemburg wrote: >... > > Because importing in Python has become *much* more > > complicated over time. There are requests for new > > features which touch subjects such as storage mechanisms, > > lookups, signatures (for trusted code), lazy imports, etc. > > sorry, I still don't understand it. our applications already > use different storage mechanisms, databases, signatures, > lazy importing, version handling, etc, etc. now, if *we* > have managed to build all that on top of an old version > of imputil.py, how come it's not sufficient for the rest > of you? I agree. The imputil mechanism has been proven in combat to work for many scenarios. I have not (yet) heard of a case where the model has proven insufficient. > > A chain of simple minded importers won't work together > > too well > > why? it sure works for us... Exactly. "Why?" Please provide an example. >... > > and downgrade performance considerably due to the > > many recursive function calls > > now that's what I call premature optimization. and this > scares the hell out of me: if the rest of the python-dev > crowd don't seriously believe that Python is (or can be > made) fast enough to implement things like this, why > the heck are you using Python at all? 
am I the only > one here who doesn't believe in osterhout's talk about > "the great system vs. scripting language divide"? Don't worry Fredrik... I'm with you on this one. I do not believe there is a problem with the speed. Nobody has yet profiled imputil to find out where/how the time is being spent. Nobody has tried to speed it up. Therefore, any claims about its performance are simply FUD. I claim that its interface is correct, and you (Fredrik) stated it well: "given a name, please give me a module if you can (otherwise None)." Underneath that semantic, there are a lot of things that can be done to alter the performance and organization. Claims about speed are entirely premature. Yes, I'm biased. But, in truth, I haven't seen a better mechanism yet. I've tossed out a few ideas on how imputil could be improved (which are solely based on guess, rather than empirical evidence of profiling output). When those changes are completed and there is still an issue, then I'll admit defeat and wait for somebody else to provide a new design. Cheers, -g -- Greg Stein, http://www.lyra.org/ From marangoz at python.inrialpes.fr Sat Dec 4 12:15:53 1999 From: marangoz at python.inrialpes.fr (Vladimir Marangozov) Date: Sat, 4 Dec 1999 12:15:53 +0100 (CET) Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT] In-Reply-To: <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com> from "Fredrik Lundh" at Dec 04, 1999 10:54:42 AM Message-ID: <199912041115.MAA00539@python.inrialpes.fr> Fredrik Lundh wrote: > [snip] > > > > frankly, this "design by committee"... [snip] > ... see you! > > > C'mon /F, it's a battle of ideas and that's the way it works before filtering the good ones from the bad ones, then focusing on the appropriate implementation. I'm in sync with the discussion, although I haven't posted my partial notes on it due to lack of time. But let me say that overall, this discussion is a good thing and the more opinions we get, the better. 
BTW, you just _can't_ leave like this and start playing solitaire at the bar, first, because we need beer too and it's unlikely that you'll find a bar we don't know already, and second, because it was you who revived this discussion with 1 word, repeated 3 times: > Subject: Re: [Python-Dev] Python 1.6 status > Date: Wed, 17 Nov 1999 12:46:01 +0100 > > Guido van Rossum wrote: > > - suggestions for new issues that maybe ought to be settled in 1.6 > > three things: imputil, imputil, imputil > > Thus, with no visible argumentation (so don't shoot at others when they argue instead of you), and with this one word, you pushed Guido to the extreme of suggesting a complete redesign of the import machinery from scratch, based on a "Grand Architecture" :-). Right? -- Right! This is a fact and a fair amount of the credit goes entirely to you! Since then, however, I haven't really seen your arguments, and I believe that nobody here got exactly your point. I, for one, may well argue against imputil as being just another brick on top of the grand mess. But because I haven't made the time to write up my notes properly, I don't dare to express a partial opinion, nor to blame those who argue good or bad in the meantime, when I'm silent. So, why are you showing us your back when you have clearly something to say, but like me, you haven't made the time to say it? Please don't waste my time with emotional rants ;-). Everybody here tries to contribute according to their knowledge, experience and availability. Later, -- Vladimir MARANGOZOV | Vladimir.Marangozov at inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From mal at lemburg.com Sat Dec 4 11:45:52 1999 From: mal at lemburg.com (M.-A.
Lemburg) Date: Sat, 04 Dec 1999 11:45:52 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> Message-ID: <3848F0E0.B8132AD2@lemburg.com> Fredrik Lundh wrote: > > M.-A. Lemburg wrote: > > > given a name, check if you have a > > > module with that name, and install > > > it. if you cannot, fail (in which case > > > python asks the next importer along > > > the path). > > > > > > why do you have to complicate things beyond that? > > > why not just let Python provide a few base classes > > > and mixins for people who want to create custom > > > importers, and be done with it? > > > > Because importing in Python has become *much* more > > complicated over time. There are requests for new > > features which touch subjects such as storage mechanisms, > > lookups, signatures (for trusted code), lazy imports, etc. > > sorry, I still don't understand it. our applications already > use different storage mechanisms, databases, signatures, > lazy importing, version handling, etc, etc. now, if *we* > have managed to build all that on top of an old version > of imputil.py, how come it's not sufficient for the rest > of you? I've tried to get (an older) imputil.py version up and running too. It did work, but only after some considerable tweaking and even with integrated cache mechanisms did not reach the performance of the builtin importer (which doesn't use the kinds of caching strategies I had built into imputil.py). Getting the whole setup to work wasn't easy at all, because of the way imputil importers delegate work and things get even more confusing when it starts to "take over" certain parts of packages by installing themselves as importers for a particular package.
> > A chain of simple minded importers won't work together > > too well > > why? it sure works for us... An example: A path importer knows how to scan directories and how to use a path to tell the correct order. It can maybe also import .py/.pyc/.pyo files. Now what happens if it finds a shared lib as module... the usual imputil way would be to delegate the request to some other importer which can handle shared libs... but wait: how does the shared lib importer know where to look ? It will have to rescan the directories, etc... > > duplicate work > > avoiding duplicate work is what object oriented design > is all about. and last time I checked, Python had excellent > support for that. See my example above. The agent approach used by imputil does not support OO design too well: even though you can avoid duplicate programming work on the importers by using a few base classes which implement dir scans, shared lib imports, etc. the imputil design does not provide means to avoid duplicate actions taken by the importers. > > and downgrade performance considerably due to the > > many recursive function calls > > now that's what I call premature optimization. and this > scares the hell out of me: if the rest of the python-dev > crowd don't seriously believe that Python is (or can be > made) fast enough to implement things like this, why > the heck are you using Python at all? am I the only > one here who doesn't believe in osterhout's talk about > "the great system vs. scripting language divide"? Looks like you are in ranting mode here ;-) Seriously, I've checked my imputil.py version (with caches enabled) against the builtin importer and noticed a performance downgrade by factor >2. This was enough to convince me of looking for other techniques to handle the problems I had at the time... you know, relative imports and things. 
-- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 27 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal at lemburg.com Sat Dec 4 12:04:15 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Sat, 04 Dec 1999 12:04:15 +0100 Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT] References: <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com> <3847D766.1E5FFAF3@interet.com> <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com> Message-ID: <3848F52F.5F5B748F@lemburg.com> Fredrik Lundh wrote: > > > > frankly, this "design by committee" (or is it "design by > people who've never even been close to implementing > something because they thought it was too hard, and > thus think they're qualified to argue against those of > us who didn't even realize that it was a hard problem"?) Huh ? Two points: 1. How can you be sure that people haven't tried implementing their ideas and for various reasons have come to some conclusion about those ideas ? 2. Would you seriously disqualify people from joining a discussion by the simple argument that they have not implemented anything yet ? Just take the Unicode discussion as an example: it was very lively and resulted in a decent proposal which is now subject to further investigation by the implementors ;-) Many people have joined in even though they did not and/or will not implement anything. Still, their arguments were very useful to show up weaknesses in the proposal. Now, let's rather have a beer in the pub around the corner than go on ranting about :-). -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 27 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal at lemburg.com Sat Dec 4 12:53:33 1999 From: mal at lemburg.com (M.-A.
Lemburg) Date: Sat, 04 Dec 1999 12:53:33 +0100 Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order) References: Message-ID: <384900BD.D16E72BC@lemburg.com> Greg Stein wrote: > > > [me:] > > > A chain of simple minded importers won't work together > > > too well > > > > why? it sure works for us... > > Exactly. "Why?" Please provide an example. See my reply to Fredrik. > >... > > > and downgrade performance considerably due to the > > > many recursive function calls > > > > now that's what I call premature optimization. and this > > scares the hell out of me: if the rest of the python-dev > > crowd don't seriously believe that Python is (or can be > > made) fast enough to implement things like this, why > > the heck are you using Python at all? am I the only > > one here who doesn't believe in osterhout's talk about > > "the great system vs. scripting language divide"? > > Don't worry Fredrik... I'm with you on this one. I do not believe there is > a problem with the speed. Nobody has yet profiled imputil to find out > where/how the time is being spent. Nobody has tried to speed it up. Sorry, Greg, but that is simply not true. I've spent a few days trying to get more performance out of it and have succeeded, but in the end it wasn't enough to convince me of the approach. > Therefore, any claims about its performance are simply FUD. BTW, did anybody mention that an import manager wouldn't be able to provide an API which is useable for imputil style importers ? I'm not arguing against the possibility to use imputil style importers, just against making it the sole method of adding wisdom to Python imports. The imputil importers could well benefit from a manager providing logic to do basic things like importing shared libs, checking signatures, downloading modules from the web, etc.
-- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 27 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From gstein at lyra.org Sat Dec 4 13:15:13 1999 From: gstein at lyra.org (Greg Stein) Date: Sat, 4 Dec 1999 04:15:13 -0800 (PST) Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order) In-Reply-To: <384900BD.D16E72BC@lemburg.com> Message-ID: On Sat, 4 Dec 1999, M.-A. Lemburg wrote: >... > > Don't worry Fredrik... I'm with you on this one. I do not believe there is > > a problem with the speed. Nobody has yet profiled imputil to find out > > where/how the time is being spent. Nobody has tried to speed it up. > > Sorry, Greg, but that is simply not true. I've spent a few > days trying to get more performance out of it and have > succeeded, but in the end it wasn't enough to convince me > of the approach. You sent me your changes... I don't believe that you were aggressive enough. As I've mentioned before, I think it is quite possible to retain the general Importer style and get_code() interface, but to shift some functionality out (to be computed once) to a higher-level mechanism. The patches that you sent me did not do this, so I'm not surprised that you hit a wall. Ack. See? Now I'm getting into discussions about performance and implementation without truly knowing where the timing is spent. Eyeballing it, I have an idea, but it would be best to see a profile output. My mantra is always "90% of the time you're wrong about where 90% of the time is being spent." I am unconcerned about performance, but will work on it so that I don't need to continue this conversation. That burden is on me. > > Therefore, any claims about its performance are simply FUD. > > BTW, did anybody mention that an import manager wouldn't > be able to provide an API which is useable for imputil > style importers ?
I'm not arguing against the possibility > to use imputil style importers, just against making it the > sole method of adding wisdom to Python imports. Since the core will delegate out to Python (note: current working theory), then it certainly is not the "sole method" (since you can just replace the Python code). But there must be a default mechanism. The ihooks stuff was too complicated. imputil seems to be much easier. I'd love to see a third mechanism.... so I can steal ideas :-) > The imputil importers could well benefit from a manager > providing logic to do basic things like importing > shared libs, checking signatures, downloading modules > from the web, etc. For shared libs, yes. For the others: geez... I don't want to see that in the core infrastructure. Shift that out to specialized Importers. The infrastructure ought to be teeny and agnostic about how to map a module name to a module. Side note to python-dev people: I apologize... I realize that I'm beginning to get a bit defensive here. I'm going to be at XML '99 until Friday, so that should give me a breather. When I get back, I'll skip the talk and do some code. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein at lyra.org Sat Dec 4 13:32:04 1999 From: gstein at lyra.org (Greg Stein) Date: Sat, 4 Dec 1999 04:32:04 -0800 (PST) Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: <3848F0E0.B8132AD2@lemburg.com> Message-ID: On Sat, 4 Dec 1999, M.-A. Lemburg wrote: > Fredrik Lundh wrote: >... > > sorry, I still don't understand it. our applications already > > use different storage mechanisms, databases, signatures, > > lazy importing, version handling, etc, etc. now, if *we* > > have managed to build all that on top of an old version > > of imputil.py, how come it's not sufficient for the rest > > of you? > > I've tried to get (an older) imputil.py version up and running > too.
It did work, but only after some considerable tweaking > and even with integrated cache mechanisms did not reach > the performance of the builtin importer (which doesn't > use the kinds of caching strategies I had built into > imputil.py). 1) yes, it was an older version and did not have the PathImporter class. As a by-product, the DirectoryImporters that it *did* have were much slower. It still did not support builtins, frozen modules, or dynamic loads. All of that is present now, so it works "out of the box" much better. 2) Performance: as I wrote in the other email, I don't believe that is an argument against the design. The imputil approach *will* be slower than the current Python mechanism, but there is some more coding to do to truly see how much. The side benefits (e.g. ZipImporter and caching) may outweigh the result. Time will tell. > Getting the whole setup to work wasn't easy > at all, because of the way imputil importers delegate work > and things get even more confusing when it starts to "take > over" certain parts of packages by installing themselves > as importers for a particular package. I don't understand this. If it is relevant, then please expand. Thx. > > > A chain of simple minded importers won't work together > > > too well > > > > why? it sure works for us... > > An example: > > A path importer knows how to scan directories and how to use > a path to tell the correct order. It can maybe also import > .py/.pyc/.pyo files. Now what happens if it finds a shared > lib as module... the usual imputil way would be to delegate > the request to some other importer which can handle shared > libs... but wait: how does the shared lib importer know > where to look ? It will have to rescan the directories, > etc... No, the "usual imputil way" is that the PathImporter understands searching a path and loading stuff from that path. An Importer is a combination of locating and loading (since they are, typically, tightly bound).
The next rev will allow user-plugging of support for new file types. > > > duplicate work > > > > avoiding duplicate work is what object oriented design > > is all about. and last time I checked, Python had excellent > > support for that. > > See my example above. > > The agent approach used by imputil does not support > OO design too well: even though you can avoid duplicate > programming work on the importers by using a few > base classes which implement dir scans, shared lib > imports, etc. the imputil design does not provide > means to avoid duplicate actions taken by the importers. There is always a balance to be struck between independence and coupling. I chose to reduce coupling and increase independence. If you shift a bunch of stuff out of the Importers, then you will increase the coupling between the imputil framework and the Importers. That coupling will then close off future possibilities. Within the framework itself (e.g. between _import_hook and get_code), there is a lot of opportunity for change. Since that is behind the covers, it is no big deal to shift functionality around. I plan to do so. >... > Looks like you are in ranting mode here ;-) Seriously, > I've checked my imputil.py version (with caches enabled) > against the builtin importer and noticed a performance > downgrade by factor >2. This was enough to convince me > of looking for other techniques to handle the problems > I had at the time... you know, relative imports and things. I have run a long series of tests. Without doing any performance work on imputil, the ratio is 9 to 13. The 13 may have bumped up to about 15 or 16 when I added some dynamic loading code (I forget). Regardless, it is definitely less than a 2X increase. And that is with zero optimization. *shrug* I'm done. I'll do some code in a couple weeks. 
Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From gstein at lyra.org Sat Dec 4 14:12:32 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 05:12:32 -0800 (PST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912031439.JAA16524@eric.cnri.reston.va.us>
Message-ID:

On Fri, 3 Dec 1999, Guido van Rossum wrote:
>...
> Great response. I think we know where we each stand. Please go ahead
> with a new design. (That's trust, not carte blanche.)

Accepted gratefully. Thx.

> Just one thought: the more I think about it, the less I like
> sys.importers: functionality which is implemented through
> sys.importers must necessarily be placed either in front of all of
> sys.path or after it. While this is helpful for "canned" apps that
> want *everything* to be imported from a fixed archive, I think that
> for regular Python installations sys.path should remain the point of
> attack. In particular, installing a new package (e.g. PIL) should
> affect sys.path, regardless of the way of delivery of the modules
> (shared libs, .py files, .pyc files, or a zip archive).

Okay. I'll design with respect to this model.

To be explicit/clear and to be sure I'm hearing you right: sys.path may
contain Importer instances. Given the name FOO, the system will step
through sys.path looking for the first occurrence of FOO (looking in a
directory or delegating). FOO may be found with any number of
(configurable) file extensions, which are ordered (e.g. ".so" before
".py" before ".isl").

> I'm not too worried about code that inspects sys.path and expects
> certain invariants; that code is most likely interfering with the
> import mechanism so should be revisited anyway.

The Benevolent Dictator has spoken. So be it. :-)

> On the lone .pyc issue: I'd like to see this disappear when using the
> filesystem, I see no use for it there if we support .pyc files in zip
> archives.

No problem. This actually creates a simplification in the system, as I'm
seeing it now.
I'm also seeing opportunities for a code reorg which may work towards
MAL's issues with performance.

I hope to have something in two or three weeks. I also hope people can be
patient :-), but I certainly wouldn't mind seeing some alternative code!

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From gmcm at hypernet.com Sat Dec 4 15:59:44 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Sat, 4 Dec 1999 09:59:44 -0500
Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order)
In-Reply-To: <384900BD.D16E72BC@lemburg.com>
Message-ID: <1267803104-11215142@hypernet.com>

M.-A. Lemburg wrote:
> Greg Stein wrote:
> > Don't worry Fredrik... I'm with you on this one. I do not
> > believe there is a problem with the speed. Nobody has yet
> > profiled imputil to find out where/how the time is being spent.
> > Nobody has tried to speed it up.
>
> Sorry, Greg, but that is simply not true. I've spent a few
> days on trying to get more performance out of it and have
> succeeded, but in the end it wasn't enough to convince me
> of the approach.

Remember those comparisons of Perl and Python, to which you added
cgipython? I've added to the list a version that uses an old version of
imputil (probably the one you optimized) and a compressed std lib. Note
that my Linux python (1.5.2) is built in the RedHat style - even struct
and strop are .so's; so that accounts for the majority of the open calls.
This is a full Python (runs code.py if you don't pass it a script name).
For lack of a better name, I've called it "pykit".

First, the size of log files (in lines), i.e.
number of system calls:

                Solaris   Linux   IRIX[1]
    Perl             88      85        70
    Python          425     316       257
    cgipython               182
    pykit                   136

Next, the number of "open" calls:

                Solaris   Linux   IRIX
    Perl             16      10         9
    Python          107      71        48
    cgipython                33
    pykit                     9

And the number of unsuccessful "open" calls:

                Solaris   Linux   IRIX
    Perl              6       1         3
    Python           77      49        32
    cgipython                28
    pykit                     2

Number of "mmap" calls:

                Solaris   Linux   IRIX
    Perl             25      25         1
    Python           36      24         1
    cgipython                13
    pykit                    21

This test would show off more if it went beyond startup. An import of a
standard lib module in my stock Python involves 2 failed stats and 6
failed opens, then 2 successful opens and 2 fstats before the module is
loaded. None of these occur in pykit. The downside (asking my Importer
for a .so or a module not in the importer) takes no system calls, and
involves a dozen or so lines of Python and a check of a dictionary.

- Gordon

From tismer at appliedbiometrics.com Sat Dec 4 16:29:03 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sat, 04 Dec 1999 16:29:03 +0100
Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order)
References:
Message-ID: <3849333F.1DF2A201@appliedbiometrics.com>

Greg Stein wrote:
...
> My mantra is always "90% of the time you're wrong about where 90%
> of the time is being spent."

What a great sentence! We all know it, but many of us (especially me)
forget about it during 90% of our coding time. Much better to spend this
on design (as you did).

thanks - chris

--
Christian Tismer             :^)
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home

From jim at interet.com Sat Dec 4 18:27:44 1999
From: jim at interet.com (James C.
Ahlstrom) Date: Sat, 04 Dec 1999 12:27:44 -0500 Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT] References: <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com> <3847D766.1E5FFAF3@interet.com> <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com> Message-ID: <38494F10.C644BA7@interet.com> Fredrik Lundh wrote: > > James C. Ahlstrom wrote: > > IMHO putting shared libs in an archive is a bad idea because the OS Dear Fredrik, I thought the point of Python-Dev was to propose designs and get feedback, right? Well, I got feedback :-). OK, I agree to alter my archive format so it provides the ability to store shared libs and not just *.pyd. I will add the string length and if needed a flag indicating the name is a shared lib. Now the details: > have you tried it? if not, why do you think you should > be allowed to forbid others from doing it? Yes I have tried it, and I am currently on my fourth version of an archive format which is based on formats by Greg Stein and Gordon McMillan. I hope it meets with the favor of the Grand Inquisition, and becomes the standard format. But maybe it won't. Oh well. > bloody installers. and here you are advocating that > we all should be forced to use installers, when python > makes it trivial to write self-installing apps. double-argh! I am not forcing anyone to do anything, only proposing that shared libs are best handled directly by imputil and not the class within imputil which handles archive files. It is just a geeky design issue, nothing more. JimA From jim at interet.com Sat Dec 4 19:31:48 1999 From: jim at interet.com (James C. 
Ahlstrom)
Date: Sat, 04 Dec 1999 13:31:48 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <38479573.B2CFDD2B@lemburg.com>
 <199912031345.IAA16376@eric.cnri.reston.va.us>
 <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com>
 <3847D284.8CBF2A9C@lemburg.com>
 <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com>
 <3848F0E0.B8132AD2@lemburg.com>
Message-ID: <38495E14.9C2FB107@interet.com>

"M.-A. Lemburg" wrote:
> An example:
>
> A path importer knows how to scan directories and how to use
> a path to tell the correct order. It can maybe also import
> .py/.pyc/.pyo files. Now what happens if it finds a shared
> lib as module... the usual imputil way would be to delegate
> the request to some other importer which can handle shared
> libs... but wait: how does the shared lib importer know
> where to look ? It will have to rescan the directories,
> etc...

The above refers to an earlier but still very recent version of imputil.
On that basis it is perfectly accurate.

Here is another example from my own experience almost identical to the
above:

One possible archive file format holds its list of archived
*.pyc file names as keys in a dictionary. This is simple and
efficient, but fails to correctly address the problem of shared
libs (aka DLL's in Windows) with names identical to names of
*.pyc files in the archive. For example, suppose foo.pyc is in the
archive, and foo.dll is in a directory. Suppose sys.path is to be
used to decide whether to load foo.pyc or foo.dll. Then an
"archive importer" will fail to do this. Specifically you can't
see if foo.pyc is in the archive and then check sys.path, nor can
you do the reverse. You must call the "archive importer" repeatedly
for each element of sys.path and search the directory at the same time.

JimA

From jim at interet.com Sat Dec 4 20:51:47 1999
From: jim at interet.com (James C.
Ahlstrom)
Date: Sat, 04 Dec 1999 14:51:47 -0500
Subject: [Python-Dev] Import redesign [LONG]
References:
Message-ID: <384970D3.26A9ECDB@interet.com>

Greg Stein wrote:
>
> On Fri, 3 Dec 1999, Guido van Rossum wrote:
> > attack. In particular, installing a new package (e.g. PIL) should
> > affect sys.path, regardless of the way of delivery of the modules
> > (shared libs, .py files, .pyc files, or a zip archive).

> To be explicit/clear and to be sure I'm hearing you right: sys.path may
> contain Importer instances. Given the name FOO, the system will step
> through sys.path looking for the first occurrence of FOO (looking in a
> directory or delegating). FOO may be found with any number of
> (configurable) file extensions, which are ordered (e.g. ".so" before
> ".py" before ".isl").

This is basically a gripe about this design spec. So if the answer
turns out to be "we need this functionality so shut up" then just
say that and don't flame me.

This spec is painful. Suppose sys.path has 10 elements, and there
are six file extensions. Then the simple algorithm is slow:

    for path in sys.path:   # Yikes, may not be a string!
        for ext in file_extensions:
            name = "%s.%s" % (module_name, ext)
            full_path = os.path.join(path, name)
            if os.path.isfile(full_path):
                # Process file here

And sys.path can contain class instances
which only makes things slower. You could do a readdir() and cache
the results, but maybe that would be slower. A better
algorithm might be faster, but a lot more complicated.

In the context of archive files, it is also painful. It prevents
you from saving a single dictionary of module names. Instead you
must have len(sys.path) dictionaries. You could try to
save in the archive information about whether (say) a foo.dll was
present in the file system, but the list of extensions is extensible.

The above problem only exists to support equally-named modules; that
is, to support a run-time choice of whether to load foo.pyc, foo.dll,
foo.isl, etc.
I claim (without having written it) that the fastest
algorithm to solve the unique-name case is much faster than the fastest
algorithm to solve the choose-among-equal-names case.

Do we really need to support the equal-name case [Jim runs for
cover...]?
If so, how about inventing a new way to support it. Maybe if equal
names exist, these must be pre-loaded from a known location?

JimA

From gstein at lyra.org Sat Dec 4 22:59:00 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 13:59:00 -0800 (PST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <384970D3.26A9ECDB@interet.com>
Message-ID:

On Sat, 4 Dec 1999, James C. Ahlstrom wrote:
> Greg Stein wrote:
>...
> > To be explicit/clear and to be sure I'm hearing you right: sys.path may
> > contain Importer instances. Given the name FOO, the system will step
> > through sys.path looking for the first occurrence of FOO (looking in a
> > directory or delegating). FOO may be found with any number of
> > (configurable) file extensions, which are ordered (e.g. ".so" before
> > ".py" before ".isl").
>
> This is basically a gripe about this design spec. So if the answer
> turns out to be "we need this functionality so shut up" then just
> say that and don't flame me.
>
> This spec is painful. Suppose sys.path has 10 elements, and there
> are six file extensions. Then the simple algorithm is slow:
>     for path in sys.path:   # Yikes, may not be a string!
>         for ext in file_extensions:
>             name = "%s.%s" % (module_name, ext)
>             full_path = os.path.join(path, name)
>             if os.path.isfile(full_path):
>                 # Process file here

This is the algorithm that Python uses today, and my standard Importers
follow.

> And sys.path can contain class instances
> which only makes things slower.

IMO, we don't know this, or whether it is significant.

> You could do a readdir() and cache
> the results, but maybe that would be slower. A better
> algorithm might be faster, but a lot more complicated.

Who knows.
BUT: the import process is now in Python -- it makes it *much* easier to run these experiments. We could not really do this when the import process is "hard-coded" in C code. > In the context of archive files, it is also painful. It prevents > you from saving a single dictionary of module names. Instead you > must have len(sys.path) dictionaries. You could try to > save in the archive information about whether (say) a foo.dll was > present in the file system, but the list of extensions is extensible. I am not following this. What/where is the "single dictionary of module names" ? Are you referring to a cache? Or is this about building an archive? An archive would look just like we have now: map a name to a module. It would not need multiple dictionaries. > The above problem only exists to support equally-named modules; that > is, to support a run-time choice of whether to load foo.pyc, foo.dll, > foo.isl, etc. I claim (without having written it) that the fastest > algorithm to solve the unique-name case is much faster than the fastest > algorithm to solve the choose-among-equal-names case. > > Do we really need to support the equal-name case [Jim runs for > cover...]? > If so, how about inventing a new way to support it. Maybe if equal > names exist, these must be pre-loaded from a known location? I don't understand what the problem is. I don't see one. We are still mapping a name to a module. sys.path defines a precedence. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein at lyra.org Sun Dec 5 02:17:57 1999 From: gstein at lyra.org (Greg Stein) Date: Sat, 4 Dec 1999 17:17:57 -0800 (PST) Subject: [Python-Dev] pyc archives (was: .DLL vs .PYD search order) In-Reply-To: <38495E14.9C2FB107@interet.com> Message-ID: On Sat, 4 Dec 1999, James C. Ahlstrom wrote: >... > One possible archive file format holds its list of archived > *.pyc file names as keys in a dictionary. 
This is simple and > efficient, but fails to correctly address the problem of shared > libs (aka DLL's in Windows) with names identical to names of > *.pyc files in the archive. For example, suppose foo.pyc is in the > archive, and foo.dll is in a directory. Suppose sys.path is to be > used to decide whether to load foo.pyc or foo.dll. Then an > "archive importer" will fail to do this. Specifically you can't > see if foo.pyc is in the archive and then check sys.path, nor can > you do the reverse. You must call the "archive importer" repeatedly > for each element of sys.path and search the directory at the same time. What? The archive is independent of each .pyc's original position in sys.path. There is no reason/need to carry that information into an archive. If the archive contains "foo", then you're done. If it doesn't, then move on to the next element of sys.path (directory or Importer instance) and look there. Basically: if you deploy an archive, then all of its files will take precedence over any file found later on sys.path. This is exactly what sys.path is about: establishing precedence. If I understand you correctly, then you're trying to say there is some sort of interleaving that must occur. If so, then I don't understand why. 
Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From fredrik at pythonware.com Mon Dec 6 13:20:34 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 6 Dec 1999 13:20:34 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <38479573.B2CFDD2B@lemburg.com>
 <199912031345.IAA16376@eric.cnri.reston.va.us>
 <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com>
 <3847D284.8CBF2A9C@lemburg.com>
 <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com>
 <3848F0E0.B8132AD2@lemburg.com>
 <02eb01bf3f4d$c200d790$f29b12c2@secret.pythonware.com>
 <384B7E32.F7B81D82@lemburg.com>
Message-ID: <004401bf3fe4$4cab6ea0$f29b12c2@secret.pythonware.com>

> > you obviously attempted to use imputil to implement
> > non-standard import behaviour on top of the standard
> > storage system -- while we've used it to implement
> > standard import behaviour on top of non-standard
> > storage systems.
>
> No, I tried to make the imputil approach work as replacement
> for the standard builtin importer.

I'm confused. earlier, you said (or rather, I think you
said) that you looked at imputil to see if it could "handle
the problems you had at the time"... and now you say that
you tried to use it as a drop-in replacement for the
"standard path importer". I must be missing something
here...

> After I got that to work, I added some caching
> to avoid duplicated stats. The resulting importer was
> around twice as slow as the builtin one for the following
> imports:
>
> # the default one Python does at startup, plus:
> from mx import HTMLTools,DateTime,ODBC
>
> This is a pretty common setup for my scripts, so its
> performance is relevant to me.

did you try stuffing all your PYC's into an archive
file, and running them from there?
From fredrik at pythonware.com Sun Dec 5 19:22:57 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sun, 5 Dec 1999 19:22:57 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> <3848F0E0.B8132AD2@lemburg.com> Message-ID: <02eb01bf3f4d$c200d790$f29b12c2@secret.pythonware.com> > I've checked my imputil.py version (with caches enabled) > against the builtin importer and noticed a performance > downgrade by factor >2. This was enough to convince me > of looking for other techniques to handle the problems > I had at the time... you know, relative imports and things. hmm. I think I see the problem here... you obviously attempted to use imputil to implement non-standard import behaviour on top of the standard storage system -- while we've used it to implement standard import behaviour on top of non-standard storage systems. I don't know if imputil is good enough for the former, and I don't think I care... I've spent too many nights debugging code that relied on clever, non-standard hacks. PS. on the performance side of things, did you know that 're' can be up to ten times slower than 'regex'? but people don't complain -- probably because it allows them to do things they couldn't do before... From jim at interet.com Mon Dec 6 20:40:01 1999 From: jim at interet.com (James C. Ahlstrom) Date: Mon, 06 Dec 1999 14:40:01 -0500 Subject: [Python-Dev] Re: pyc archives (was: .DLL vs .PYD search order) References: Message-ID: <384C1111.92984B5A@interet.com> Greg Stein wrote: > > On Sat, 4 Dec 1999, James C. Ahlstrom wrote: > >... > > One possible archive file format holds its list of archived > > *.pyc file names as keys in a dictionary. This is simple and > > efficient, but fails to correctly address the problem of shared > What? 
The archive is independent of each .pyc's original position in > sys.path. There is no reason/need to carry that information into an > archive. > > If the archive contains "foo", then you're done. If it doesn't, then move > on to the next element of sys.path (directory or Importer instance) and > look there. > > Basically: if you deploy an archive, then all of its files will take > precedence over any file found later on sys.path. This is exactly what > sys.path is about: establishing precedence. Sorry, I am a little slow today. My daughter got me up at 6 am to work on her computer video editor. No disk space, fragmentation, 2 gig limit on AVI files, ........ Are you saying this? If foo is imported, the archive importer is consulted first to see if it can provide foo. If not, sys.path is searched for foo.pyc, foo.pyl etc., and if foo.pyl is found, then its contents are added to the single archive importer dictionary. The order of addition to the archive dictionary is determined by sys.path, and duplicate names are not entered because they lie later on sys.path. But once a file is recognized as in an archive, it effectively precedes all of sys.path. Or this? If foo is imported, sys.path is searched for foo.pyc, foo.pyl, etc., and also all archive files found at each element of sys.path are searched for foo. If "bar" is imported, it may be found in foo.pyl. That is, there is an instance of an archive importer for each element of sys.path. What if the user names an archive file not on sys.path? What order does it have? JimA From jim at interet.com Mon Dec 6 19:34:41 1999 From: jim at interet.com (James C. Ahlstrom) Date: Mon, 06 Dec 1999 13:34:41 -0500 Subject: [Python-Dev] Import redesign [LONG] References: Message-ID: <384C01C1.8D1AFFFF@interet.com> Greg Stein wrote: > > On Sat, 4 Dec 1999, James C. Ahlstrom wrote: > > # Process file here > > This is the algorithm that Python uses today, and my standard Importers > follow. Agreed. 
> > And sys.path can contain class instances > > which only makes things slower. > > IMO, we don't know this, or whether it is significant. Agreed. > > You could do a readdir() and cache > > the results, but maybe that would be slower. A better > > algorithm might be faster, but a lot more complicated. > > Who knows. BUT: the import process is now in Python -- it makes it *much* > easier to run these experiments. We could not really do this when the > import process is "hard-coded" in C code. Agreed. > > In the context of archive files, it is also painful. It prevents > > you from saving a single dictionary of module names. Instead you > > must have len(sys.path) dictionaries. You could try to > > save in the archive information about whether (say) a foo.dll was > > present in the file system, but the list of extensions is extensible. > > I am not following this. What/where is the "single dictionary of module > names" ? Are you referring to a cache? Or is this about building an > archive? > > An archive would look just like we have now: map a name to a module. It > would not need multiple dictionaries. The "single dictionary of names" is in the single archive importer instance and has nothing to do with creating the archive. It is currently programmed this way. Suppose the user specifies by name 12 archive files to be searched. That is, the user hacks site.py to add archive names to the importer. The "single dictionary" means that the archive importer takes the 12 dictionaries in the 12 files and merges them together into one dictionary in order to speed up the search for a name. The good news is you can always just call the archive importer to get a module. The bad news is you can't do that for each entry on sys.path because there is no necessary identity between archive files and sys.path. 
The user specified the archive files by name, and they may or may not be on sys.path, and the user may or may not have specified them in the same order as sys.path even if they are. Suppose archive files must lie on sys.path and are processed in order. Then to find them you must know their name. But IMHO you want to avoid doing a readdir() on each element of sys.path and looking for files *.pyl. Suppose archive file names in general are the known name "lib.pyl" for the Python library, plus the names "package.pyl" where "package" can be the name of a Python package as a single archive file. Then if the user tries to import foo, imputil will search along sys.path looking for foo.pyc, foo.pyl, etc. If it finds foo.pyl, the archive importer will add it to its list of known archive files. But it must not add it to its single dictionary, because that would destroy the information about its position along sys.path. Instead, it must keep a separate dictionary for each element of sys.path and search the separate dictionaries under control of imputil. That is, get_code() needs a new argument for the element of sys.path being searched. Alternatively, you could create a new importer instance for each archive file found, but then you still have multiple dictionaries. They are in the multiple instances. All this is needed only to support import of identically named modules. If there are none, there is no problem because sys.path is being used only to find modules, not to disambiguate them. See also my separate reply to your other post which discusses this same issue. JimA From gstein at lyra.org Tue Dec 7 01:43:21 1999 From: gstein at lyra.org (Greg Stein) Date: Mon, 6 Dec 1999 16:43:21 -0800 (PST) Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: <384C01C1.8D1AFFFF@interet.com> Message-ID: On Mon, 6 Dec 1999, James C. Ahlstrom wrote: > Greg Stein wrote: >... > > I am not following this. What/where is the "single dictionary of module > > names" ? 
Are you referring to a cache? Or is this about building an > > archive? > > > > An archive would look just like we have now: map a name to a module. It > > would not need multiple dictionaries. > > The "single dictionary of names" is in the single archive importer > instance and has nothing to do with creating the archive. It > is currently programmed this way. Ah. There is the problem. In Guido's suggestion for the "next path of inquiry" :-), there is no "single dictionary of names". Instead, you have Importer instances as items in sys.path. Each instance maintains its dictionary, and they are not (necessarily) combined. If we were to combine them, then we would need to maintain the ordering requirements implied by sys.path. However, this would be problematic if sys.path changed -- we would have to detect the situation and rebuild a merged dict. > Suppose the user specifies by name 12 archive files to be searched. > That is, the user hacks site.py to add archive names to the importer. > The "single dictionary" means that the archive importer takes the 12 > dictionaries in the 12 files and merges them together into one > dictionary > in order to speed up the search for a name. The good news is you can > always just call the archive importer to get a module. The bad news is > you can't do that for each entry on sys.path because there is no > necessary identity between archive files and sys.path. The user > specified the archive files by name, and they may or may not be on > sys.path, and the user may or may not have specified them in the > same order as sys.path even if they are. The importer must be inserted into sys.path to establish a precedence. If the user wants to add 12 libraries... fine. But *all* of those modules will fall under a precedence defined by the Importer's position on sys.path. > Suppose archive files must lie on sys.path and are processed in order. > Then to find them you must know their name. 
But IMHO you want to
> avoid doing a readdir() on each element of sys.path and looking for
> files *.pyl.

I do not believe that we will arbitrarily locate and open library files.
They must be specified explicitly.

> Suppose archive file names in general are the known name "lib.pyl"
> for the Python library, plus the names "package.pyl" where "package"
> can be the name of a Python package as a single archive file. Then
> if the user tries to import foo, imputil will search along sys.path
> looking for foo.pyc, foo.pyl, etc. If it finds foo.pyl, the archive
> importer will add it to its list of known archive files. But it must
> not add it to its single dictionary, because that would destroy the
> information about its position along sys.path. Instead, it must keep
> a separate dictionary for each element of sys.path and search the
> separate dictionaries under control of imputil. That is, get_code()
> needs a new argument for the element of sys.path being searched.
> Alternatively, you could create a new importer instance for each
> archive file found, but then you still have multiple dictionaries.
> They are in the multiple instances.

If the user installs ".pyl" as a recognized extension (i.e. installs into
the PathImporter), then the above scenario is possible. In my
in-head-design, I had not imagined any state being retained for
extension-recognizer hooks. Of course, state can be retained simply by
using a bound-method for the hook function.

get_code() would not need to change. The foo.pyl would be consulted at the
appropriate time based on where it is found in sys.path. Note that
file-extension hooks would definitely have a complete path to the target
file. Those are not Importers, however (although they will closely follow
the get_code() hook since the extension is called from get_code).
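
The lookup order discussed in this message can be sketched roughly as
follows. This is only an illustration written in present-day Python, not
imputil code: DictImporter, find_module, the ".pyl" labels and the
extension list are all hypothetical names chosen for the sketch. It
assumes a search path that mixes directory strings with importer objects,
each importer keeping its own (unmerged) dictionary, and the first entry
that supplies the name wins.

```python
import os


class DictImporter:
    """Hypothetical archive-style importer owning one name -> code mapping."""

    def __init__(self, label, modules):
        self.label = label
        self.modules = dict(modules)

    def get_code(self, name):
        # Return the stored entry, or None so the search moves on.
        return self.modules.get(name)


def find_module(name, search_path, extensions=(".so", ".py")):
    """Walk search_path in order; the first entry providing `name` wins."""
    for entry in search_path:
        if isinstance(entry, str):
            # Plain directory: try each extension in its configured order.
            for ext in extensions:
                candidate = os.path.join(entry, name + ext)
                if os.path.isfile(candidate):
                    return candidate
        else:
            # Importer instance: one dictionary lookup, no merging.
            found = entry.get_code(name)
            if found is not None:
                return (entry.label, found)
    return None


first = DictImporter("first.pyl", {"foo": "code for foo from first"})
second = DictImporter("second.pyl", {"foo": "shadowed", "bar": "code for bar"})
path = [first, "/nonexistent-dir-for-sketch", second]

print(find_module("foo", path))  # ('first.pyl', 'code for foo from first')
print(find_module("bar", path))  # ('second.pyl', 'code for bar')
print(find_module("baz", path))  # None
```

Because each importer is consulted only at its own position, reordering
the list changes which "foo" is found without rebuilding any shared
dictionary.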
From tim_one at email.msn.com Tue Dec 7 06:11:25 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 7 Dec 1999 00:11:25 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: <02eb01bf3f4d$c200d790$f29b12c2@secret.pythonware.com> Message-ID: <001601bf4071$8278cc20$88a0143f@tim> [/F] > PS. on the performance side of things, did you know > that 're' can be up to ten times slower than 'regex'? > but people don't complain -- probably because it > allows them to do things they couldn't do before... Bad example: people do complain about this. Those who care a lot continue to use regex, temporarily pacified by the promise that re.py will get recoded in C and thus regain a good chunk of regex's speed. Those who care a whale of a lot continue to use Perl <0.9 wink>. From guido at CNRI.Reston.VA.US Tue Dec 7 13:45:25 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 07 Dec 1999 07:45:25 -0500 Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: Your message of "Mon, 06 Dec 1999 16:43:21 PST." References: Message-ID: <199912071245.HAA21596@eric.cnri.reston.va.us> > If we were to combine them, then we would need to maintain the ordering > requirements implied by sys.path. However, this would be problematic if > sys.path changed -- we would have to detect the situation and rebuild a > merged dict. No need to worry about this: just don't merge the caches. Compared to the hundreds of failed open() calls that are done now, it's no big deal to do 12 failed Python dictionary lookups instead of one. 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From fredrik at pythonware.com Tue Dec 7 14:25:54 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 7 Dec 1999 14:25:54 +0100
Subject: [Python-Dev] Import redesign [LONG]
References:
Message-ID: <020a01bf40b6$97dafd00$f29b12c2@secret.pythonware.com>

Greg Stein wrote:
> > The "single dictionary of names" is in the single archive importer
> > instance and has nothing to do with creating the archive. It
> > is currently programmed this way.
>
> Ah. There is the problem. In Guido's suggestion for the "next path of
> inquiry" :-), there is no "single dictionary of names". Instead, you have
> Importer instances as items in sys.path. Each instance maintains its
> dictionary, and they are not (necessarily) combined.

so the "sys.path contains importers (or strings)" strategy is now officially sanctioned? cool!!!

(a quick look in our code base says that this will cause some trouble, unless os.path.isdir() is modified to reject non-strings... after all, if it's not a string, it cannot be a valid directory path, so this does make some sense ;-)

another aside: can we have a standard mechanism for listing the contents of a given archive, please? we have a lot of "path scanning" stuff (PIL and PST, among others), and it would be great if things didn't break down if you stuff it all in an archive.

something like:

    for path in sys.path:
        if os.path.isdir(path):
            files = os.listdir(path)
        else:
            try:
                files = path.listdir()
            except AttributeError:
                files = None
        if files is None:
            # no idea what's in here
            pass
        else:
            # path provides (at least) these modules
            pass

would be really useful.

and yes, it shouldn't have to be mentioned, since squeeze has done it since early 1997, but archive importers should provide a standard way to include non-module resources in the archive, and a standard way to access such resources as ordinary python streams. e.g.:

    file = path.open(name, "rb")

or something...
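
A rough sketch of what such a path item might look like, written in present-day Python. Everything here is an assumption made for illustration: the class name `ZipPathEntry` is invented, and zip is used as the container only because the standard `zipfile` module makes the example short (the thread has not settled on an archive format). The `listdir()` and `open(name, "rb")` methods follow the shape suggested in the message above.

```python
import io
import os
import zipfile


class ZipPathEntry:
    """Hypothetical sys.path item backed by a zip archive (an assumption)."""

    def __init__(self, archive):
        self._zip = zipfile.ZipFile(archive)

    def listdir(self):
        # Names of everything stored in the archive, modules or not.
        return self._zip.namelist()

    def open(self, name, mode="rb"):
        if mode != "rb":
            raise ValueError("only binary read access is sketched here")
        # Hand the resource back as an ordinary in-memory stream.
        return io.BytesIO(self._zip.read(name))


def list_entries(path_items):
    """Follow the loop from the message: directories first, then listdir()."""
    result = []
    for item in path_items:
        # Only strings can be directory paths; anything else is asked directly.
        if isinstance(item, str) and os.path.isdir(item):
            files = os.listdir(item)
        else:
            try:
                files = item.listdir()
            except AttributeError:
                files = None  # no idea what's in here
        result.append(files)
    return result


def build_demo_archive():
    """Create a small in-memory archive holding a module and a resource."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("spam.py", "x = 1\n")
        archive.writestr("logo.gif", b"GIF89a")
    buffer.seek(0)
    return ZipPathEntry(buffer)
```

The explicit string check mirrors the remark about os.path.isdir() and non-strings: a path-scanning tool can then treat directories and archive objects uniformly without caring which one it was handed.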
From jim at interet.com Tue Dec 7 16:20:15 1999 From: jim at interet.com (James C. Ahlstrom) Date: Tue, 07 Dec 1999 10:20:15 -0500 Subject: [Python-Dev] Import redesign [LONG] References: <199912071245.HAA21596@eric.cnri.reston.va.us> Message-ID: <384D25AF.4C4F5107@interet.com> Guido van Rossum wrote: > No need to worry about this: just don't merge the caches. Compared to > the hundreds of failed open() calls that are done now, it's no big > deal to do 12 failed Python dictionary lookups instead of one. Agreed. JimA From jim at interet.com Tue Dec 7 16:31:30 1999 From: jim at interet.com (James C. Ahlstrom) Date: Tue, 07 Dec 1999 10:31:30 -0500 Subject: [Python-Dev] Import redesign [LONG] References: Message-ID: <384D2852.3C36C216@interet.com> Greg Stein wrote: > Ah. There is the problem. In Guido's suggestion for the "next path of > inquiry" :-), there is no "single dictionary of names". Instead, you have > Importer instances as items in sys.path. Each instance maintains its > dictionary, and they are not (necessarily) combined. > [A large number of other design issues] OK, all design issues agreed. I will make needed changes. JimA From jim at interet.com Tue Dec 7 16:37:36 1999 From: jim at interet.com (James C. Ahlstrom) Date: Tue, 07 Dec 1999 10:37:36 -0500 Subject: [Python-Dev] Import redesign [LONG] References: <020a01bf40b6$97dafd00$f29b12c2@secret.pythonware.com> Message-ID: <384D29C0.3D3A2194@interet.com> Fredrik Lundh wrote: > another aside: can we have a standard mechanism for > listing the contents of a given archive, please? I will add this. > and yes, it shouldn't have to be mentioned, since squeeze > have done it since early 1997, but archive importers should > provide a standard way to include non-module resources in > the archive, and a standard way to access such resources > as ordinary python streams. I will add this. 
JimA From gstein at lyra.org Tue Dec 7 17:53:49 1999 From: gstein at lyra.org (Greg Stein) Date: Tue, 7 Dec 1999 08:53:49 -0800 (PST) Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: <199912071245.HAA21596@eric.cnri.reston.va.us> Message-ID: On Tue, 7 Dec 1999, Guido van Rossum wrote: > > If we were to combine them, then we would need to maintain the ordering > > requirements implied by sys.path. However, this would be problematic if > > sys.path changed -- we would have to detect the situation and rebuild a > > merged dict. > > No need to worry about this: just don't merge the caches. Compared to > the hundreds of failed open() calls that are done now, it's no big > deal to do 12 failed Python dictionary lookups instead of one. Have no fear... I wasn't planning on this... complicates too much stuff for too little gain. Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido at CNRI.Reston.VA.US Wed Dec 8 13:07:31 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 08 Dec 1999 07:07:31 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Wed, 08 Dec 1999 02:46:02 EST." <000201bf4150$46749da0$5aa2143f@tim> References: <000201bf4150$46749da0$5aa2143f@tim> Message-ID: <199912081207.HAA00040@eric.cnri.reston.va.us> [Great analysis, Tim!] > 4) The audience is Python end-users "in general", and the product is pure > Python. I think this is the most important one for Distutils to address, > and compilation isn't a part of it. So far, though, what Gordon is doing > seems more appropriate than what Distutils has been up to. I hope his work > gets folded into this. I'm not sure what stuff by which Gordon you're referring to. I am only familiar with his installer, which I thought is win32 only (but I may be mistaken) and is an installer for a whole application, not just a bunch of modules. Please correct me if I'm wrong. 
But this reminds me of a different issue, which Jim Ahlstrom has been hammering about before: there's a completely separate set of cases where what you are distributing is a stand-alone application, and the target consists of end users who are entirely uninterested in whether it's written in Python, C or Elvish. (And then there's still the distinction between Win32, Unix or both.) The current distutils tools don't deal with this at all. I think it should though, and I think its framework is powerful enough to be able to add this, e.g. as a new "appdist" command. --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm at hypernet.com Wed Dec 8 15:16:07 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Wed, 8 Dec 1999 09:16:07 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <199912081207.HAA00040@eric.cnri.reston.va.us> References: Your message of "Wed, 08 Dec 1999 02:46:02 EST." <000201bf4150$46749da0$5aa2143f@tim> Message-ID: <1267460464-31845181@hypernet.com> Guido wrote: > [Great analysis, Tim!] > > > 4) The audience is Python end-users "in general", and the > > product is pure Python. I think this is the most important one > > for Distutils to address, and compilation isn't a part of it. > > So far, though, what Gordon is doing seems more appropriate > > than what Distutils has been up to. I hope his work gets > > folded into this. > > I'm not sure what stuff by which Gordon you're referring to. I > am only familiar with his installer, which I thought is win32 > only (but I may be mistaken) and is an installer for a whole > application, not just a bunch of modules. Please correct me if > I'm wrong. It needed a name. I hate the word "Installer", but it expresses in one word the most common use of my stuff. I'll be releasing a beta for Linux real soon. Only some of the tricks are Windows only (such as self-extracting executables, which is only culturally appropriate on Windows, anyway).
But more importantly it's not just for installing. The Python I use (interactively) on my wife's machine is 1 directory with about 6 files in it. On my Linux box I've been using the std lib in a .pyz for about a month now. Someone distributing a pure Python package could instead ship 3 files (imputil.py, archive.py and .pyz) with the "install" consisting of adding one line to site.py in the user's perfectly normal Python installation. And yeah, I solved the "manifest" problem, too. Mine predates Distutils, so don't accuse me of duplicate effort, (I pointed them to it a couple times). It uses ConfigParser and a config file, so it allows finer control. While .pyz's are completely cross-platform, I have yet to work out endianness issues in the other archive I use (which should probably be zip format - it can hold anything). And at the "Installer" end, I have yet to work out how things should work on non-ELF/COFF platforms (where I can't append the archive to the executable). But there aren't any technical issues involved; just lack of time. So no, it's not just for Windows; and no, it's not just for creating standalones (though that's what almost everyone uses it for). - Gordon From guido at CNRI.Reston.VA.US Wed Dec 8 15:56:42 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 08 Dec 1999 09:56:42 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Wed, 08 Dec 1999 09:16:07 EST." <1267460464-31845181@hypernet.com> References: Your message of "Wed, 08 Dec 1999 02:46:02 EST." <000201bf4150$46749da0$5aa2143f@tim> <1267460464-31845181@hypernet.com> Message-ID: <199912081456.JAA00200@eric.cnri.reston.va.us> > It needed a name. I hate the word "Installer", but it expresses > in one word the most common use of my stuff. > > I'll be releasing a beta for Linux real soon. 
Only some of the > tricks are Windows only (such as self-extracting executables, > which is only culturally appropriate on Windows, anyway). > > But more importantly it's not just for installing. The Python I > use (interactively) on my wife's machine is 1 directory with > about 6 files in it. On my Linux box I've been using the std lib > in a .pyz for about a month now. Someone distributing a pure > Python package could instead ship 3 files (imputil.py, > archive.py and .pyz) with the "install" consisting of > adding one line to site.py in the user's perfectly normal Python > installation. > > And yeah, I solved the "manifest" problem, too. Mine predates > Distutils, so don't accuse me of duplicate effort, (I pointed > them to it a couple times). It uses ConfigParser and a config > file, so it allows finer control. > > While .pyz's are completely cross-platform, I have yet to work > out endianness issues in the other archive I use (which should > probably be zip format - it can hold anything). And at the > "Installer" end, I have yet to work out how things should work > on non-ELF/COFF platforms (where I can't append the archive > to the executable). But there aren't any technical issues > involved; just lack of time. > > So no, it's not just for Windows; and no, it's not just for > creating standalones (though that's what almost everyone > uses it for). Gordon, I'm sorry, but from this description I still have no idea what your stuff is (and I forgot the URL so I can't look it up). For example, if it's not (just) for installing, what *is* it for? What is the ``"manifest" problem'' and how did you solve it? Also, note that editing site.py is a no-no! You can create/edit sitecustomize.py, but you should leave site.py alone! 
--Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm at hypernet.com Wed Dec 8 17:17:03 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Wed, 8 Dec 1999 11:17:03 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <199912081456.JAA00200@eric.cnri.reston.va.us> References: Your message of "Wed, 08 Dec 1999 09:16:07 EST." <1267460464-31845181@hypernet.com> Message-ID: <1267453215-32281635@hypernet.com> Guido, > Gordon, I'm sorry, but from this description I still have no idea > what your stuff is (and I forgot the URL so I can't look it up). http://starship.python.org/crew/gmcm/installer.html The Linux stuff has a couple alpha testers and will probably get announced in a week or two. > For example, if it's not (just) for installing, what *is* it for? At the bottom level, it's a bunch of tools using freeze's modulefinder, imputil.py and 2 kinds of archives. There's at least 2 layers above that, with "Installer" being the top. There's a clean separation between the layers, so you can break in wherever you like. > What is the ``"manifest" problem'' and how did you solve it? The problem is specifying a set of resources, hopefully without having to list them explicitly. I solve this with a config file that lets you specify packages, directories, directory trees.. with filters that can work from paths, names, extensions, regular expressions... > Also, note that editing site.py is a no-no! You can create/edit > sitecustomize.py, but you should leave site.py alone! That would work fine. One of the standalone configurations will write a site.py, but that's for a completely self-contained installation (ie, one which will have no conflicts with another Python installation). I'd also note that, for Windows at least, the path-expanding mechanism created by site.py has not caught on. I've got lots installed, and no site-python, site-packages or sitecustomize. 
- Gordon From guido at CNRI.Reston.VA.US Wed Dec 8 17:23:34 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 08 Dec 1999 11:23:34 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Wed, 08 Dec 1999 11:17:03 EST." <1267453215-32281635@hypernet.com> References: Your message of "Wed, 08 Dec 1999 09:16:07 EST." <1267460464-31845181@hypernet.com> <1267453215-32281635@hypernet.com> Message-ID: <199912081623.LAA04119@eric.cnri.reston.va.us> [me] > > Also, note that editing site.py is a no-no! You can create/edit > > sitecustomize.py, but you should leave site.py alone! [Gordon] > That would work fine. One of the standalone configurations will > write a site.py, but that's for a completely self-contained > installation (ie, one which will have no conflicts with another > Python installation). > > I'd also note that, for Windows at least, the path-expanding > mechanism created by site.py has not caught on. I've got lots > installed, and no site-python, site-packages or sitecustomize. You shouldn't see site-python or site-packages, they only exist on Unix. On Windows, everything is installed in the top Python directory. However you should see .pth files there, which is what site.py looks for. I believe NumPy and PIL use those. --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm at hypernet.com Wed Dec 8 17:55:51 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Wed, 8 Dec 1999 11:55:51 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <199912081623.LAA04119@eric.cnri.reston.va.us> References: Your message of "Wed, 08 Dec 1999 11:17:03 EST." <1267453215-32281635@hypernet.com> Message-ID: <1267450887-32421651@hypernet.com> > [Gordon] > > That would work fine. 
One of the standalone configurations will > > write a site.py, but that's for a completely self-contained > > installation (ie, one which will have no conflicts with another > > Python installation). > > > > I'd also note that, for Windows at least, the path-expanding > > mechanism created by site.py has not caught on. I've got lots > > installed, and no site-python, site-packages or sitecustomize. [Guido] > You shouldn't see site-python or site-packages, they only exist > on Unix. You mean "they only exist _for_ Unix", (site.py looks for them on Windows). I don't like that. For one thing, modulo a few platform differences, the same mechanism should work for multi-user Unix and Windows LAN installations. And single- user Windows (I know, redundant, even on NT) should be a degenerate case of the above. > On Windows, everything is installed in the top Python > directory. However you should see .pth files there, which is > what site.py looks for. I believe NumPy and PIL use those. No NumPy, no PIL, no .pth files. 99% of everything out there just says "unzip this somewhere on your Python path". In this case, Jim Ahlstrom may be right - there are too many options, or at least an insufficiently emphasized "proper" method. Until I worked out my own way of installing stuff, I used to lose a large number of packages whenever I upgraded my Windows Python. Much as I love Mark's stuff (and hesitate to criticize crazy Aussies), I wish there weren't so much special casing here for Windows. And no, I don't have any solutions to this, I'm just griping... - Gordon From guido at CNRI.Reston.VA.US Wed Dec 8 18:07:30 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 08 Dec 1999 12:07:30 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Wed, 08 Dec 1999 11:55:51 EST." <1267450887-32421651@hypernet.com> References: Your message of "Wed, 08 Dec 1999 11:17:03 EST." 
<1267453215-32281635@hypernet.com> <1267450887-32421651@hypernet.com> Message-ID: <199912081707.MAA04242@eric.cnri.reston.va.us> > [Guido] > > You shouldn't see site-python or site-packages, they only exist > > on Unix. [Gordon] > You mean "they only exist _for_ Unix", (site.py looks for them > on Windows). No it doesn't. The code in site.py only adds site-packages and site-python when os.sep is '/'. RTSL. > I don't like that. For one thing, modulo a few > platform differences, the same mechanism should work for > multi-user Unix and Windows LAN installations. And single- > user Windows (I know, redundant, even on NT) should be a > degenerate case of the above. What do you mean by "the same mechanism should work"? The same mechanism for what? Are you talking about sharing the installed files somehow? > > On Windows, everything is installed in the top Python > > directory. However you should see .pth files there, which is > > what site.py looks for. I believe NumPy and PIL use those. > > No NumPy, no PIL, no .pth files. 99% of everything out there > just says "unzip this somewhere on your Python path". Fair enough. Of course I know about .pth files so I unzipped them elsewhere and added a .pth file pointing there... > In this case, Jim Ahlstrom may be right - there are too many > options, or at least an insufficiently emphasized "proper" > method. Until I worked out my own way of installing stuff, I > used to lose a large number of packages whenever I upgraded > my Windows Python. The .pth files are designed for this. Maybe they haven't been explained as well as they should. > Much as I love Mark's stuff (and hesitate to criticize crazy > Aussies), I wish there weren't so much special casing here for > Windows. It's not Mark's fault, it's Microsoft's fault. If you don't do things the way MS wants you to, experienced Windows users will gripe, misunderstand what you do, etc. > And no, I don't have any solutions to this, I'm just griping... Ditto. 
Understanding the problems is half of the solution though. The problems seem pretty complex! --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm at hypernet.com Wed Dec 8 19:25:50 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Wed, 8 Dec 1999 13:25:50 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <199912081707.MAA04242@eric.cnri.reston.va.us> References: Your message of "Wed, 08 Dec 1999 11:55:51 EST." <1267450887-32421651@hypernet.com> Message-ID: <1267445488-32746429@hypernet.com> [Guido] > No it doesn't. The code in site.py only adds site-packages and > site-python when os.sep is '/'. RTSL. Oops. Missed that. > > I don't like that. For one thing, modulo a few > > platform differences, the same mechanism should work for > > multi-user Unix and Windows LAN installations. And single- user > > Windows (I know, redundant, even on NT) should be a degenerate > > case of the above. > > What do you mean by "the same mechanism should work"? The same > mechanism for what? Are you talking about sharing the installed > files somehow? In the above, "mechanism" basically meant that which creates sys.path. Basically, this came up for me because in standalone configurations (my Installer again), I have to take complete control of sys.path. After doing so differently on Windows and Linux, I finally realized that I can do it the same way on both. Which makes me question why they are so different. > The .pth files are designed for this. Maybe they haven't been > explained as well as they should. I'd say "badgered" or "browbeaten" instead of "explained" ;-). > > Much as I love Mark's stuff (and hesitate to criticize crazy > > Aussies), I wish there weren't so much special casing here for > > Windows. > > It's not Mark's fault, it's Microsoft's fault. If you don't do > things the way MS wants you to, experienced Windows users will > gripe, misunderstand what you do, etc. 
Even MS doesn't do things the way MS says they want you to. I find MS users equally divided between those who scream bloody murder if you touch the registry, and those who scream if you don't. It's not like *nixen suffer from an excessive degree of conformity in preferred installation procedures, but somehow Python survives there... > > And no, I don't have any solutions to this, I'm just griping... > > Ditto. Understanding the problems is half of the solution > though. The problems seem pretty complex! Grumpily agreed ;-). - Gordon From jim at interet.com Wed Dec 8 19:33:51 1999 From: jim at interet.com (James C. Ahlstrom) Date: Wed, 08 Dec 1999 13:33:51 -0500 Subject: [Python-Dev] Linux Journal confirms evil rumor References: Your message of "Wed, 08 Dec 1999 11:17:03 EST." <1267453215-32281635@hypernet.com> <1267450887-32421651@hypernet.com> <199912081707.MAA04242@eric.cnri.reston.va.us> Message-ID: <384EA48F.F5190180@interet.com> I finally got around to reading the current Linux Journal (which just keeps getting better and better) and lo! there was a picture of a familiar face I just couldn't quite.... Oh no! Could it be true? I heard rumors but I refused to believe them until now. The glasses are gone! Guido now looks like an investment banker! The sky is falling! Next will probably be a Python 1.6 as a 27 Meg DLL, and a Python IPO. Well, maybe not. Now that I look more closely, he is wearing a black and white and mustard (??MUSTARD) T-shirt which says "You Need Python". At least we ought to make him wear a name tag at IPC8. JimA From fdrake at acm.org Wed Dec 8 19:37:44 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Wed, 8 Dec 1999 13:37:44 -0500 (EST) Subject: [Python-Dev] Linux Journal confirms evil rumor In-Reply-To: <384EA48F.F5190180@interet.com> References: <1267453215-32281635@hypernet.com> <1267450887-32421651@hypernet.com> <199912081707.MAA04242@eric.cnri.reston.va.us> <384EA48F.F5190180@interet.com> Message-ID: <14414.42360.309237.967766@weyr.cnri.reston.va.us> James C. Ahlstrom writes: > Oh no! Could it be true? I heard rumors but I refused to > believe them until now. The glasses are gone! Guido now > looks like an investment banker! The sky is falling! I'm afraid this non-distinctive look was introduced at IPC7... it's too bad we can't tell people Python was invented by the guy with the glasses anymore. > Next will probably be a Python 1.6 as a 27 Meg DLL, and > a Python IPO. Well, maybe not. Now that I look more > closely, he is wearing a black and white and mustard > (??MUSTARD) T-shirt which says "You Need Python". It's really the blue & white & orange IPC7 shirt. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From bwarsaw at cnri.reston.va.us Wed Dec 8 19:41:51 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Wed, 8 Dec 1999 13:41:51 -0500 (EST) Subject: [Python-Dev] Linux Journal confirms evil rumor References: <1267453215-32281635@hypernet.com> <1267450887-32421651@hypernet.com> <199912081707.MAA04242@eric.cnri.reston.va.us> <384EA48F.F5190180@interet.com> Message-ID: <14414.42607.701538.783684@anthem.cnri.reston.va.us> >>>>> "JCA" == James C Ahlstrom writes: JCA> Oh no! Could it be true? I heard rumors but I refused to JCA> believe them until now. The glasses are gone! Guido now JCA> looks like an investment banker! The sky is falling! He's not the only one who's, like, "gone corporate", but I won't mention any names, so as to protect the guilty. 
From jim at digicool.com Wed Dec 8 20:03:42 1999 From: jim at digicool.com (Jim Fulton) Date: Wed, 08 Dec 1999 14:03:42 -0500 Subject: [Python-Dev] Linux Journal confirms evil rumor References: <1267453215-32281635@hypernet.com> <1267450887-32421651@hypernet.com> <199912081707.MAA04242@eric.cnri.reston.va.us> <384EA48F.F5190180@interet.com> <14414.42607.701538.783684@anthem.cnri.reston.va.us> Message-ID: <384EAB8E.EBA595B5@digicool.com> "Barry A. Warsaw" wrote: > > He's not the only one who's, like, "gone corporate", but I won't > mention any names, so as to protect the guilty. OK, Buzz. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list without my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From tim_one at email.msn.com Thu Dec 9 06:31:52 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 9 Dec 1999 00:31:52 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <199912081207.HAA00040@eric.cnri.reston.va.us> Message-ID: <000301bf4206$b39e5b80$36a2143f@tim> [Guido] > [Great analysis, Tim!] I beg to differ: it's internally inconsistent and should have identified at least 3 axes and hence at least 8 cases. Still, you got more than you paid for. >> 4) The audience is Python end-users "in general", and the >> product is pure Python. I think this is the most important one >> for Distutils to address, and compilation isn't a part of it. >> So far, though, what Gordon is doing seems more appropriate >> than what Distutils has been up to. I hope his work gets folded >> into this. > I'm not sure what stuff by which Gordon you're referring to. You guessed right!
> I am only familiar with his installer, which I thought is win32 > only (but I may be mistaken) and is an installer for a whole > application, not just a bunch of modules. Please correct me if > I'm wrong. If it can install a whole app, what makes you suspect it couldn't install just a bunch of modules <0.5 wink>? It started life as Windows-only, and I believe it's been virtually ignored by non-Windows folk because of that. Bad blind spot. It supplies already-working approaches to many of the issues that are still being *talked* about on Distutils (at least archive formats, code to manipulate same, manifest files (how do you tell the tool which files to package?), and transparently bundling a Python interpreter when needed). > But this reminds me of a different issue, which Jim Ahlstrom has > been hammering about before: there's a completely separate set of > cases where what you are distributing is a stand-alone application, > and the target consists of end users who are entirely uninterested > in whether it's written in Python, C or Elvish. I include part of that in my case #4 above, where the app happens to be written in Pure Python -- but the user doesn't have to know that. Gordon is addressing at least that part of it. AFAIK he can't deal with transparently compiling C or exorcising Elvish on the target platform, but if you're just distributing the binaries I expect his work is directly usable already. > (And then there's still the distinction between Win32, Unix or > both.) I vote "both". The world really doesn't need another Win32-only (or Unix-only) installer, archive format, compression format, or distribution model. Jim seems mostly interested in Win32-only to me, and his concerns haven't been about the mechanics of distribution but about how-- regardless of tool --to create a bulletproof Python installation by hook or by crook. 
Last time we went thru this, it was concluded that one couldn't without patching the Python Windows binary with a resource editor (to point to its own infernal <0.5 wink> registry entries). Distutils hasn't talked about that at all (that I've seen, anyway); if there were a less radical approach to that, I suspect Jim would be delighted to use one of the commercial Win32 installation pkgs (and if that's what his customers expect, delighted or not that's what he'll do). > The current distutils tools don't deal with this at all. That's why I said I thought what Gordon is doing seems more appropriate to case #4 than what Distutils has been doing. > I think it should though, Ditto. > and I think its framework is powerful enough to be able to > add this, e.g. as a new "appdist" command. I cordially invite (since Gordon will uncordially browbeat) people to look seriously at what he's done. Best I can tell, for apps that don't need compilation "on the other end", it's mostly "there" already! give-the-man-a-hand-ly y'rs - tim From tim_one at email.msn.com Thu Dec 9 06:52:23 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 9 Dec 1999 00:52:23 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <1267453215-32281635@hypernet.com> Message-ID: <000601bf4209$90a90c80$36a2143f@tim> > http://starship.python.org/crew/gmcm/installer.html Eh? Doesn't work for me. This does: http://starship.python.net/crew/gmcm/distribute.html From tim_one at email.msn.com Thu Dec 9 07:38:54 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 9 Dec 1999 01:38:54 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <199912081707.MAA04242@eric.cnri.reston.va.us> Message-ID: <000701bf4210$10925a40$36a2143f@tim> [Gordon] >> Much as I love Mark's stuff (and hesitate to criticize crazy >> Aussies), I wish there weren't so much special casing here for >> Windows.
[Guido] > It's not Mark's fault, it's Microsoft's fault. If you don't do > things the way MS wants you to, experienced Windows users will > gripe, misunderstand what you do, etc. Something just occurred to me: MS's guidelines aren't arbitrary, they actually have very good reasons. In the case of putting all an app's crucial info in the Registry, it's the only way to allow a site administrator to set policy and site options remotely (an admin can fiddle other machines' registries remotely). This works very well indeed when there's only "one copy" of an app on a machine (or at most one copy "per user"). What just occurred to me is that JimA is concerned with *not* letting any info from a previously-installed Python affect the app he's installing. Similarly, Gordon's Win32 "standalone installer" modifies python.exe and pythonw.exe to use a PYTHONPATH he forces, leaving the registry out of it. Similarly, the woes I've had in trying to sell Python as a general Win32 scripting tool at work mostly boil down to that there's no effortless way to do it that doesn't risk picking up info from-- or forcing info onto --pre-existing or future distinct Python installations (in contrast, Perl "just works" in this respect). IOW, the three of us find getting path info out of the registry intolerable because we are in fact trying to do the opposite of what the registry mechanism was *designed* for: we want perfect isolation, not perfect sharing. This has come up on Python-Help a few times too, in the guise of someone installing a product that in turn installs an older version of Python, which in turn confuses another product that relies on features in a newer version of Python. So while the traditional Windows .ini file (like Unix this-or-that.rc file) model was replaced by the registry for excellent reasons, those reasons don't apply to the way we're using Python! The .ini file model was exactly right for what most of us seem to want to do, and the registry model is exactly wrong. 
just-thought-i'd-cheer-you-up-ly y'rs - tim From skip at mojam.com Thu Dec 9 08:38:36 1999 From: skip at mojam.com (Skip Montanaro) Date: Thu, 9 Dec 1999 01:38:36 -0600 (CST) Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <000701bf4210$10925a40$36a2143f@tim> References: <199912081707.MAA04242@eric.cnri.reston.va.us> <000701bf4210$10925a40$36a2143f@tim> Message-ID: <14415.23676.775163.786028@dolphin.mojam.com> Tim> So while the traditional Windows .ini file (like Unix Tim> this-or-that.rc file) model was replaced by the registry for Tim> excellent reasons, those reasons don't apply to the way we're using Tim> Python! The .ini file model was exactly right for what most of us Tim> seem to want to do, and the registry model is exactly wrong. Alright! Now I understand what all the hubbub is about! My eyes have mostly been glazing over trying to follow all this Windows registry/path/ini stuff. MS believes that Python is the application. Those of us writing Python programs view those programs as the applications, not the Python interpreter per se. Is there some way that people writing applications in Python can set up registry entries that are specific to their application (e.g. tabnanny.py) instead of only specific to the Python interpreter? Skip Montanaro | http://www.mojam.com/ skip at mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From gmcm at hypernet.com Thu Dec 9 15:17:27 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Thu, 9 Dec 1999 09:17:27 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <000701bf4210$10925a40$36a2143f@tim> References: <199912081707.MAA04242@eric.cnri.reston.va.us> Message-ID: <1267374045-37047016@hypernet.com> [Guido] > > It's not Mark's fault, it's Microsoft's fault. If you don't do > > things the way MS wants you to, experienced Windows users will > > gripe, misunderstand what you do, etc. 
[Tim] > Something just occurred to me: MS's guidelines aren't arbitrary, > they actually have very good reasons. In the case of putting all > an app's crucial info in the Registry, it's the only way to allow > a site administrator to set policy and site options remotely (an > admin can fiddle other machines' registries remotely). This > works very well indeed when there's only "one copy" of an app on > a machine (or at most one copy "per user"). And actually, the business about separate subtrees for the machine's configuration and the user's configuration is pretty clever. MS doesn't explain it well, and it gets misused, but when done right, it's a lot simpler than the maze of .xxxrc files you sometimes find in other OSes. > What just occurred to me is that JimA is concerned with *not* > letting any info from a previously-installed Python affect the > app he's installing. Similarly, Gordon's Win32 "standalone > installer" modifies python.exe and pythonw.exe to use a > PYTHONPATH he forces, leaving the registry out of it. Similarly, > the woes I've had in trying to sell Python as a general Win32 > scripting tool at work mostly boil down to that there's no > effortless way to do it that doesn't risk picking up info from-- > or forcing info onto --pre-existing or future distinct Python > installations (in contrast, Perl "just works" in this respect). In my Linux version, I went to the heart of the matter - getpath.c. It occurs to me that getpath.c might do better to follow a normal bootstrap process - ie, create the absolute minimal sys.path required to go to the next step. Then the rest of what goes on in getpath.c could be written in Python. Maybe that Python code needs to get frozen in (to prevent bozos from destroying an installation by stepping on getpath.py), but it would make it a lot easier to create independent installations, and also reduce the variations between platforms at the C level. (Then again, I've never heard of anyone stepping on exceptions.py.) 
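The two-stage bootstrap suggested above might look roughly like this. It is only a sketch: `minimal_bootstrap_path` stands in for the little that the C code in getpath.c would still have to do, and `extend_path` stands in for a hypothetical getpath.py. The directory names ("lib", "site-packages", "plat") are assumptions chosen for the demo.

```python
import os
import tempfile


def minimal_bootstrap_path(executable):
    """Stage 1: the single entry the C startup code would supply."""
    home = os.path.dirname(os.path.abspath(executable))
    return [os.path.join(home, "lib")]


def extend_path(bootstrap):
    """Stage 2: plain Python; append optional directories that exist."""
    full = list(bootstrap)
    for base in bootstrap:
        for extra in ("site-packages", "plat"):
            candidate = os.path.join(base, extra)
            if os.path.isdir(candidate):
                full.append(candidate)
    return full


# Demo against a scratch "installation" containing lib/site-packages only.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "lib", "site-packages"))
search_path = extend_path(minimal_bootstrap_path(os.path.join(root, "python")))
print(search_path)
```

Because stage 2 is ordinary Python, an independent installation only has to ship a different version of it rather than patch platform-specific C.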
If some registry manipulation primitives were exposed (say, through ntpath) that would mean that Windows developers could (if they wanted) play by the MS rules with at least the option of not stepping on each other. - Gordon From jim at interet.com Thu Dec 9 16:02:18 1999 From: jim at interet.com (James C. Ahlstrom) Date: Thu, 09 Dec 1999 10:02:18 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000301bf4206$b39e5b80$36a2143f@tim> Message-ID: <384FC47A.BB4DA517@interet.com> Tim Peters wrote: > Jim seems mostly interested in Win32-only to me, and his concerns haven't > been about the mechanics of distribution but about how-- regardless of > tool --to create a bulletproof Python installation by hook or by crook. Not exactly. I am interested in how to create a bullet-proof installation. But I am equally interested in Unix (especially Linux) and dislike the current dichotomy in the code base. Lately I have been more active in distribution via archive files. Part of the solution is an archive file format which is identical on Unix and Windows, and which can hold the Python library and packages as single files. For my own efforts on this see: ftp://ftp.interet.com/pub/pylib.html This is an archive file format similar to Gordon's format, although Gordon's work goes well beyond just file formats. I currently have fifth generation code for this format, and am adding features as suggested by Fredrik Lundh. I hope it gets considered as a candidate for a Python standard format. > Distutils hasn't talked about that at all (that I've seen, anyway); Gordon, Greg Stein and I have discussed file formats before. I think it was on distutils. Anyway that was months ago.
JimA From guido at CNRI.Reston.VA.US Thu Dec 9 17:17:18 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 11:17:18 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Thu, 09 Dec 1999 09:17:27 EST." <1267374045-37047016@hypernet.com> References: <199912081707.MAA04242@eric.cnri.reston.va.us> <1267374045-37047016@hypernet.com> Message-ID: <199912091617.LAA05742@eric.cnri.reston.va.us> > [Guido] > > > It's not Mark's fault, it's Microsoft's fault. If you don't do > > > things the way MS wants you to, experienced Windows users will > > > gripe, misunderstand what you do, etc. > [Tim] > > Something just occurred to me: MS's guidelines aren't arbitrary, > > they actually have very good reasons. In the case of putting all > > an app's crucial info in the Registry, it's the only way to allow > > a site administrator to set policy and site options remotely (an > > admin can fiddle other machines' registries remotely). This > > works very well indeed when there's only "one copy" of an app on > > a machine (or at most one copy "per user"). [Gordon] > And actually, the business about separate subtrees for the > machine's configuration and the user's configuration is pretty > clever. MS doesn't explain it well, and it gets misused, but > when done right, it's a lot simpler than the maze of .xxxrc files > you sometimes find in other OSes. I agree. And I am guilty of not even trying to find MS' explanation -- I just looked in the registry at what other apps did and tried to mimic that (plus what Mark had already done), without really knowing what I was doing. I now know a little better -- see the end of this message. > In my Linux version, I went to the heart of the matter - > getpath.c. It occurs to me that getpath.c might do better to > follow a normal bootstrap process - ie, create the absolute > minimal sys.path required to go to the next step.
Then the > rest of what goes on in getpath.c could be written in Python. > Maybe that Python code needs to get frozen in (to prevent > bozos from destroying an installation by stepping on > getpath.py), but it would make it a lot easier to create > independent installations, and also reduce the variations > between platforms at the C level. (Then again, I've never heard > of anyone stepping on exceptions.py.) Yes, this is exactly what was proposed in the thread on the Big Import Rewrite. > If some registry manipulation primitives were exposed (say, > through ntpath) that would mean that Windows developers > could (if they wanted) play by the MS rules with at least the > option of not stepping on each other. That's a good idea. These functions are already available through Mark's win32api extension -- much of which will eventually (I hope before 1.6 is out!) become part of the core distribution. In the mean time, I've been thinking a bit more about how Python should be using the Windows registry. (It's clear to me that Python should use the registry -- those who disagree can go build their own Python distribution.) The basic ideas of Python's current registry usage are sound: there's a resource built into the DLL which is part of the key into the registry used for all information. The problem lies in which key is used. All versions of Python 1.5.x (1.5, 1.5.1, 1.5.2) use the same key! This is a main cause of trouble, because it means that different versions cannot peacefully live together even if the user installs them into different directories -- they will all use the registry keys of the last version installed. This, in turn, means that someone who writes a Python application that has a dependency on a particular Python version (and which application worth distributing doesn't :-) cannot trust that if a Python installation is present, it is the right one. 
But they also cannot simply bundle the standard installer for the correct Python version with their program, because its installation would overwrite an existing Python installation, thus breaking some *other* Python apps that the user might already have installed. (There's a solution for app builders who are willing to do a lot of work -- you can change the registry key resource in the DLL. For example, Alice comes with its own version of Python 1.5.1 and it uses "1.5.1-alice" as its registry key. The Alice installer installs Python in a subdirectory of the Alice installation directory and points the 1.5.1-alice registry entries there. The problem is that this is a lot of work for the average app builder.) I thought a bit about how VB solves this. I think that when you wrap up a VB app in an installer, all the support code (mostly a big DLL) is wrapped with it. When the user runs the installer, the DLL is installed (probably in the WINDOWS directory). If a user installs several VB apps built with the same VB version, they all attempt to install the exact same DLL; of course the installers notice this and optimize it away, keeping a reference count. (Ignoring for now the fact that those reference counts don't always work!) If an app built with a different VB version is installed, it has a DLL with a different name, and that is installed separately. Other support files, I presume, are dealt with in much the same way. Voila, there's the theory. How can we do something similar for Python? An app written in Python should need to install only three or four files: - a driver EXE to start the app - a copy of the Python DLL - the Python library in an archive - the app code in an archive The latter two could be combined into a single archive, but I propose that we use two archives so that the DLL and the Python library archive can be shared between installations of independent Python apps as long as they use the exact same Python version and don't need additional 3rd party packages.
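The "library archive plus app archive" layout can be tried out with zip files and the zipimport support in current Python, which is an assumption of this sketch and not something available in 1.5.2. The archive and module names below (`hypo_pylib.zip`, `hypo_app.zip`, `hypo_shared_lib`, `hypo_app_main`) are hypothetical, and the driver EXE and DLL are outside what the sketch covers.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def make_archive(path, modules):
    """Write a {module_name: source} mapping into a zip archive."""
    with zipfile.ZipFile(path, "w") as zf:
        for name, source in modules.items():
            zf.writestr(name + ".py", source)


workdir = tempfile.mkdtemp()
lib_zip = os.path.join(workdir, "hypo_pylib.zip")  # shareable library archive
app_zip = os.path.join(workdir, "hypo_app.zip")    # per-app code archive

make_archive(lib_zip, {
    "hypo_shared_lib": "def greet(who):\n    return 'hello, ' + who\n",
})
make_archive(app_zip, {
    "hypo_app_main": (
        "import hypo_shared_lib\n"
        "RESULT = hypo_shared_lib.greet('world')\n"
    ),
})

# What a driver would do: put both archives at the front of the search path.
sys.path[:0] = [app_zip, lib_zip]
importlib.invalidate_caches()

import hypo_app_main

print(hypo_app_main.RESULT)
```

Keeping the library in its own archive is what lets several apps built against the same Python version point at one shared copy.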
(I believe that Jim A's proposal combines the archives with the EXE and the DLL, reducing the number of files to two. That's fine too.) Is there a use for the registry here at all? Maybe not. (I notice that VB seems to have a single registry entry, pointing to a DLL; all other VB files also seem to live there.) Complications: - Some apps may need a custom extension module, which has to be installed as a PYD file. So it seems that there needs to be a directory per app, and perhaps per version of the app (if the app distributor cares). - Some apps need other, non-pyc files (e.g. data tables or help files); it would be handy if these could be stored in the archives as well. - Some standard extension modules are in their own PYD files; these also need to be installed. They aren't typically marked with a version, so perhaps a path directory per version of Python (if not per installed app) is wise. - How to distribute an app that needs 3rd party stuff, e.g. Tcl/Tk, or PIL, or NumPy? Their Python code can easily be wrapped up in another archive with a standard name incorporating a version number; but the required PYD and DLL files are a separate story. (E.g. for Tkinter, you need _tkinter.pyd which links against tcl80.dll.) Basically the same solution as for standard PYD files can work; the needed DLL files can be installed either systemwide (if they have a reliable version number in their name, like tcl80.dll) or in the per-app or per-package directory (like NumPy). - Presumably, the archives will contain PYC files only. This means that tracebacks will not show source code, only line numbers. For Jim A, this is probably exactly what he wants (if the user gets a traceback, his "robust app" has miserably failed, and he takes it in pride that this doesn't happen). But for some others, access to the sources could be essential. 
For example, I might want to distribute IDLE using this mechanism; users of IDLE who are curious about the standard library (or about IDLE itself) should be able to open the source for an arbitrary module (and maybe even edit it, although that's not a priority and perhaps should even be discouraged). Library source access is an important feature of the IDLE debugger as well. A way out for IDLE is to install a classic distribution of the Python library sources, into the filesystem at an IDLE specific location. Other apps, with only the need for source code in tracebacks, might choose to have the PY files in the archives sitting next to the PYC files, and somehow the traceback mechanism should be accessing the archive to get a hold of the source. And yes, I realize that Jim A's latest offering solves most of these problems to a large extent -- well done. (Jim, would you care to comment on the issues that you don't address? Will you address them in a future version?) Final notes: There are two different problems here. One is how to distribute Python apps robustly to end users who don't particularly care about Python. This is Jim A's problem (and he has a solution that works for him). In general the solutions here try to isolate the installed app from other Python installations. I'm proposing that at least the DLL and the Python library archive can probably be shared between apps without reducing robustness if we keep track more carefully of version numbers. The other problem is how to distribute packages of Python and extension modules for use by Python users. These typically need to drop into some existing Python installation. This is Paul Dubois' problem with NumPy (amongst others) and is the current focus of the distutil SIG. However I believe that there could be a lot of common infrastructure that would help us create better solutions for both problems. For package distribution, common infrastructure (a.k.a. standards) is essential.
For app distribution, common infrastructure isn't so important (since the solutions strive for total isolation, there's no problem if different apps use different solutions). However, this changes when app creators want to distribute robust self-sufficient apps that use 3rd party packages -- then the 3rd party packages must allow being packaged up using the app distribution creator of choice. Solving this compound problem (creating package distributions that can be redistributed easily as part of robust Python app distributions) should be an important goal for the infrastructure we're building here. The Big Import Rewrite ought to add this to its list of objectives if it isn't already on it. My guess is that the solution for this compound problem will increase the dependency of app distribution tools on the package distribution infrastructure; which to me seems like a Good Thing because it would lead to more code sharing. --Guido van Rossum (home page: http://www.python.org/~guido/) From jim at interet.com Thu Dec 9 17:24:40 1999 From: jim at interet.com (James C. Ahlstrom) Date: Thu, 09 Dec 1999 11:24:40 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000701bf4210$10925a40$36a2143f@tim> Message-ID: <384FD7C8.12832BF1@interet.com> Tim Peters wrote: > Something just occurred to me: MS's guidelines aren't arbitrary, they > actually have very good reasons. In the case of putting all an app's > crucial info in the Registry, it's the only way to allow a site > administrator to set policy and site options remotely (an admin can fiddle > other machines' registries remotely). This works very well indeed when > there's only "one copy" of an app on a machine (or at most one copy "per > user"). The registry is still a bad idea because it lumps critical and app data into single files and brings up the ugly problem of protecting individual registry entries instead of just files.
Microsoft should have put all app config into the app directory and provided for remote admin of that. But that is not really your point (just ranting about the registry again). > IOW, the three of us find getting path info out of the registry intolerable > because we are in fact trying to do the opposite of what the registry > mechanism was *designed* for: we want perfect isolation, not perfect > sharing. > > This has come up on Python-Help a few times too, in the guise of someone > installing a product that in turn installs an older version of Python, which > in turn confuses another product that relies on features in a newer version > of Python. Or, in other words, no isolation is possible if critical info depends on global data like PYTHONPATH or a _common_ registry entry. We could have different registry entries, but this is confusing and not documented. I think we can solve this with archive files in a way compatible with Unix without going off on a Windows-only wavelength. If the archive file contains everything, and it is in the dir of the app, and the app looks there and finds it, then it Just Works. See also my reply to Skip. JimA From akuchlin at mems-exchange.org Thu Dec 9 17:32:08 1999 From: akuchlin at mems-exchange.org (Andrew M. Kuchling) Date: Thu, 9 Dec 1999 11:32:08 -0500 (EST) Subject: [Python-Dev] Missing POSIX functions: the list Message-ID: <199912091632.LAA09236@amarok.cnri.reston.va.us> After poking around in the O'Reilly POSIX book, here's a list of POSIX functions that don't seem to be available in Python. Not all of them seem worth supporting. Ironically, Greg Ward's daemonize() Perl subroutine, which started me on this, doesn't actually seem to need anything that Python doesn't have. I'm looking for corrections to the list; are there other POSIX functions I've missed, or are some of them actually in Python? I think implementing most of these functions is straightforward, with the exception of opendir/readdir/closedir. Worth adding? 
=============

opendir(), readdir(), closedir()
  -- most of their functionality is available through os.listdir(), but
     it might be useful to have a direct interface. Downside is that this
     would require a new extension type for the C DIR struct. My (lazy)
     inclination is to not bother.

Worth adding:
=============

abort()
  -- used in Py_FatalError(), but not accessible to Python code

ctermid(), ctermid_r()
  -- returns the terminal pathname
  -- probably just add ctermid(), but use ctermid_r() for thread-safety

fpathconf(fd, name)
  -- Get configuration limit for a file
  -- would need constants from unistd.h

getlogin()
  -- returns user's login name
  -- could do something similar with pwd.getpwuid( os.getuid() )[0], but
     getlogin() apparently looks in utmp

getgroups(gidsetsize, grouplist)
  -- Gets supplementary group IDs

pathconf(path, name)
  -- Gets config variables for a path
  -- would need constants from unistd.h

sysconf(int name)
  -- Gets system configuration information
  -- would need constants from unistd.h

Not worth adding:
=================

clearerr()
  -- looks like fileobjects call clearerr() before raising errors

cuserid()
  -- returns user's login name
  -- ORA book says "Do not use this function"
  -- removed in 1990 POSIX

difftime
  -- seems only required in C "because no addition properties are
     defined for time_t" (Solaris man page)

tmpfile(), tmpnam()
  -- Create temp file, generate temp filename
  -- Similar functionality available in tempfile.py

mblen(), mbstowcs(), mbtowc(), wcstombs(), wctomb()
  -- Multi-byte character functions:
  -- Don't bother; wait for the Unicode type.

-- 
A.M. Kuchling http://starship.python.net/crew/amk/
I'm sorry I became abusive just now ... calling you worms... I was just
speaking relatively, you understand.
    -- Dekko, in ZOT!
#3 From jcw at equi4.com Thu Dec 9 17:38:13 1999 From: jcw at equi4.com (Jean-Claude Wippler) Date: Thu, 09 Dec 1999 17:38:13 +0100 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> Message-ID: <384FDAF5.C25C447C@equi4.com> "James C. Ahlstrom" wrote: [...] > ftp://ftp.interet.com/pub/pylib.html Ouch - what's wrong with zip archives? There are utilities to convert to/from zip, to re-pack, to mount zip transparently so it's entries look like regular files, FTP servers, etc. Both Java (jar) and Tcl (Jan Nijtman's "Wrap") have adopted this format. Zips would seem natural with JPython. And suppose that scripting ever starts to consolidate to a common scripting kernel (yah, well), do you really want a system which is closing all doors to cross-fertilization? Zip has an advantage over .tar.gz in that its table of contents is available without having to decompress the whole kaboodle. Your format has no checksum, which for deployment and long-term storage can be important. If you want a marshalled TOC, then why not add a manifest entry for it, sort of like what ranlib does with ar? You designed the format so archives can be concatenated without any tool (other than "cat"), but this works just as well with zip files, as the Tcl Wrap approach demonstrates. Allow me to very, very loosely paraphrase Guido here: sure, everyone can design an archive format, but they are likely to make the same mistakes all over again - so why not adopt a format which is tried and tested? With all due respect - I sincerely hope you will reconsider and alter your code to work with zip files. It's probably a small adjustment? Unless your *intent* is to create a diverging standard, of course... -- Jean-Claude From jim at interet.com Thu Dec 9 17:46:35 1999 From: jim at interet.com (James C. 
Ahlstrom) Date: Thu, 09 Dec 1999 11:46:35 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <199912081707.MAA04242@eric.cnri.reston.va.us> <000701bf4210$10925a40$36a2143f@tim> <14415.23676.775163.786028@dolphin.mojam.com> Message-ID: <384FDCEB.2226C1C1@interet.com> Skip Montanaro wrote: > MS believes that Python is the application. Those of us writing > Python programs view those programs as the applications, not the Python > interpreter per se. I think this is a good point. Windows app programmers (mostly) view Python as part of their app and try to install it in their app directory. Unix installs Python as a system app in multiple versions and users use PATH to pick a version. Unix users view the Python interpreter as a system service which is needed for running their app. I think this is because a Windows app is a visual program, and the Python release compiles to a console app (not really a visual program). So all (?most) Windows Python apps are custom mains with Python as a component, but the stock python.exe is not the main. This makes it difficult to document a way to install Python in the Unix fashion, since all apps need their own binary main and python15.dll is the only thing in common. IMHO archive files can solve this a lot more simply. JimA From guido at CNRI.Reston.VA.US Thu Dec 9 17:55:40 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 11:55:40 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Thu, 09 Dec 1999 17:38:13 +0100." <384FDAF5.C25C447C@equi4.com> References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> Message-ID: <199912091655.LAA05928@eric.cnri.reston.va.us> > "James C. Ahlstrom" wrote: > > [...] > > ftp://ftp.interet.com/pub/pylib.html Jean-Claude Wippler replied: > Ouch - what's wrong with zip archives?
> > There are utilities to convert to/from zip, to re-pack, to mount zip > transparently so it's entries look like regular files, FTP servers, etc. > > Both Java (jar) and Tcl (Jan Nijtman's "Wrap") have adopted this format. > > Zips would seem natural with JPython. And suppose that scripting ever > starts to consolidate to a common scripting kernel (yah, well), do you > really want a system which is closing all doors to cross-fertilization? > > Zip has an advantage over .tar.gz in that its table of contents is > available without having to decompress the whole kaboodle. > > Your format has no checksum, which for deployment and long-term storage > can be important. > > If you want a marshalled TOC, then why not add a manifest entry for it, > sort of like what ranlib does with ar? > > You designed the format so archives can be concatenated without any tool > (other than "cat"), but this works just as well with zip files, as the > Tcl Wrap approach demonstrates. > > Allow me to very, very loosely paraphrase Guido here: sure, everyone can > design an archive format, but they are likely to make the same mistakes > all over again - so why not adopt a format which is tried and tested? > > With all due respect - I sincerely hope you will reconsider and alter > your code to work with zip files. It's probably a small adjustment? > > Unless your *intent* is to create a diverging standard, of course... Exactly my sentiments. We have rough Python code to deal with zip files; it's very rough because we got kind of carried away adding features and ended up with spaghetti code :-( But it's working code nevertheless and we're offering it up for anyone in this group to clean up (we could do that ourselves but it's not high on our current priority list). I don't know anything about Tcl Wrap. I do know a great deal about the ZIP format, but apparently I missed the concatenation feature. How does this work? Does that work for all zip tools, or just for the ZIP reader in Wrap? 
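On the concatenation question: a zip file's end-of-central-directory record sits at the end of the file, so a reader that locates it from the end can tolerate arbitrary bytes in front of the archive. The sketch below checks this with the `zipfile` module in current Python (an assumption; it says nothing about Tcl Wrap or about every zip tool). The "stub" bytes are a made-up stand-in for an executable.

```python
import io
import zipfile

# Build a small zip archive in memory.
payload = io.BytesIO()
with zipfile.ZipFile(payload, "w") as zf:
    zf.writestr("hello.py", "print('hello from the archive')\n")

# Equivalent of `cat stub archive.zip > combined`: junk first, zip last.
stub = b"\x00FAKE-EXECUTABLE-STUB" * 64
combined = stub + payload.getvalue()

# The reader finds the central directory from the end of the data and
# compensates for the bytes prepended in front of the archive.
with zipfile.ZipFile(io.BytesIO(combined)) as zf:
    names = zf.namelist()
    source = zf.read("hello.py").decode("ascii")

print(names)
print(source)
```

Note that this covers only "something, then one zip". Concatenating two zips this way leaves a reader looking at the last archive's directory, so the members of the first are not listed.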
(I looked up how Jim A does it -- his central directory at the end of the file contains the total size of the data covered by that directory, so he seeks back to the beginning of it and sees if another magic number precedes it; and so on. Very simple.) I quickly looked at the Wrap page; it shows how to access data files stored in the archive. Question: does the wrap::open code go out to the regular filesystem if it finds there's no wrap archive? That would be handy so you can test the code in its unwrapped form without change. Python needs this too. --Guido van Rossum (home page: http://www.python.org/~guido/) From gward at cnri.reston.va.us Thu Dec 9 18:12:00 1999 From: gward at cnri.reston.va.us (Greg Ward) Date: Thu, 9 Dec 1999 12:12:00 -0500 Subject: [Python-Dev] Missing POSIX functions: the list In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Thu, Dec 09, 1999 at 11:32:08AM -0500 References: <199912091632.LAA09236@amarok.cnri.reston.va.us> Message-ID: <19991209121159.B20179@cnri.reston.va.us> On 09 December 1999, Andrew M. Kuchling said: > After poking around in the O'Reilly POSIX book, here's a list of POSIX > functions that don't seem to be available in Python. Not all of them > seem worth supporting. Ironically, Greg Ward's daemonize() Perl > subroutine, which started me on this, doesn't actually seem to need > anything that Python doesn't have. I think I already pointed this your way, but don't forget the man page for Perl's POSIX module: "perldoc POSIX". I suspect POSIX functions that don't make sense in Perl also don't make sense in Python. I agree with all your assessments about what's worth adding and what's not, and that {close,read,open}dir() are questionable and probably not worth the bother. Random thoughts: > abort() -- used in Py_FatalError(), but not accessible to Python code Would this do the same as in C, ie. terminate the process and dump core? 
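For what an exposed abort() does in practice: current Python provides it as `os.abort()`, and on POSIX systems it kills the process with SIGABRT (whether a core file is actually written depends on the system's core-dump settings). A small sketch that runs it in a child process so the caller survives; the POSIX behaviour is the assumption here, and the exit status differs on Windows.

```python
import os
import signal
import subprocess
import sys

# Run os.abort() in a child interpreter so this process keeps going.
child = subprocess.run(
    [sys.executable, "-c", "import os; os.abort()"],
    stderr=subprocess.DEVNULL,
)

# On POSIX, subprocess reports death-by-signal as a negative return code.
died_from_sigabrt = child.returncode == -signal.SIGABRT
print("return code:", child.returncode)
print("killed by SIGABRT:", died_from_sigabrt)
```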
> getlogin() -- returns user's login name > -- could do something similar with pwd.getpwuid( os.getuid() )[0], but > getlogin() apparently looks in utmp With a documentation proviso that utmp is very old-fashioned, and you really should do the getuid() thing unless you definitely want to get the login ID from utmp. Perhaps an alternate "getlogin" (different name?) that does the getuid() thing could be provided. Greg From guido at CNRI.Reston.VA.US Thu Dec 9 18:16:03 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 12:16:03 -0500 Subject: [Python-Dev] Missing POSIX functions: the list In-Reply-To: Your message of "Thu, 09 Dec 1999 12:12:00 EST." <19991209121159.B20179@cnri.reston.va.us> References: <199912091632.LAA09236@amarok.cnri.reston.va.us> <19991209121159.B20179@cnri.reston.va.us> Message-ID: <199912091716.MAA06063@eric.cnri.reston.va.us> > > getlogin() -- returns user's login name > > -- could do something similar with pwd.getpwuid( os.getuid() )[0], but > > getlogin() apparently looks in utmp > > With a documentation proviso that utmp is very old-fashioned, and you > really should do the getuid() thing unless you definitely want to get > the login ID from utmp. Perhaps an alternate "getlogin" (different > name?) that does the getuid() thing could be provided. There's the getpass module which has a getuser() function that looks in various env vars and if all else fails uses getuid() and pwd. If the goal is to get the user ID without being fooled, using os.getuid() or os.geteuid() directly seems to be the right thing to do; I don't see the need for a shorthand for pwd.getpwuid(os.getuid())[0] (which is what getuser() uses). 
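The three ways of getting a user name discussed above can be compared side by side. This is a sketch using functions that exist in current Python (`os.getlogin()` was added after this thread); each lookup is wrapped because any of them can fail, for example when there is no controlling terminal or no passwd entry for the uid. The helper names are made up for the demo.

```python
import getpass
import os


def name_from_environment_or_passwd():
    """getpass.getuser(): environment variables first, then the passwd db."""
    try:
        return getpass.getuser()
    except (KeyError, OSError, ImportError, AttributeError):
        return None


def name_from_passwd():
    """pwd.getpwuid(os.getuid())[0]: cannot be fooled by the environment."""
    try:
        import pwd  # Unix only
        return pwd.getpwuid(os.getuid())[0]
    except (ImportError, KeyError, AttributeError):
        return None


def name_from_login_session():
    """os.getlogin(): the name tied to the controlling terminal."""
    try:
        return os.getlogin()
    except (OSError, AttributeError):
        return None


lookups = {
    "getpass.getuser()": name_from_environment_or_passwd(),
    "pwd.getpwuid(os.getuid())[0]": name_from_passwd(),
    "os.getlogin()": name_from_login_session(),
}
for label, value in lookups.items():
    print("%-30s %r" % (label, value))
```

The three can legitimately disagree, e.g. after `su` or when LOGNAME is set by hand, which is why the choice depends on whether the goal is convenience or not being fooled.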
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido at CNRI.Reston.VA.US Thu Dec 9 18:18:10 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 12:18:10 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Thu, 09 Dec 1999 10:02:18 EST." <384FC47A.BB4DA517@interet.com> References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> Message-ID: <199912091718.MAA06087@eric.cnri.reston.va.us> [Jim A] > Lately I have been more active in distribution via archive files. > Part of the solution is an archive file format which is identical on > Unix and Windows, and which can hold the Python library and packages > as single files. For my own efforts on this see: > > ftp://ftp.interet.com/pub/pylib.html Apart from agreeing with Jean-Claude's rant about inventing a new archive format, I think this is a good proposal because it is very clear about the problem it tries to solve and doesn't get distracted by other issues. I also commend Jim for building upon Greg Stein's imputil (like Gordon did). I wish I could present a solution this simple as The Standard Way, but (as explained in my long post earlier today) there just are so many wrinkles that I'd rather hold out for the Right Solution... But I've taken good notice of Jim's solution. --Guido van Rossum (home page: http://www.python.org/~guido/) From beazley at cs.uchicago.edu Thu Dec 9 18:16:57 1999 From: beazley at cs.uchicago.edu (David Beazley) Date: Thu, 9 Dec 1999 11:16:57 -0600 (CST) Subject: [Python-Dev] Missing POSIX functions: the list References: <199912091632.LAA09236@amarok.cnri.reston.va.us> <19991209121159.B20179@cnri.reston.va.us> Message-ID: <199912091716.LAA15624@gargoyle.cs.uchicago.edu> Greg Ward writes: > > I think I already pointed this your way, but don't forget the man page > for Perl's POSIX module: "perldoc POSIX". 
I suspect POSIX functions > that don't make sense in Perl also don't make sense in Python. > > I agree with all your assessments about what's worth adding and what's > not, and that {close,read,open}dir() are questionable and probably not > worth the bother. Random thoughts: > I disagree. I think that the POSIX module should strive to be as complete as possible--even if certain functions are closely related other functionality in the library (tmpfile for instance). I suspect that this sort of thing is probably the cause of the missing functionality in the current library (as in, "why would anyone want to do that?" when in fact there may be a perfectly good reason in certain situations). > > abort() -- used in Py_FatalError(), but not accessible to Python code > > Would this do the same as in C, ie. terminate the process and dump core? > Sure, why not? This might be a useful thing to do every so often---when trying to figure out what's wrong with a C extension module for instance. Cheers, Dave From jim at interet.com Thu Dec 9 18:43:57 1999 From: jim at interet.com (James C. Ahlstrom) Date: Thu, 09 Dec 1999 12:43:57 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> Message-ID: <384FEA5D.A07F23EC@interet.com> Jean-Claude Wippler wrote: > Ouch - what's wrong with zip archives? Thanks very much for looking over the format. In general Zip archives store whole branches of a file system. A Python ./Lib zip archive would contain: N:/python/Python-1.5.2/Lib/string.pyc N:/python/Python-1.5.2/Lib/os.pyc N:/python/Python-1.5.2/Lib/copy.pyc N:/python/Python-1.5.2/Lib/test/testall.pyc Zip archives are isomorphic to branches of a file system. That means there must be a sys.path for each zip archive file. How would this be specified? The archive format stores modules as dotted names, just as they appear in the import statement. 
The search path is "." in every archive file by definition. The import statement "import foo" just results in a dictionary lookup for key "foo", not a search through a zip directory along a local search path for "foo.something" where "something" can be pyc, pyo, py, etc. The intent was to link the archives to the import statement, not re-create a directory tree. It borrowed this feature from the archive formats of Greg and Gordon. > There are utilities to convert to/from zip, to re-pack, to mount zip > transparently so it's entries look like regular files, FTP servers, etc. Basic operations (to, from, repack) are easy in Python. > Both Java (jar) and Tcl (Jan Nijtman's "Wrap") have adopted this format. Hmmm.... > Your format has no checksum, which for deployment and long-term storage > can be important. Actually the pylib.py "dir()" method reads all *.pyc with marshal, and I am depending on marshal to object to bad data and also out-of-date magic numbers. But this is a good point. > If you want a marshalled TOC, then why not add a manifest entry for it, > sort of like what ranlib does with ar? Sorry, I don't understand. Please explain. > You designed the format so archives can be concatenated without any tool > (other than "cat"), but this works just as well with zip files, as the > Tcl Wrap approach demonstrates. Are you saying that cat zip1.zip zip2.zip > myzip.zip works? An important feature is the ability to concatenate to a binary: cat python.exe zip1.zip > myapp.exe Searching for this isn't fast unless magic numbers are at the end. Are zip files recognizable from the end (I don't know)? > Allow me to very, very loosely paraphrase Guido here: sure, everyone can > design an archive format, but they are likely to make the same mistakes > all over again - so why not adopt a format which is tried and tested? > > With all due respect - I sincerely hope you will reconsider and alter > your code to work with zip files. It's probably a small adjustment? 
> > Unless your *intent* is to create a diverging standard, of course... The intent is to create a standard but not a diverging standard. Are there any zip experts out there? Can zip files satisfy all the design requirements I listed in pylib.html? Is there zip code available? All my code is in Python. JimA From jcw at equi4.com Thu Dec 9 18:57:33 1999 From: jcw at equi4.com (Jean-Claude Wippler) Date: Thu, 09 Dec 1999 18:57:33 +0100 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us> Message-ID: <384FED8D.3C535D38@equi4.com> Guido van Rossum wrote: > > [... my not-really-meant-as-rant about adopting zip as format ...] > [zip concatenation feature] > How does this work? Does that work for all zip tools, or just for the > ZIP reader in Wrap? (I looked up how Jim A does it -- his central > directory at the end of the file contains the total size of the data > covered by that directory, so he seeks back to the beginning of it and > sees if another magic number precedes it; and so on. Very simple.) Same for Wrap. Standard tools would not see the preceding ZIP groups. In terms of maintenance, I'd avoid this trick. I merely wanted to point out that zip archives can be stacked, if the reader is set up to it. > Question: does the wrap::open code go out to the regular filesystem > if it finds there's no wrap archive? That would be handy so you can > test the code in its unwrapped form without change. IIRC, Wrap overrides "open" for embedded entries as "file.zip/abc.py". There's more being developed in this area: a "virtual file system" which lets you mount archives and such (VFS by Matt Newman, mentioned with his permission), so that the file-system model can be extended to navigate into a lot more things than real file systems. 
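A rough Python sketch of the "open" override described above, with the fallback to the regular filesystem that Guido asks about. It assumes the standard zipfile module of a present-day Python; `vopen` is a hypothetical name, and this is not how Wrap itself is implemented.

```python
import io
import os
import zipfile


def vopen(path):
    """Open `path` for binary reading, looking inside a zip if needed.

    Hypothetical helper: a path such as "file.zip/abc.py" is read from
    the archive "file.zip"; any other path goes to the real filesystem,
    so the same code runs in wrapped and unwrapped form.
    """
    head, sep, member = path.partition(".zip/")
    if sep and member:
        archive = head + ".zip"
        # Only treat it as an archive if a file of that name exists;
        # a real directory called "something.zip" falls through.
        if os.path.isfile(archive):
            with zipfile.ZipFile(archive) as zf:
                return io.BytesIO(zf.read(member))
    return open(path, "rb")
```

A call like vopen("file.zip/abc.py") returns a file-like object either way, so calling code does not need to know whether it is running from an archive.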
Andrew Kuchling's post hints at another tangent: opendir/readdir is of course simply an enumeration. There's a lot of "genericity" lurking in scanning across file systems, trees, networks, and resources in general. The filesystem <-> OO dichotomy needs a review. > Python needs this too. Concepts like these have a lot to offer - and would make even more sense if they were done in a way which benefits multiple scripting languages. Feel free to reply by email if you ever want to further discuss this. -- Jean-Claude From fdrake at acm.org Thu Dec 9 19:10:44 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu, 9 Dec 1999 13:10:44 -0500 (EST) Subject: [Python-Dev] Missing POSIX functions: the list In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us> References: <199912091632.LAA09236@amarok.cnri.reston.va.us> Message-ID: <14415.61604.415084.520092@weyr.cnri.reston.va.us> Andrew M. Kuchling writes: > After poking around in the O'Reilly POSIX book, here's a list of POSIX > functions that don't seem to be available in Python. Not all of them > seem worth supporting. Ironically, Greg Ward's daemonize() Perl I think your assessment is reasonable. I looked at posixmodule.c and note also that the functions use PyArg_Parse() and PyArg_NoArgs() instead of using PyArg_ParseTuple(). The advantage of PyArg_ParseTuple() is that the name of the function can be specified for inclusion in TypeError messages when the arguments are not of the right type. I'm doing some work to correct this now. I've also added ctermid(), and will try to add at least a few more before I check in the changes. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From bwarsaw at cnri.reston.va.us Thu Dec 9 19:17:35 1999 From: bwarsaw at cnri.reston.va.us (Barry A. 
Warsaw) Date: Thu, 9 Dec 1999 13:17:35 -0500 (EST) Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us> <384FED8D.3C535D38@equi4.com> Message-ID: <14415.62015.856931.750279@anthem.cnri.reston.va.us> >>>>> "JW" == Jean-Claude Wippler writes: JW> Same for Wrap. Standard tools would not see the preceding ZIP JW> groups. JW> In terms of maintenance, I'd avoid this trick. I merely JW> wanted to point out that zip archives can be stacked, if the JW> reader is set up to it. I agree. I can't recall the details now, but I had a lot of problems with zip concatenation in JPython. I think at least some of the older Java tools for grokking zips don't work with concatenation. -Barry From guido at CNRI.Reston.VA.US Thu Dec 9 19:21:42 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 13:21:42 -0500 Subject: [Python-Dev] Virtual filesystem APIs In-Reply-To: Your message of "Thu, 09 Dec 1999 18:57:33 +0100." <384FED8D.3C535D38@equi4.com> References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us> <384FED8D.3C535D38@equi4.com> Message-ID: <199912091821.NAA06209@eric.cnri.reston.va.us> Jean-Claude Wippler: > There's more being developed in this area: a "virtual file system" which > lets you mount archives and such (VFS by Matt Newman, mentioned with his > permission), so that the file-system model can be extended to navigate > into a lot more things than real file systems. I agree. We have experimented with this a bunch in the Knowbot software, where we have some code that wants to look at a "filesystem" but could be talking to some kind of filesystem emulation across an RPC connection or alternatively could be accessing a zip file.
Our conclusion is that a convenient interface is modeled after (a subset of) the os and os.path functionality. In fact, the only thing you would need to add to the os module would be a function to open a file object; I've proposed to add os.fopen() as an alias for the built-in open(). The idea that you could mount one VFS inside another is nice, although I'm not sure how practical it is. For one thing, in our fs code, os.path.sep and friends (e.g. os.path.normcase behavior) were set per filesystem; what would happen if you mounted a Unix filesystem in an NT tree? Doing the translations is hard too; e.g. on a Mac fs, the separator is ':' and a '/' can be part of a filename -- do you simply swap them? What if a Mac file has both '/' and '\' and you mount it on a Windows FS? I'd rather stay away from this. On the other hand the VFS concept could be used as a totally different solution to the sys.importers vs. sys.path > Andrew Kuchling's post hints at another tangent: opendir/readdir is of > course simply an enumeration. There's a lot of "genericity" lurking in > scanning across file systems, trees, networks, and resources in general. I'd still rather see listdir() (which our sample virtual FS API supported). I don't think it necessarily makes sense to do this on a more generic basis -- other trees and graphs have sufficiently different semantics that using a FS like API doesn't necessarily cut it. Take for example the Windows registry -- looks a lot like a filesystem, doesn't it? Yet it has one fundamental property that a typical FS doesn't: directory nodes can have data *and* children... I've written a tree widget and found that it's remarkably hard to come up with a workable API to talk to trees *in general*. Trees are a universal concept, but code sharing is still elusive... Perhaps because the concept is so simple? > The filesystem <-> OO dichotomy needs a review. I think that my proposal above should cover this. 
(We looked briefly at doing a similar thing for Java, and found that it's actually harder there -- they have all these nice objects representing paths, but it's not easily subclassable to represent paths in some virtual filesystem.) > Concepts like these have a lot to offer - and would make even more sense > if they were done in a way which benefits multiple scripting languages. > Feel free to reply by email if you ever want to further discuss this. I see only very little hope for this point of view, but I will refrain from commenting more. --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm at hypernet.com Thu Dec 9 19:23:14 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Thu, 9 Dec 1999 13:23:14 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <384FEA5D.A07F23EC@interet.com> Message-ID: <1267359311-37934097@hypernet.com> James C. Ahlstrom wrote: > Jean-Claude Wippler wrote: > > > Ouch - what's wrong with zip archives? > In general Zip archives store whole branches of a file > system. > The archive format stores modules as dotted names, just as they > appear in the import statement. The search path is "." in every > archive file by definition. The import statement "import foo" > just results in a dictionary lookup for key "foo", not a search > through a zip directory along a local search path for > "foo.something" where "something" can be pyc, pyo, py, etc. > > The intent was to link the archives to the import statement, not > re-create a directory tree. It borrowed this feature from the > archive formats of Greg and Gordon. As I've stated before, I have 2 archive formats. This may seem a needless complication, but my suspicion is that sooner or later, people will want 2 different kinds.
One is a .pyz format, which corresponds closely to Jim's .pyl format (with a number of minor differences: it's compressed, the archive as a whole has the Python magic number, instead of each entry, and it's not designed for concatenation). The other is like a zip, and probably should be zip format. It's designed to hold _anything_, and can be manipulated from C and from Python. It can be concatenated and / or embedded (and the inner one opened without extraction). Its table of contents is more file-system like. Importing from one is slower, but that's not really what it's for. It's for packaging up arbitrary resources. Like .pyz's, or Tcl/Tk for Tkinter apps, or configuration files. Jim is correct that a good importer (which can say "No, it's not mine" as quickly as possible) is better satisfied by a simple dictionary lookup than fooling with file extensions and directories (virtual or real). > > If you want a marshalled TOC, then why not add a manifest entry > > for it, sort of like what ranlib does with ar? > > Sorry, I don't understand. Please explain. The table of contents is just another entry. > An important feature is the ability to concatenate to a binary: > cat python.exe zip1.zip > myapp.exe > Searching for this isn't fast unless magic numbers are at the > end. Are zip files recognizable from the end (I don't know)? Where do you think we got this idea? > Are there any zip experts out there? Can zip files satisfy all > the design requirements I listed in pylib.html? Is there zip > code available? All my code is in Python. Hmm. My bookmark appears to be dead (I was there not long ago): http://www.cubic.org/source/archive/fileform/packers/appnote.txt There have been several references on this list to Guido et al having some Python / zip code.
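To make the "recognizable from the end" point concrete, here is a minimal sketch, assuming the standard zipfile module of a present-day Python (not something available to the posters here). The stub bytes stand in for python.exe, and `build_archive` and `append_to_binary` are hypothetical names.

```python
import io
import zipfile


def build_archive():
    """Create a small in-memory zip holding one stand-in module."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("foo.pyc", b"not real bytecode")
    return buf.getvalue()


def append_to_binary(binary, archive):
    """In-memory equivalent of: cat python.exe zip1.zip > myapp.exe"""
    return binary + archive


# Stand-in for a real executable; any leading bytes will do.
stub = b"\x7fELF pretend executable bytes" * 10
combined = append_to_binary(stub, build_archive())

# The end-of-central-directory record begins with this signature and
# sits at the tail of the file, so a reader can search backwards for it.
eocd_offset = combined.rfind(b"PK\x05\x06")

# zipfile locates that record itself and tolerates the leading stub.
with zipfile.ZipFile(io.BytesIO(combined)) as zf:
    names = zf.namelist()
    payload = zf.read("foo.pyc")
```

With no archive comment the end record is 22 bytes long, so in this sketch eocd_offset lands exactly 22 bytes before the end of the combined data.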
- Gordon From guido at CNRI.Reston.VA.US Thu Dec 9 19:23:27 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 13:23:27 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Thu, 09 Dec 1999 13:17:35 EST." <14415.62015.856931.750279@anthem.cnri.reston.va.us> References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us> <384FED8D.3C535D38@equi4.com> <14415.62015.856931.750279@anthem.cnri.reston.va.us> Message-ID: <199912091823.NAA06243@eric.cnri.reston.va.us> > I agree. I can't recall the details now, but I had a lot of problems > with zip concatenation in JPython. I think at least some of the older > Java tools for grokking zips don't work with concatenation. The Java "jar" tool mostly ignores the central directory -- it seems to read the archive from the front, using the local header records, and ignoring the central directory (of course it writes one when it creates an archive). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at CNRI.Reston.VA.US Thu Dec 9 19:32:15 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 13:32:15 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Thu, 09 Dec 1999 12:43:57 EST." <384FEA5D.A07F23EC@interet.com> References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <384FEA5D.A07F23EC@interet.com> Message-ID: <199912091832.NAA06287@eric.cnri.reston.va.us> > In general Zip archives store whole branches of a file > system. A Python ./Lib zip archive would contain: > > N:/python/Python-1.5.2/Lib/string.pyc > N:/python/Python-1.5.2/Lib/os.pyc > N:/python/Python-1.5.2/Lib/copy.pyc > N:/python/Python-1.5.2/Lib/test/testall.pyc > > Zip archives are isomorphic to branches of a file system.
> That means there must be a sys.path for each zip archive file. > How would this be specified? Not true. It's easy (using the proper Zip tools) to create an archive containing this instead: string.pyc os.pyc copy.pyc testall.pyc Thus the entire archive is considered the directory. The Java "jar" tool uses this approach. It's also easy to have packages in there (again this is what Java does): test/ test/__init__.pyc test/pystone.pyc test_support.pyc (etc.) > The archive format stores modules as dotted names, just as they > appear in the import statement. The search path is "." in every > archive file by definition. The import statement "import foo" > just results in a dictionary lookup for key "foo", not a search > through a zip directory along a local search path for "foo.something" > where "something" can be pyc, pyo, py, etc. > > The intent was to link the archives to the import statement, not > re-create a directory tree. It borrowed this feature from > the archive formats of Greg and Gordon. Maybe you've gone overboard. The time it takes to translate the dots into slashes really isn't the big deal. > Are there any zip experts out there? Can zip files satisfy all the > design requirements I listed in pylib.html? Is there zip code > available? All my code is in Python. Yes (all of us here at CNRI), yes, yes (we have the spaghetti code). While zip files support compression, they support uncompressed files as well and we could go either way. Their most popular compression format is gzip compatible and can be read and written with the zlib module, which is in the standard Python distribution (even on Windows) -- though to build it you need the zlib C library which is of course external (but solid open source). --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake at acm.org Thu Dec 9 19:41:22 1999 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 9 Dec 1999 13:41:22 -0500 (EST) Subject: [Python-Dev] Virtual filesystem APIs In-Reply-To: <199912091821.NAA06209@eric.cnri.reston.va.us> References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us> <384FED8D.3C535D38@equi4.com> <199912091821.NAA06209@eric.cnri.reston.va.us> Message-ID: <14415.63442.92911.748132@weyr.cnri.reston.va.us> Guido van Rossum writes: > os.path.sep and friends (e.g. os.path.normcase behavior) were set per Hah! Caught you in public! "sep" & friends are defined in the os module; this is where the separation breaks down. I think these should be located in os.path, and os can just pick them up from there to be backward compatible. os.pathsep is a problem, somewhat; it is related to os.sep, but is very different in many ways. I don't think there's a good way to deal with it. > filesystem; what would happen if you mounted a Unix filesystem in an > NT tree? Doing the translations is hard too; e.g. on a Mac fs, the > separator is ':' and a '/' can be part of a filename -- do you simply > swap them? What if a Mac file has both '/' and '\' and you mount it > on a Windows FS? I'd rather stay away from this. And this is tightly related to the sep/pathsep problem as well. I agree, we should stay away from it. > I think that my proposal above should cover this. (We looked briefly > at doing a similar thing for Java, and found that it's actually harder > there -- they have all these nice objects representing paths, but it's > not easily subclassable to represent paths in some virtual But it was easy to create a set of interfaces with a reasonable API; getting back to the "typical" Java classes was what really changed the most. 
For those of us not working on the KOE: I set up Filesystem and FSFile interfaces; the Filesystem represented the entire filesystem and the FSFile was very similar to the java.io.File class, but had additional methods to get input and output stream objects (of the standard Java flavor); all the buffering and such could be wrapped on top of that just like any other Java I/O. The specific application was to provide access to an isolated directory structure which untrusted code "owned", but ensured that parent directories were unreachable. Additional security checks can be worked into such a structure as applicable. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From fdrake at acm.org Thu Dec 9 20:06:32 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu, 9 Dec 1999 14:06:32 -0500 (EST) Subject: [Python-Dev] posix module test suite Message-ID: <14415.64952.780974.8124@weyr.cnri.reston.va.us> There's not a test for the posix or os modules; if anyone would like to contribute one, this would be a good time! ;-) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From jcw at equi4.com Thu Dec 9 21:51:11 1999 From: jcw at equi4.com (Jean-Claude Wippler) Date: Thu, 09 Dec 1999 21:51:11 +0100 Subject: [Python-Dev] Virtual filesystem APIs References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us> <384FED8D.3C535D38@equi4.com> <199912091821.NAA06209@eric.cnri.reston.va.us> Message-ID: <3850163F.80BDCB75@equi4.com> Guido van Rossum wrote: > [... horrors of cross-OS mounts and ":\/" separators ...] I agree, this has some very hairy sides to it. But VFS is really more about mounting non-FS things in a "root" FS (presumably the real one). > On the other hand the VFS concept could be used as a totally different > solution to the sys.importers vs. 
sys.path Heck, I'll be the "enfant terrible" once more: yes, and this stuff could well be implemented generically across scripting languages. Of course the act of "importing" is a very Pythonic issue - but FS/VFS traversal and the actual shared library load need not be. Anyway, enough of that. > Take for example the Windows registry -- looks a lot like a > filesystem, doesn't it? Yet it has one fundamental property that a > typical FS doesn't: directory nodes can have data *and* children... What you're saying is that dir = set-of-subdirs + set-of-files, and that this is a more general requirement than plain FS's. Doesn't that simply mean that the more general model is needed as basis to handle both? > Trees are a universal concept, but code sharing is still elusive... Ah, but think of the implications: archives, networks, XML, the world! -- Jean-Claude From fdrake at acm.org Thu Dec 9 22:16:00 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu, 9 Dec 1999 16:16:00 -0500 (EST) Subject: [Python-Dev] forwarded message from Fred L. Drake Message-ID: <14416.7184.255000.342231@weyr.cnri.reston.va.us> OK, I've checked in some changes to the posix module to add support for a few of the POSIX interfaces Andrew expressed interest in seeing (and some he said weren't such a good idea, or at least not necessary, but about which I decided I disagreed after all). For those of you who aren't on the checkins list (??), I've attached the message so you'll know what functions were added. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives -------------- next part -------------- An embedded message was scrubbed... From: "Fred L. 
Drake" Subject: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.115,2.116 Date: Thu, 9 Dec 1999 16:13:10 -0500 (EST) Size: 3800 URL: From guido at CNRI.Reston.VA.US Thu Dec 9 22:19:57 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 16:19:57 -0500 Subject: [Python-Dev] forwarded message from Fred L. Drake In-Reply-To: Your message of "Thu, 09 Dec 1999 16:16:00 EST." <14416.7184.255000.342231@weyr.cnri.reston.va.us> References: <14416.7184.255000.342231@weyr.cnri.reston.va.us> Message-ID: <199912092119.QAA06731@eric.cnri.reston.va.us> > OK, I've checked in some changes to the posix module to add support > for a few of the POSIX interfaces Andrew expressed interest in seeing > (and some he said weren't such a good idea, or at least not necessary, > but about which I decided I disagreed after all). I wish you'd made your disagreement public before checking it in... But it's not too late... --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin at mems-exchange.org Thu Dec 9 22:32:26 1999 From: akuchlin at mems-exchange.org (Andrew M. Kuchling) Date: Thu, 9 Dec 1999 16:32:26 -0500 (EST) Subject: [Python-Dev] forwarded message from Fred L. Drake In-Reply-To: <14416.7184.255000.342231@weyr.cnri.reston.va.us> References: <14416.7184.255000.342231@weyr.cnri.reston.va.us> Message-ID: <14416.8170.18298.33796@amarok.cnri.reston.va.us> Fred L. Drake, Jr. writes (in a CVS checkin): >Added support for abort(), ctermid(), tmpfile(), tempnam(), tmpnam(), >and TMP_MAX. For those of you following along, the tmpfile(), tempnam(), tmpnam() functions were ones I listed as probably not worth adding. On the other hand, David Beazley wrote: > I think that the POSIX module should strive to be as >complete as possible--even if certain functions are closely related >other functionality in the library (tmpfile for instance). I suspect ... and that's a good point, too. 
The POSIX functions may provide adaptability that a Python analog doesn't; for example, you could read /etc/passwd in pure Python, but that wouldn't handle NIS or shadow passwords. So I guess I'll vote for completeness over lack of overlap; leave tmpfile() & friends in. -- A.M. Kuchling http://starship.python.net/crew/amk/ This supports reflection, which is the 90s way of writing self-modifying code. -- John Aycock at IPC7, during his parsing talk From guido at CNRI.Reston.VA.US Thu Dec 9 22:38:42 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 16:38:42 -0500 Subject: [Python-Dev] forwarded message from Fred L. Drake In-Reply-To: Your message of "Thu, 09 Dec 1999 16:32:26 EST." <14416.8170.18298.33796@amarok.cnri.reston.va.us> References: <14416.7184.255000.342231@weyr.cnri.reston.va.us> <14416.8170.18298.33796@amarok.cnri.reston.va.us> Message-ID: <199912092138.QAA06790@eric.cnri.reston.va.us> > ... and that's a good point, too. The POSIX functions may provide > adaptability that a Python analog doesn't; for example, you could read > /etc/passwd in pure Python, but that wouldn't handle NIS or shadow > passwords. So I guess I'll vote for completeness over lack of > overlap; leave tmpfile() & friends in. OK, I agree now. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake at acm.org Thu Dec 9 23:30:52 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu, 9 Dec 1999 17:30:52 -0500 (EST) Subject: [Python-Dev] Missing POSIX functions: the list In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us> References: <199912091632.LAA09236@amarok.cnri.reston.va.us> Message-ID: <14416.11676.888918.511932@weyr.cnri.reston.va.us> Andrew M. Kuchling writes: > After poking around in the O'Reilly POSIX book, here's a list of POSIX Ok, here's my comments on the remainder of these. > Worth adding? 
> ============= > opendir(), readdir(), closedir() -- > most of their functionality is available through > os.listdir(), but it might be useful to have a direct > interface. Downside is that this would require a new > extension type for the C DIR struct. My (lazy) inclination > is to not bother. [rewinddir() and seekdir() should be considered as well, where supported.] There's more tedium than anything in implementing a new C type. I'm a little concerned that there might not be any real value here, but it's hard to be sure about that. Is there any real reason not to use os.listdir(). > Worth adding: > ============= ... > fpathconf(fd, name) -- Get configuration limit for a file > -- would need constants from unistd.h This is mostly a matter of setting up the constants; not hard, just more distracting than I want to deal with right now. > getlogin() -- returns user's login name > -- could do something similar with pwd.getpwuid( os.getuid() )[0], but > getlogin() apparently looks in utmp Per Guido's comments, I'm not sure how valuable it is. It may make sense strictly for completeness, but I've never heard of utmp being considered reliable in any way. Maybe I'm too new at all this. > getgroups(gidsetsize, grouplist) -- Gets supplementary group IDs This should be easy enough. > pathconf(path, name) -- Gets config variables for a path > -- would need constants from unistd.h (Same as for fpathconf().) > sysconf(int name) -- Gets system configuration information > -- would need constants from unistd.h > > Not worth adding: > ================= Aside from the ones I've already added, I agree. ;) -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives From jim at digicool.com Fri Dec 10 00:31:40 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 09 Dec 1999 18:31:40 -0500 Subject: [Python-Dev] Thankyou for fsync :) Message-ID: <38503BDC.CB91FB29@digicool.com> I found recently that I needed fsync and was pleasantly surprized to find that it is provided in the posix module, where available. Can I count on it staying in the posix module, when available, for the forseeable future? Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From gstein at lyra.org Fri Dec 10 01:32:33 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 9 Dec 1999 16:32:33 -0800 (PST) Subject: [Python-Dev] Missing POSIX functions: the list In-Reply-To: <14416.11676.888918.511932@weyr.cnri.reston.va.us> Message-ID: On Thu, 9 Dec 1999, Fred L. Drake, Jr. wrote: > Andrew M. Kuchling writes: >... > > opendir(), readdir(), closedir() -- > > most of their functionality is available through > > os.listdir(), but it might be useful to have a direct > > interface. Downside is that this would require a new > > extension type for the C DIR struct. My (lazy) inclination > > is to not bother. > > [rewinddir() and seekdir() should be considered as well, where > supported.] > > There's more tedium than anything in implementing a new C type. I'm > a little concerned that there might not be any real value here, but > it's hard to be sure about that. Is there any real reason not to use > os.listdir(). No need to do a new type. Just wrap the DIR* into a PyCObject. 
Add a magic number if you're worried about mixing CObjects. Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido at CNRI.Reston.VA.US Fri Dec 10 03:03:04 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 21:03:04 -0500 Subject: [Python-Dev] Thankyou for fsync :) In-Reply-To: Your message of "Thu, 09 Dec 1999 18:31:40 EST." <38503BDC.CB91FB29@digicool.com> References: <38503BDC.CB91FB29@digicool.com> Message-ID: <199912100203.VAA07410@eric.cnri.reston.va.us> > I found recently that I needed fsync and was pleasantly surprized > to find that it is provided in the posix module, where available. > > Can I count on it staying in the posix module, when available, > for the forseeable future? Since we seem to be on an adding spree, I don't see why not -- as long as POSIX keeps it available :) --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at mojam.com Fri Dec 10 07:28:56 1999 From: skip at mojam.com (Skip Montanaro) Date: Fri, 10 Dec 1999 00:28:56 -0600 (CST) Subject: [Python-Dev] posix module test suite In-Reply-To: <14415.64952.780974.8124@weyr.cnri.reston.va.us> References: <14415.64952.780974.8124@weyr.cnri.reston.va.us> Message-ID: <14416.40360.611743.143624@dolphin.mojam.com> Fred> There's not a test for the posix or os modules; if anyone would Fred> like to contribute one, this would be a good time! ;-) Not having ever written any tests for the core Python modules, it seems natural to ask if there are any guidelines for the construction of such tests or the test equivalent of the Modules/xxmodule.c file. Are there standard behaviors expected for passing and failing a test? Skip Montanaro | http://www.mojam.com/ skip at mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... 
From tim_one at email.msn.com Fri Dec 10 09:48:59 1999 From: tim_one at email.msn.com (Tim Peters) Date: Fri, 10 Dec 1999 03:48:59 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <14415.23676.775163.786028@dolphin.mojam.com> Message-ID: <000501bf42eb$66529860$412d153f@tim> [Skip Montanaro] > Alright! Now I understand what all the hubbub is about! My eyes have > mostly been glazing over trying to follow all this Windows > registry/path/ini stuff. MS believes that Python is the application. > Those of us writing Python programs view those programs as the > applications, not the Python interpreter per se. Eww -- that's a helpful and insightful way to put it, Skip! Now maybe *I* can understand what the hubbub is about . > Is there some way that people writing applications in Python can set > up registry entries that are specific to their application (e.g. > tabnanny.py) instead of only specific to the Python interpreter? Yes, but they can't get Python to look at those before it's too late. I spent a whole evening a month or two ago just trying to figure out where all the cruft in my Windows sys.path *came* from. This is out-of-the-box; I haven't added anything myself: ['', 'D:\\Python\\win32', 'D:\\Python\\win32\\lib', 'D:\\Python', 'D:\\Python\\Pythonwin', 'D:\\Python\\Lib\\plat-win', 'D:\\Python\\Lib', 'D:\\Python\\DLLs', 'D:\\Python\\Lib\\lib-tk', 'D:\\PYTHON\\DLLs', 'D:\\PYTHON\\lib', 'D:\\PYTHON\\lib\\plat-win', 'D:\\PYTHON\\lib\\lib-tk', 'D:\\PYTHON'] That's bizarre on the face of it, and tracking it all down was draining. I've forgotten the details. I do remember concluding that it was impossible to do what I wanted to do without changing the implementation, though, and nobody on Python-Dev disputed that at the time. In a pragmatic crunch, I wrote the little app I needed to distribute at the time in Perl instead, meaning to come back to this. I haven't had time. 
IIRC, the ultimate problem wasn't really that Python looked at the registry to get *some* path info, it was a combination of A) It looked at the registry so early that it was impossible to stop it from executing whatever site.py the registry pointed at (well, I could with the -S option -- but then there was no way to get it to do the site.py that was *wanted* instead). B) No way to override what was in the registry; e.g., I was greatly surprised to discover that setting a PYTHONPATH envar didn't override anything, it simply plunked the PYTHONPATH entries into sys.path along with everything else -- and too late to stop anything anyway. In a long msg I haven't yet read all the way thru, Guido at least suggested associating different registry path info with different Python versions. That would address a number of otherwise currently intractable problems. I suspect it still wouldn't help with the problem I was facing, though. That is, I wanted to be able to tell people to run \\dragres01\mrec\reduce\python \\dragres01\mrec\reduce\reduce.py which is just a Windows way of saying "run a Python executable from a shared network location". When they tried that, though, the network Python looked in *their* individual registries for its Python path info, and some of the hackers with mondo customized Python setups on their own machines watched things go down in flames. This certainly can't be a common problem, but it speaks to an unforgiving rigidity in the current approach. There seemed to be nothing I could do to guarantee this would work, short of telling users to edit their registries before running this tool (that's a non-starter on Windows -- editing the registry is dangerous) or putting a customized Python on the network pointing to a bogus registry key (it was faster to write the app in Perl! Perl doesn't *try* to be so infernally helpful , so doesn't get in the way either). 
I'm left wondering what purpose putting Python library path info into the Windows registry serves. Is there anyone on Windows who *doesn't* have their Python Lib/ etc as direct subdirectories of the directory containing python.exe? Not that I've seen. Python puts *those* in sys.path too -- but only after it (in the normal case; see my sys.path above) pulls identically redundant paths out of the registry first, or (in the cases we're griping about) pulls irrelevant or downright harmful paths out of the registry first (paths appropriate to the last Python you *installed*, not to the Python that's *running*!). Perhaps all this cruft is needed to support embedded Python, though (something I've never done). Regardless, I expect it would have been enough for me if PYTHONPATH simply worked the way I mistakenly assumed it would (that is, this is sys.path, and that's *it*; feel free to prepend the current directory when initialization is complete, but before then looking at any file not reached from PYTHONPATH is verboten). the-cleverer-the-code-the-more-vital-that-there-be-a-way-to- short-circuit-it-ly y'rs - tim From jim at interet.com Fri Dec 10 13:16:31 1999 From: jim at interet.com (James C. Ahlstrom) Date: Fri, 10 Dec 1999 07:16:31 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000501bf42eb$66529860$412d153f@tim> Message-ID: <3850EF1F.158445B6@interet.com> Tim Peters wrote: > > [Skip Montanaro] > > Is there some way that people writing applications in Python can set > > Yes, but they can't get Python to look at those before it's too late. I > spent a whole evening a month or two ago just trying to figure out where all > the cruft in my Windows sys.path *came* from. This is out-of-the-box; I > ..... Excellent discussion Tim! > I suspect it still wouldn't help with the problem I was facing, though. 
> That is, I wanted to be able to tell people to run > > \\dragres01\mrec\reduce\python \\dragres01\mrec\reduce\reduce.py > > which is just a Windows way of saying "run a Python executable from a shared > network location". When they tried that, though, the network Python looked > in *their* individual registries for its Python path info, and some of the > hackers with mondo customized Python setups on their own machines watched > things go down in flames. I think a sensible way to run little apps is to put everything in an archive file including the main.py. On Windows you concatenate that to python.exe, and it Just Works. > Windows registry serves. Is there anyone on Windows who *doesn't* have > their Python Lib/ etc as direct subdirectories of the directory containing > python.exe? Not that I've seen. Point on the curve. We don't. We freeze everything except the main.py. JimA From jim at interet.com Fri Dec 10 14:38:28 1999 From: jim at interet.com (James C. Ahlstrom) Date: Fri, 10 Dec 1999 08:38:28 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> Message-ID: <38510254.ED15D32B@interet.com> Jean-Claude Wippler wrote: > Ouch - what's wrong with zip archives? > With all due respect - I sincerely hope you will reconsider and alter > your code to work with zip files. It's probably a small adjustment? OK, you talked me into it. Ya, small adjustment, no problem ;-) JimA From jack at oratrix.nl Fri Dec 10 14:51:10 1999 From: jack at oratrix.nl (Jack Jansen) Date: Fri, 10 Dec 1999 14:51:10 +0100 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: Message by "James C. Ahlstrom" , Thu, 09 Dec 1999 12:43:57 -0500 , <384FEA5D.A07F23EC@interet.com> Message-ID: <19991210135111.2F83C370CF2@snelboot.oratrix.nl> Is it possible nowadays to have two files with the same name but different paths (i.e.
foo/bar.py and foo/spam/bar.py) in the same archive? That's the one thing that always struck me as very very silly about zipfiles. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From gmcm at hypernet.com Fri Dec 10 15:28:51 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Fri, 10 Dec 1999 09:28:51 -0500 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: <19991210135111.2F83C370CF2@snelboot.oratrix.nl> References: Message by "James C. Ahlstrom" , Thu, 09 Dec 1999 12:43:57 -0500 , <384FEA5D.A07F23EC@interet.com> Message-ID: <1267287023-386248@hypernet.com> Jack Jansen asks: > Is it possible nowadays to have two files with the same name but > different paths (i.e. foo/bar.py and foo/spam/bar.py) in the same > archive? Depends on how you do it. If the user imports foo.spam.bar, an importer will be asked for: foo (return foo.__init__); foo.spam (return foo.spam.__init__); foo.spam.bar (return foo.spam.bar). But the API allows lots of variations. This is another possible interaction: foo (return None); foo.__init__ (return foo.__init__); foo.spam (return None); foo.spam.__init__ (return foo.spam.__init__); foo.spam.bar (return foo.spam.bar). Or, by looking at different args to get_code, you could look at the requests as: foo in context of None; spam in context of foo; bar in context of foo.spam. With another variation where the request for __init__ becomes explicit. The first way seems the natural way for archives, and makes it easy to keep foo.bar distinct from foo.spam.bar. > That's the one thing that always struck me as very very silly > about zipfiles. Huh?
- Gordon From guido at CNRI.Reston.VA.US Fri Dec 10 15:51:39 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 10 Dec 1999 09:51:39 -0500 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: Your message of "Fri, 10 Dec 1999 14:51:10 +0100." <19991210135111.2F83C370CF2@snelboot.oratrix.nl> References: <19991210135111.2F83C370CF2@snelboot.oratrix.nl> Message-ID: <199912101451.JAA07786@eric.cnri.reston.va.us> > Is it possible nowadays to have two files with the same name but different > paths (i.e. foo/bar.py and foo/spam/bar.py) in the same archive? > > That's the one thing that always struck me as very very silly about zipfiles. Zip files contain the full path, there's no problem with that. Was there ever? --Guido van Rossum (home page: http://www.python.org/~guido/) From jack at oratrix.nl Fri Dec 10 15:52:26 1999 From: jack at oratrix.nl (Jack Jansen) Date: Fri, 10 Dec 1999 15:52:26 +0100 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: Message by "Gordon McMillan" , Fri, 10 Dec 1999 09:28:51 -0500 , <1267287023-386248@hypernet.com> Message-ID: <19991210145227.01F99370CF2@snelboot.oratrix.nl> > Jack Jansen asks: > > > Is it possible nowadays to have two files with the same name but > > different paths (i.e. foo/bar.py and foo/spam/bar.py) in the same > > archive? > > Depends on how you do it. Apparently I mis-phrased my question, I'll try again. When people suggested to use zip format as the standard Python archive format I was a bit worried, becuase I've had it happen to me various times that I was unable to create a ZIP archive with two files with the same name but different paths (i.e. create an archive of a directory that contains both a foo/bar.py and a foo/spam/bar.py). 
So, my question was: has this happened to me because the winzip I used was braindead, or is there possibly a problem with the ZIP file format that disallows two files with the same name in one archive? Most zip programs I've seen also seem to present filenames as the primary metaphore, with full pathnames somewhat "tacked on". If the latter is the case I wonder whether zip is the right format to use... -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From guido at CNRI.Reston.VA.US Fri Dec 10 16:00:51 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 10 Dec 1999 10:00:51 -0500 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: Your message of "Fri, 10 Dec 1999 15:52:26 +0100." <19991210145227.01F99370CF2@snelboot.oratrix.nl> References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> Message-ID: <199912101500.KAA07863@eric.cnri.reston.va.us> Again, the zip format does not have this problem. Some zip tools may -- then we don't use those. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake at acm.org Fri Dec 10 16:40:21 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri, 10 Dec 1999 10:40:21 -0500 (EST) Subject: [Python-Dev] Missing POSIX functions: the list In-Reply-To: References: <14416.11676.888918.511932@weyr.cnri.reston.va.us> Message-ID: <14417.7909.511437.230915@weyr.cnri.reston.va.us> Greg Stein writes: > No need to do a new type. Just wrap the DIR* into a PyCObject. Add a magic > number if you're worried about mixing CObjects. That's certainly one option, but I would have made readdir(), seekdir(), rewinddir() and closedir() into the methods read(), seek(), rewind() and close(). 
So it's a question of what interface you prefer; functions with magically interpreted token parameters (kind of like file descriptors, hey!), or something that is more recognizably object-oriented. I know my preference. ;-) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From mal at lemburg.com Fri Dec 10 16:55:02 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 10 Dec 1999 16:55:02 +0100 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> Message-ID: <38512256.F9287E24@lemburg.com> Jack Jansen wrote: > > > Jack Jansen asks: > > > > > Is it possible nowadays to have two files with the same name but > > > different paths (i.e. foo/bar.py and foo/spam/bar.py) in the same > > > archive? > > > > Depends on how you do it. > > Apparently I mis-phrased my question, I'll try again. > > When people suggested to use zip format as the standard Python archive format > I was a bit worried, becuase I've had it happen to me various times that I was > unable to create a ZIP archive with two files with the same name but different > paths (i.e. create an archive of a directory that contains both a foo/bar.py > and a foo/spam/bar.py). > > So, my question was: has this happened to me because the winzip I used was > braindead, or is there possibly a problem with the ZIP file format that > disallows two files with the same name in one archive? Most zip programs I've > seen also seem to present filenames as the primary metaphore, with full > pathnames somewhat "tacked on". > > If the latter is the case I wonder whether zip is the right format to use... Hmm, I've been doing the above for years now... never had a problem with it (I use Info-ZIPs tools, BTW), e.g. 
/home/lemburg> unzip -l projects/distribution/mxODBC-1.1.1.zip Archive: projects/distribution/mxODBC-1.1.1.zip Length Date Time Name -------- ---- ---- ---- 131316 06-09-99 14:10 ODBC/EasySoft/mxODBC.c 131316 06-09-99 14:10 ODBC/Informix/mxODBC.c ... Would be cool if I could use my packages as ZIP files :-) So here's another vote for using the ZIP format. BTW, wouldn't it make sense to include the zlib code in the core distribution much like the pcre stuff is now ? AFAIK, it is public domain and including it would remedy many of the compatibility issues with the different zlib versions around. Ok, that was my wish for Xmas :-) The rest is up to Mr. Clause... -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 21 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido at CNRI.Reston.VA.US Fri Dec 10 17:04:24 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 10 Dec 1999 11:04:24 -0500 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: Your message of "Fri, 10 Dec 1999 16:55:02 +0100." <38512256.F9287E24@lemburg.com> References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> Message-ID: <199912101604.LAA14100@eric.cnri.reston.va.us> > BTW, wouldn't it make sense to include the zlib code > in the core distribution much like the pcre stuff is now ? > AFAIK, it is public domain and including it would remedy many of the > compatibility issues with the different zlib versions around. What compatibility issues? Note that the Win32 distri already comes with zlib statically linked into zlib.pyd. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at lemburg.com Fri Dec 10 17:15:48 1999 From: mal at lemburg.com (M.-A. 
Lemburg) Date: Fri, 10 Dec 1999 17:15:48 +0100 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> Message-ID: <38512734.CF6E4489@lemburg.com> Guido van Rossum wrote: > > > BTW, wouldn't it make sense to include the zlib code > > in the core distribution much like the pcre stuff is now ? > > AFAIK, it is public domain and including it would remedy many of the > > compatibility issues with the different zlib versions around. > > What compatibility issues? Note that the Win32 distri already comes > with zlib statically linked into zlib.pyd. There were issues with zlib 1.0.4 and later ones. Also, many Linux distributions don't have the zlib header files installed. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 21 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido at CNRI.Reston.VA.US Fri Dec 10 17:19:47 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 10 Dec 1999 11:19:47 -0500 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: Your message of "Fri, 10 Dec 1999 17:15:48 +0100." <38512734.CF6E4489@lemburg.com> References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> Message-ID: <199912101619.LAA14174@eric.cnri.reston.va.us> > There were issues with zlib 1.0.4 and later ones. Also, many > Linux distributions don't have the zlib header files installed. Hm. I don't recall having any problems reported to me. I'd rather not include the entire zlib distri in the Python distri -- zlib is rather big. Adding only the Unix source would be cheating. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido at CNRI.Reston.VA.US Fri Dec 10 17:25:23 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 10 Dec 1999 11:25:23 -0500 Subject: [Python-Dev] dbm clone with serious specs wanted Message-ID: <199912101625.LAA14216@eric.cnri.reston.va.us> Someone has asked me for a dbm clone that can store 16M keys of 350 bytes each, and runs on Linux, HPUX, and NT. That's 5.6 Gigabyte in keys alone! I presume most classic approaches won't cut it since total file size is typically limited by the seek system call, internal data structures and/or file index format to 2Gb (signed longs) or 4Gb (unsigned longs). Does anyone have an idea where to start looking? Would a Python extension already exist? --Guido van Rossum (home page: http://www.python.org/~guido/) From petrilli at amber.org Fri Dec 10 17:29:27 1999 From: petrilli at amber.org (Christopher Petrilli) Date: Fri, 10 Dec 1999 11:29:27 -0500 Subject: [Python-Dev] dbm clone with serious specs wanted In-Reply-To: <199912101625.LAA14216@eric.cnri.reston.va.us>; from guido@CNRI.Reston.VA.US on Fri, Dec 10, 1999 at 11:25:23AM -0500 References: <199912101625.LAA14216@eric.cnri.reston.va.us> Message-ID: <19991210112927.A14102@trump.amber.org> Guido van Rossum [guido at CNRI.Reston.VA.US] wrote: > Someone has asked me for a dbm clone that can store 16M keys of 350 > bytes each, and runs on Linux, HPUX, and NT. That's 5.6 Gigabyte in > keys alone! I presume most classic approaches won't cut it since > total file size is typically limited by the seek system call, internal > data structures and/or file index format to 2Gb (signed longs) or 4Gb > (unsigned longs). > > Does anyone have an idea where to start looking? Would a Python > extension already exist? Assuming you mean an interface to a ddbm-style situation, you could easily use Berkeley DB; I believe it is limited in the 4TB range...
Chris -- | Christopher Petrilli | petrilli at amber.org From mal at lemburg.com Fri Dec 10 17:26:10 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 10 Dec 1999 17:26:10 +0100 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> Message-ID: <385129A2.6FAF4E81@lemburg.com> Guido van Rossum wrote: > > > There were issues with zlib 1.0.4 and later ones. Also, many > > Linux distributions don't have the zlib header files installed. > > Hm. I don't recall having any problems reported to me. I'd rather > not include the entire zlib distri in the Python distri -- zlib > is rather big. Adding only the Unix source would be cheating. How about only adding those parts which would be needed to at least deflate the ZIP archive contents ? If the ZIP archive format becomes the standard for Python, we'd have to ensure that all Python users can read them. Well, at least that's what I would expect from a standard format :-) -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 21 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido at CNRI.Reston.VA.US Fri Dec 10 17:29:36 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 10 Dec 1999 11:29:36 -0500 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: Your message of "Fri, 10 Dec 1999 17:26:10 +0100." 
<385129A2.6FAF4E81@lemburg.com> References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <385129A2.6FAF4E81@lemburg.com> Message-ID: <199912101629.LAA14274@eric.cnri.reston.va.us> > How about only adding those parts which would be needed to > at least deflate the ZIP archive contents ? Ditto -- still lots of portability issues I bet. > If the ZIP archive format becomes the standard for Python, we'd > have to ensure that all Python users can read them. Well, at > least that's what I would expect from a standard format :-) There's a simple solution: don't use compression. With current disk prices it's really not worth it. Let the installer do the decompression (installers travel across networks where compression *is* worth it). --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin at mems-exchange.org Fri Dec 10 17:34:09 1999 From: akuchlin at mems-exchange.org (Andrew M. Kuchling) Date: Fri, 10 Dec 1999 11:34:09 -0500 (EST) Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: <38512734.CF6E4489@lemburg.com> References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> Message-ID: <14417.11137.562474.99270@amarok.cnri.reston.va.us> M.-A. Lemburg writes: >There were issues with zlib 1.0.4 and later ones. Also, many >Linux distributions don't have the zlib header files installed. For example, on RH6.0, zlib.h and zlib.a are in zlib-devel.XXX.rpm, and zlib.XXX.rpm only contains libz.so. On the other hand, anyone who's compiling Python should really have the various -devel RPMs installed. I'd argue against including it, because it might cause odd versioning problems. 
For example, what if I have PIL compiled against zlib1.1.2 (zlib is used for writing PNGs) and the Python binary includes zlib1.1.3? There might be hard-to-debug problems caused by calling the wrong symbol. PCRE is a special case, because we've actually hacked the code a lot; it's not the PCRE code as Philip Hazel distributes it. Just received Guido's email suggesting skipping compression in archives; not a bad idea. You'd use less CPU, but might do more I/O because you're reading more sectors off disk. There probably isn't much need for compression when the archive is on-disk; Java needed it because of applets. -- A.M. Kuchling http://starship.python.net/crew/amk/ The NSA response was, "Well, that was interesting, but there aren't any ciphers like that." -- Gus Simmons, "The History of Subliminal Channels" From petrilli at amber.org Fri Dec 10 17:39:44 1999 From: petrilli at amber.org (Christopher Petrilli) Date: Fri, 10 Dec 1999 11:39:44 -0500 Subject: [Python-Dev] dbm clone with serious specs wanted In-Reply-To: <19991210112927.A14102@trump.amber.org>; from petrilli@amber.org on Fri, Dec 10, 1999 at 11:29:27AM -0500 References: <199912101625.LAA14216@eric.cnri.reston.va.us> <19991210112927.A14102@trump.amber.org> Message-ID: <19991210113944.B14102@trump.amber.org> Christopher Petrilli [petrilli at amber.org] wrote: > Guido van Rossum [guido at CNRI.Reston.VA.US] wrote: > > Does anyone have an idea where to start looking? Would a Python > > extension already exist? > > Assuming you mean an interface to a ddbm-style situation, you could easily > use berkeley DB, I belive it is limited in the 4TB range... I just did some checking... first Robin Dunn has an interface, but it's not currently compatible with BerkeleyDB 3.x, which just came out... it shouldn't be hard to retrofit. Anyway, the limits are based on page size... 512b page: 2TB 64K page: 256TB It uses 32bit numbers for pages, so I assume that is also a reflection of the number of keys allowed... 
given I believe one key must use a minimum of one page. I know that I've pushed earlier releases to around 50Gb without trouble, but you might see issues related to the number of keys. I'd ask Sleepycat directly, as they're amazingly responsive. Chris -- | Christopher Petrilli | petrilli at amber.org From mal at lemburg.com Fri Dec 10 17:37:30 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 10 Dec 1999 17:37:30 +0100 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <385129A2.6FAF4E81@lemburg.com> <199912101629.LAA14274@eric.cnri.reston.va.us> Message-ID: <38512C4A.ADB63C2B@lemburg.com> Guido van Rossum wrote: > > > How about only adding those parts which would be needed to > > at least deflate the ZIP archive contents ? > > Ditto -- still lots of portability issues I bet. Hmm, not sure: zlib is pretty portable. It's the interface changes that can break code, not so much the zlib portability. > > If the ZIP archive format becomes the standard for Python, we'd > > have to ensure that all Python users can read them. Well, at > > least that's what I would expect from a standard format :-) > > There's a simple solution: don't use compression. With current disk > prices it's really not worth it. Let the installer do the > decompression (installers travel across networks where compression > *is* worth it). That's a possibility, right. It would still let us use the many ZIP tools while not adding complexity to the core. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 21 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal at lemburg.com Fri Dec 10 17:43:11 1999 From: mal at lemburg.com (M.-A.
Lemburg) Date: Fri, 10 Dec 1999 17:43:11 +0100 Subject: [Python-Dev] dbm clone with serious specs wanted References: <199912101625.LAA14216@eric.cnri.reston.va.us> Message-ID: <38512D9F.2AE9DC8B@lemburg.com> Guido van Rossum wrote: > > Someone has asked me for a dbm clone that can store 16M keys of 350 > bytes each, and runs on Linux, HPUX, and NT. That's 5.6 Gigabyte in > keys alone! I presume most classic approaches won't cut it since > total file size is typicall limited by the seek system call, internal > data structures and/or file index format to 2Gb (signed longs) or 4Gb > (unsigned longs). > > Does anyone have an idea where to start looking? Would a Python > extension already exist? I'd suggest using a dbm style wrapper around the DB-API and then trying out the many cross-platform databases. IBM DB2 comes to mind... it can certainly handle these sizes given the right hardware. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 21 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From fdrake at acm.org Fri Dec 10 18:35:01 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri, 10 Dec 1999 12:35:01 -0500 (EST) Subject: [Python-Dev] Thankyou for fsync :) In-Reply-To: <199912100203.VAA07410@eric.cnri.reston.va.us> References: <38503BDC.CB91FB29@digicool.com> <199912100203.VAA07410@eric.cnri.reston.va.us> Message-ID: <14417.14789.306365.439782@weyr.cnri.reston.va.us> Guido van Rossum writes: > Since we seem to be on an adding spree, I don't see why not -- as long > as POSIX keeps it available :) fsync() isn't listed in O'Reilly's POSIX book, so it's probably not in the POSIX spec. Neither is the tempnam() function I added in yesterdays spree, though tmpfile() and tmpnam() are. -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives From jim at digicool.com Fri Dec 10 19:37:53 1999 From: jim at digicool.com (Jim Fulton) Date: Fri, 10 Dec 1999 18:37:53 +0000 Subject: [Python-Dev] Thankyou for fsync :) References: <38503BDC.CB91FB29@digicool.com> <199912100203.VAA07410@eric.cnri.reston.va.us> <14417.14789.306365.439782@weyr.cnri.reston.va.us> Message-ID: <38514881.5C124E36@digicool.com> "Fred L. Drake, Jr." wrote: > > Guido van Rossum writes: > > Since we seem to be on an adding spree, I don't see why not -- as long > > as POSIX keeps it available :) > > fsync() isn't listed in O'Reilly's POSIX book, so it's probably not > in the POSIX spec. It's not, it's in XPG3 (sp?), but I wasn't going t bring that up. ;) I'd still like it to stay, where available. :) Jim -- Jim Fulton mailto:jim at digicool.com Technical Director (888) 344-4332 Python Powered! Digital Creations http://www.digicool.com http://www.python.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From fdrake at acm.org Fri Dec 10 19:36:44 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri, 10 Dec 1999 13:36:44 -0500 (EST) Subject: [Python-Dev] Thankyou for fsync :) In-Reply-To: <38514881.5C124E36@digicool.com> References: <38503BDC.CB91FB29@digicool.com> <199912100203.VAA07410@eric.cnri.reston.va.us> <14417.14789.306365.439782@weyr.cnri.reston.va.us> <38514881.5C124E36@digicool.com> Message-ID: <14417.18492.932392.608912@weyr.cnri.reston.va.us> Jim Fulton writes: > It's not, it's in XPG3 (sp?), but I wasn't going t bring that up. ;) I don't have that one, but I certainly don't have any plans on ripping out fsync(). Not today, at any rate. ;) -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives From jim at interet.com Fri Dec 10 19:37:50 1999 From: jim at interet.com (James C. Ahlstrom) Date: Fri, 10 Dec 1999 13:37:50 -0500 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) References: <19991210135111.2F83C370CF2@snelboot.oratrix.nl> Message-ID: <3851487E.F610BE17@interet.com> Jack Jansen wrote: > > Is it possible nowadays to have two files with the same name but different > paths (i.e. foo/bar.py and foo/spam/bar.py) in the same archive? Yes, I just made one with WinZip. JimA From gmcm at hypernet.com Fri Dec 10 19:41:56 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Fri, 10 Dec 1999 13:41:56 -0500 Subject: [Python-Dev] Thankyou for fsync :) In-Reply-To: <38514881.5C124E36@digicool.com> Message-ID: <1267271840-1299809@hypernet.com> Fred L. Drake, Jr. wrote: > > Guido van Rossum writes: > > Since we seem to be on an adding spree, I don't see why not > > -- as long as POSIX keeps it available :) > > fsync() isn't listed in O'Reilly's POSIX book, so it's > probably not > in the POSIX spec. > It's in the other O'Reilly POSIX book, p 348 of POSIX.4. - Gordon From fdrake at acm.org Fri Dec 10 19:43:56 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri, 10 Dec 1999 13:43:56 -0500 (EST) Subject: [Python-Dev] Thankyou for fsync :) In-Reply-To: <1267271840-1299809@hypernet.com> References: <38514881.5C124E36@digicool.com> <1267271840-1299809@hypernet.com> Message-ID: <14417.18924.461115.906914@weyr.cnri.reston.va.us> Gordon McMillan writes: > It's in the other O'Reilly POSIX book, p 348 of POSIX.4. Ah, I don't have that either. I thought POSIX.4 was real-time stuff. (If anyone wants to send a copy along, I'd be glad to consider adding reasonable interfaces for Python. ;) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From jim at interet.com Fri Dec 10 19:43:18 1999 From: jim at interet.com (James C. 
Ahlstrom) Date: Fri, 10 Dec 1999 13:43:18 -0500 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> Message-ID: <385149C6.DF942F36@interet.com> Jack Jansen wrote: > When people suggested to use zip format as the standard Python archive format > I was a bit worried, because I've had it happen to me various times that I was > unable to create a ZIP archive with two files with the same name but different > paths (i.e. create an archive of a directory that contains both a foo/bar.py > and a foo/spam/bar.py). No problem. But most zip tools will create an archive with either no path (file name is "bar.py") or full path (file name is "foo/bar.py"). If the paths are different it is OK; I am not sure about duplicate bare names. The difference is an option and has nothing to do with how the file name is specified to the utility. JimA From jim at interet.com Fri Dec 10 19:48:47 1999 From: jim at interet.com (James C. Ahlstrom) Date: Fri, 10 Dec 1999 13:48:47 -0500 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <385129A2.6FAF4E81@lemburg.com> Message-ID: <38514B0F.84A546C6@interet.com> "M.-A. Lemburg" wrote: > How about only adding those parts which would be needed to > at least deflate the ZIP archive contents ? > > If the ZIP archive format becomes the standard for Python, we'd > have to ensure that all Python users can read them. Well, at > least that's what I would expect from a standard format :-) I think that for now we will need to create archives with compression method zero: no compression. That is a valid compression method all ZIP utilities support. The point is that zlib just isn't part of Python.
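A minimal sketch of the two points above, assuming the zipfile module found in today's standard library (which may differ from the module discussed in this thread): two members share the base name bar.py under different paths, and both are written with compression method zero (ZIP_STORED), so zlib is not needed. The member names and contents are made up for illustration.

```python
import io
import zipfile

# Build the archive in memory; nothing here depends on zlib.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    # Same base name, different paths (hypothetical members).
    zf.writestr("foo/bar.py", "print('bar')\n")
    zf.writestr("foo/spam/bar.py", "print('spam.bar')\n")

# Read the archive back and inspect what was recorded.
with zipfile.ZipFile(buf) as zf:
    names = zf.namelist()
    methods = [info.compress_type for info in zf.infolist()]

print(names)
print(methods)
```

Both members keep their full relative paths, and each records ZIP_STORED as its compression method.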
Jim From jcw at equi4.com Fri Dec 10 19:57:00 1999 From: jcw at equi4.com (Jean-Claude Wippler) Date: Fri, 10 Dec 1999 19:57:00 +0100 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <385129A2.6FAF4E81@lemburg.com> <38514B0F.84A546C6@interet.com> Message-ID: <38514CFC.47C8A8E0@equi4.com> "James C. Ahlstrom" wrote: [...] > I think that for now we will need to create archives with > compression method zero: no compression. That is a valid > compression method all ZIP utilities support. Sounds good. This is also exactly how Java started out with jar. -jcw From gmcm at hypernet.com Fri Dec 10 20:06:59 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Fri, 10 Dec 1999 14:06:59 -0500 Subject: [Python-Dev] Thankyou for fsync :) In-Reply-To: <14417.18924.461115.906914@weyr.cnri.reston.va.us> References: <1267271840-1299809@hypernet.com> Message-ID: <1267270337-1390160@hypernet.com> Fred wrote: > Gordon McMillan writes: > > It's in the other O'Reilly POSIX book, p 348 of POSIX.4. > > Ah, I don't have that either. I thought POSIX.4 was real-time > stuff. Well, it says it is, but having done some stuff with automated warehouses, I'm always amazed at how people will use the term "real-time". I'd say "pretty likely to be responsive" ;-). > (If anyone wants to send a copy along, I'd be glad to consider > adding reasonable interfaces for Python. ;) Only around 70 documented functions, but many of them appear to be tweaks, or redocumenting stuff in view of new kernel behaviors. - Gordon From fdrake at acm.org Fri Dec 10 20:18:16 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Fri, 10 Dec 1999 14:18:16 -0500 (EST) Subject: [Python-Dev] Thankyou for fsync :) In-Reply-To: <1267270337-1390160@hypernet.com> References: <1267271840-1299809@hypernet.com> <1267270337-1390160@hypernet.com> Message-ID: <14417.20984.151867.630871@weyr.cnri.reston.va.us> Gordon McMillan writes: > Well, it says it is, but having done some stuff with automated > warehouses, I'm always amazed at how people will use the > term "real-time". I'd say "pretty likely to be responsive" ;-). Oh, a manager's interpretation of real-time: "I want this by close of business next Wednesday!" > Only around 70 documented functions, but many of them > appear to be tweaks, or redocumenting stuff in view of new > kernel behaviors. Anything that should be added anywhere? Failing all else, I can probably read the man pages if I know what to look for. ;-) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From fdrake at acm.org Fri Dec 10 22:40:29 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri, 10 Dec 1999 16:40:29 -0500 (EST) Subject: [Python-Dev] Missing POSIX functions: the list In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us> References: <199912091632.LAA09236@amarok.cnri.reston.va.us> Message-ID: <14417.29517.238124.767279@weyr.cnri.reston.va.us> Andrew M. Kuchling writes: > fpathconf(fd, name) -- Get configuration limit for a file ... > pathconf(path, name) -- Gets config variables for a path ... > sysconf(int name) -- Gets system configuration information > -- would need constants from unistd.h I'm almost done with these, and also confstr (from POSIX.2). I don't have time to finish them today; I'll check them in next week. -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives From skip at mojam.com Sat Dec 11 00:20:21 1999 From: skip at mojam.com (Skip Montanaro) Date: Fri, 10 Dec 1999 17:20:21 -0600 (CST) Subject: [Python-Dev] Thankyou for fsync :) In-Reply-To: <14417.18924.461115.906914@weyr.cnri.reston.va.us> References: <38514881.5C124E36@digicool.com> <1267271840-1299809@hypernet.com> <14417.18924.461115.906914@weyr.cnri.reston.va.us> Message-ID: <14417.35509.284749.924066@dolphin.mojam.com> Fred> I thought POSIX.4 was real-time stuff. This all seems to be happening in real-time to me... ;-) Skip Montanaro | http://www.mojam.com/ skip at mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From andy at robanal.demon.co.uk Sat Dec 11 01:11:28 1999 From: andy at robanal.demon.co.uk (Andy Robinson) Date: Sat, 11 Dec 1999 00:11:28 GMT Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: <199912101619.LAA14174@eric.cnri.reston.va.us> References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> Message-ID: <38519531.15439641@post.demon.co.uk> On Fri, 10 Dec 1999 11:19:47 -0500, you wrote: >> There were issues with zlib 1.0.4 and later ones. Also, many >> Linux distributions don't have the zlib header files installed. > >Hm. I don't recall having any problems reported to me. I'd rather >not include the entire zlib distri in the Python distri -- zlib >is rather big. Adding only the Unix source would be cheating. > Minor data point on the importance of zlib. I spent a long time figuring out what Adobe PDF's "flate filter" was before I discovered it was the inverse of "deflate" (yes, there were loud sounds of head-slapping when I clicked) and discovered that zlib.compress() was EXACTLY what you need to create compressed streams in PDF documents. 
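As a rough illustration of the round trip mentioned here: zlib.compress() produces a zlib-format deflate stream and zlib.decompress() inverts it. The sample content bytes are invented, and the surrounding PDF object and stream dictionary framing is left out, so this is only a sketch of the compression step.

```python
import zlib

# Hypothetical page content; repeated so the compression is visible.
raw = b"BT /F1 12 Tf 72 720 Td (Hello) Tj ET\n" * 20

packed = zlib.compress(raw)          # the "deflate" direction
restored = zlib.decompress(packed)   # the "flate" (inflate) direction

print(len(raw), len(packed))
```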
Being a Windows person, I naively assumed zlib was in the standard distribution everywhere, and subsequently discovered Mac and Unix users were not so happy. So if you want to make PDFs, having zlib around is very useful indeed... - Andy From akuchlin at mems-exchange.org Sat Dec 11 01:35:58 1999 From: akuchlin at mems-exchange.org (Andrew M. Kuchling) Date: Fri, 10 Dec 1999 19:35:58 -0500 (EST) Subject: [Python-Dev] Enabling more modules by default In-Reply-To: <38519531.15439641@post.demon.co.uk> References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <38519531.15439641@post.demon.co.uk> Message-ID: <14417.40046.850655.491684@amarok.cnri.reston.va.us> Andy Robinson writes: >... So if you want to make PDFs, having zlib >around is very useful indeed... This raises a good point, though I still dislike the idea of including the zlib library. It would be nice if Setup.in would be autogenerated to compile all the modules it can -- bsddb if it finds libdb, zlib if it finds libz.a. I vaguely recall once working on a Python script that would generate a customized Setup.in file, though I can't find it at the moment. Given that someone has already suggested automatically enabling threads on those platforms that support it, why not go all the way? (But a Python script that generates a Setup.in isn't going to work, unless we compile a minipython first and then create a more complete Setup file.) -- A.M. Kuchling http://starship.python.net/crew/amk/ The most merciful thing in the world... is the inability of the human mind to correlate all its contents. -- H.P. 
Lovecraft From petrilli at amber.org Sat Dec 11 06:54:41 1999 From: petrilli at amber.org (Christopher Petrilli) Date: Sat, 11 Dec 1999 00:54:41 -0500 Subject: [Python-Dev] Enabling more modules by default In-Reply-To: <14417.40046.850655.491684@amarok.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Fri, Dec 10, 1999 at 07:35:58PM -0500 References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <38519531.15439641@post.demon.co.uk> <14417.40046.850655.491684@amarok.cnri.reston.va.us> Message-ID: <19991211005441.A20923@trump.amber.org> Andrew M. Kuchling [akuchlin at mems-exchange.org] wrote: > Andy Robinson writes: > >... So if you want to make PDFs, having zlib > >around is very useful indeed... > > This raises a good point, though I still dislike the idea of including > the zlib library. It would be nice if Setup.in would be autogenerated > to compile all the modules it can -- bsddb if it finds libdb, zlib if > it finds libz.a. I vaguely recall once working on a Python script that > would generate a customized Setup.in file, though I can't find it at > the moment. Given that someone has already suggested automatically > enabling threads on those platforms that support it, why not go all > the way? Well, one warning about BSDdb is that it comes in 3 incarnations that all might be -ldb :-): 1.85, 2.x, and 3.x, and they are NOT compatible with each other. 1.85 has serious brain damage, and honestly I'd love to see Robin Dunn's 2.x work rolled in to replace it, but I'm not sure how viable that is---people might actually want the 1.85 breakage.
Chris -- | Christopher Petrilli | petrilli at amber.org From gstein at lyra.org Sat Dec 11 12:23:30 1999 From: gstein at lyra.org (Greg Stein) Date: Sat, 11 Dec 1999 03:23:30 -0800 (PST) Subject: [Python-Dev] Zip format In-Reply-To: <1267287023-386248@hypernet.com> Message-ID: On Fri, 10 Dec 1999, Gordon McMillan wrote: >... > If the user imports foo.spam.bar, an importer will be asked for: > foo (return foo.__init__) > foo.spam (return foo.bar.__init__) ^^^ foo.spam.__init__ > foo.spam.bar (return foo.spam.bar) The above sequence is what currently happens. > But the API allows lots of variations. This is another possible > interaction: > foo (return None) > foo.__init__ (return foo.__init__) > foo.spam (return None) > foo.bar.__init__ (return foo.bar.__init__) > foo.spam.bar (return foo.spam.bar) The core of imputil has no knowledge of the __init__ thingy. That is specific to the filesystem-based stuff. So in this sense, "possible" means "imputil could be changed to do this". I would argue against the change, however :-) > Or, by looking at different args to get_code, you could look at > the requests as: > foo in context of None > spam in context of foo > bar in context of foo.spam Bing! Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein at lyra.org Sat Dec 11 12:26:59 1999 From: gstein at lyra.org (Greg Stein) Date: Sat, 11 Dec 1999 03:26:59 -0800 (PST) Subject: [Python-Dev] Zip format In-Reply-To: <14417.11137.562474.99270@amarok.cnri.reston.va.us> Message-ID: On Fri, 10 Dec 1999, Andrew M. Kuchling wrote: > M.-A. Lemburg writes: > >There were issues with zlib 1.0.4 and later ones. Also, many > >Linux distributions don't have the zlib header files installed. > > For example, on RH6.0, zlib.h and zlib.a are in zlib-devel.XXX.rpm, > and zlib.XXX.rpm only contains libz.so. On the other hand, anyone > who's compiling Python should really have the various -devel RPMs Exactly. The distro's *have* the headers -- it all depends on what you installed. 
I happen to have the headers on my system (because I installed zlib-devel, as AMK mentions). > installed. I'd argue against including it, because it might cause odd > versioning problems. For example, what if I have PIL compiled against > zlib1.1.2 (zlib is used for writing PNGs) and the Python binary > includes zlib1.1.3? There might be hard-to-debug problems > caused by calling the wrong symbol. I totally agree. >... > Just received Guido's email suggesting skipping compression in > archives; not a bad idea. You'd use less CPU, but might do > more I/O because you're reading more sectors off disk. There > probably isn't much need for compression when the archive is on-disk; > Java needed it because of applets. There are all kinds of things that we can do here. Consider mmap'ing the archive into a shared memory segment, used by all the Python processes on the system... woo! :-) IMO, the standard distro can use zip files, and just bail if they are compressed, but Python cannot load zlib. Obvious failure with an obvious remedy. No big deal. As Guido also mentions, an installer can just bring along zlib if they want to use a compressed archive. i.e. their choice. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein at lyra.org Sat Dec 11 12:33:47 1999 From: gstein at lyra.org (Greg Stein) Date: Sat, 11 Dec 1999 03:33:47 -0800 (PST) Subject: [Python-Dev] Missing POSIX functions: the list In-Reply-To: <14417.7909.511437.230915@weyr.cnri.reston.va.us> Message-ID: On Fri, 10 Dec 1999, Fred L. Drake, Jr. wrote: > Greg Stein writes: > > No need to do a new type. Just wrap the DIR* into a PyCObject. Add a magic > > number if you're worried about mixing CObjects. > > That's certainly one option, but I would have made readdir(), > seekdir(), rewinddir() and closedir() into the methods read(), seek(), > rewind() and close(). 
So it's a question of what interface you > prefer; functions with magically interpreted token parameters (kind of > like file descriptors, hey!), or something that is more recognizably > object-oriented. > I know my preference. ;-) Well, I know my preference of those two alternatives, too :-), but if we're going with the Pythonic minimalism, then I'd think you would expose the functions "as close as possible." Would I argue if you went with a method-based approach? No :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From fredrik at pythonware.com Sat Dec 11 14:07:08 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat, 11 Dec 1999 14:07:08 +0100 Subject: [Python-Dev] Zip format References: Message-ID: <005f01bf43d8$a22b0af0$f29b12c2@secret.pythonware.com> Greg Stein wrote: > There are all kinds of things that we can do here. Consider mmap'ing the > archive into a shared memory segment, used by all the Python processes on > the system... woo! :-) it doesn't really look like this, but I hope we're defining interfaces here, and not just "one true solution". I'd be very annoyed if it turned out that we couldn't use works' archives with the new standard importer... > As Guido also mentions, an installer can just bring along zlib if they > want to use a compressed archive. i.e. their choice. in the pythonworks universe, the installer and the application is the same thing... From fredrik at pythonware.com Sat Dec 11 14:12:12 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat, 11 Dec 1999 14:12:12 +0100 Subject: [Python-Dev] Thankyou for fsync :) References: <38503BDC.CB91FB29@digicool.com><199912100203.VAA07410@eric.cnri.reston.va.us> <14417.14789.306365.439782@weyr.cnri.reston.va.us> Message-ID: <006c01bf43d9$57bc0f90$f29b12c2@secret.pythonware.com> Fred L. Drake, Jr. wrote: > fsync() isn't listed in O'Reilly's POSIX book, so it's probably not > in the POSIX spec. 
Neither is the tempnam() function I added in > yesterdays spree, though tmpfile() and tmpnam() are. instead of guessing, you can get a complete list from: http://www.unix-systems.org/apis.html reading up on the "single unix specification" should also help: http://www.unix-systems.org/online.html (registration required; contains complete man pages for all functions covered by the UNIX95 and UNIX98 specification) From gstein at lyra.org Sat Dec 11 14:10:00 1999 From: gstein at lyra.org (Greg Stein) Date: Sat, 11 Dec 1999 05:10:00 -0800 (PST) Subject: [Python-Dev] Zip format In-Reply-To: <005f01bf43d8$a22b0af0$f29b12c2@secret.pythonware.com> Message-ID: On Sat, 11 Dec 1999, Fredrik Lundh wrote: > Greg Stein wrote: > > There are all kinds of things that we can do here. Consider mmap'ing the > > archive into a shared memory segment, used by all the Python processes on > > the system... woo! :-) > > it doesn't really look like this, but I hope we're defining > interfaces here, and not just "one true solution". I'd be Oh, I was just having fun there :-). I don't see "one true solution" at all. Just some standards. > very annoyed if it turned out that we couldn't use works' > archives with the new standard importer... get_code() and its processing is not going anywhere. Some stuff will change under the covers, and we'll be using sys.path (typically) rather than chaining (although chaining will still exist!). I would think that your Importer subclass would be directly usable, but the installation could/would be a bit different. Heck, worst case, nothing is going to invalidate your archive format -- feel free to berate me if I ever break that! Cheers, -g -- Greg Stein, http://www.lyra.org/ From jim at interet.com Mon Dec 13 15:50:11 1999 From: jim at interet.com (James C. 
Ahlstrom) Date: Mon, 13 Dec 1999 09:50:11 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <38510254.ED15D32B@interet.com> Message-ID: <385507A3.9F6AAF0F@interet.com> > Jean-Claude Wippler wrote: > > > Ouch - what's wrong with zip archives? > > > With all due respect - I sincerely hope you will reconsider and alter > > your code to work with zip files. It's probably a small adjustment? OK, I now have a new module "zipfile" which reads and writes ZIP files. It is written in Python and has been tested on Windows and Linux. I tested it with WinZip and found that the files it creates are read OK with WinZip, and WinZip files are read OK with zipfile. So I am withdrawing my Python archive file format, and re-writing all my stuff using zipfile. It should all be done in a week. Basically everything works fine. But there are some problems. Python seems to lack a CRC-32 function, so I wrote one in Python. It is slow. We need to add a CRC-32 function to some Python built-in module that is always present, like md5 or binascii. The zlib module is not necessarily present. I can't seem to get WinZip to record a partial path. That is, I want the ./Lib/test package to have these ZIP paths:
    test/__init__.pyc
    test/testall.pyc
    ...
but WinZip creates files with either no path at all or the fully specified path. Am I missing something? Do all other ZIP tools do this too?
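A hedged sketch of both concerns, written against present-day Python rather than the module described in this message: the arcname argument of ZipFile.write() records a partial path regardless of where the file sits on disk, and the CRC stored for a member can be cross-checked with binascii.crc32(), which exists in current Python but was only being proposed in this thread. The directory layout and the payload bytes are placeholders.

```python
import binascii
import os
import tempfile
import zipfile

payload = b"placeholder bytes, not real bytecode"

with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout mirroring ./Lib/test from the message.
    pkg_dir = os.path.join(root, "Lib", "test")
    os.makedirs(pkg_dir)
    src = os.path.join(pkg_dir, "__init__.pyc")
    with open(src, "wb") as f:
        f.write(payload)

    archive = os.path.join(root, "lib.zip")
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_STORED) as zf:
        # arcname sets the path recorded in the archive.
        zf.write(src, arcname="test/__init__.pyc")

    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
        stored_crc = zf.getinfo("test/__init__.pyc").CRC

# CRC-32 of the uncompressed data, masked to an unsigned 32-bit value.
computed_crc = binascii.crc32(payload) & 0xFFFFFFFF
print(names, hex(stored_crc), hex(computed_crc))
```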
JimA From: Guido van Rossum Date: Mon, 13 Dec 1999 10:22:12 -0500 To: "James C. Ahlstrom" Subject: Re: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Mon, 13 Dec 1999 09:50:11 EST." <385507A3.9F6AAF0F@interet.com> References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <38510254.ED15D32B@interet.com> <385507A3.9F6AAF0F@interet.com> Message-ID: <199912131522.KAA18858@eric.cnri.reston.va.us> > OK, I now have a new module "zipfile" which reads and > writes ZIP files. It is written in Python and has been tested > on Windows and Linux.
I tested it with WinZip and found that > the files it creates are read OK with WinZip, and WinZip > files are read OK with zipfile. So I am withdrawing my > Python archive file format, and re-writing all my stuff > using zipfile. It should all be done in a week. Ah, good! (This saves me the trouble of cleaning up our own zip code :-) > Basically everything works fine. But there are some problems. > > Python seems to lack a CRC-32 function, so I wrote one > in Python. It is slow. We need to add a CRC-32 function > to some Python built-in module that it always present, like > md5 or binascci. The zlib module is not necessarily present. > > I can't seem to get WinZip to record a partial path. That is, > I want the ./Lib/test package to have these ZIP paths: > test/__init__.pyc > test/testall.pyc > ... > but WinZip creates files with either no path at all or the > fully specified path. Am I missing something? Do all > other ZIP tools do this too? Unclick the "Save Extra Folder Info" and then drag the *parent* folder into the archive. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake at acm.org Mon Dec 13 18:00:26 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Mon, 13 Dec 1999 12:00:26 -0500 (EST) Subject: [Python-Dev] confstr(), fpathconf(), pathconf(), sysconf() Message-ID: <14421.9770.623399.673010@weyr.cnri.reston.va.us> I've just checked in bindings for these POSIX.1 and POSIX.2 functions, and thought I'd explain the interfaces for those who don't want to read the diffs. ;) These functions expect a "name" parameter (that's how it's described in the man pages and the O'Reilly book). The value for "name" is an integer that's defined in the system headers. The constants all have the form _XX_SOME_NAME where XX is PC for fpathconf()- and pathconf()-related names, SC for sysconf()-related names, and CS for confstr()-related names. 
Some names are defined by the standards, but additional names are defined by implementations (there are a *lot* of sysconf() names under Solaris!). We don't want to expose enormous numbers of constants in the module's interface, however, as there are already a lot of names in the posix module. That would also slow down module initialization. We also don't want to force callers to use magic numbers in code that uses these functions, especially since the values may be system-specific. The best way to call these functions, then, is to use a *string* that corresponds to the name of the C #define symbol with the leading underscore stripped off. For example, to get the maximum length of the arguments to exec(), you could say:
    num_args = os.sysconf("SC_ARG_MAX")
The string will be mapped to the appropriate numeric value defined in an internal table. If the name isn't defined for the platform, a ValueError will be raised.
    >>> num_args = os.sysconf("FOO_BAR")
    Traceback (innermost last):
      File "<stdin>", line 1, in ?
    ValueError: unrecognized configuration name
To allow retrieval of platform-dependent configuration information, integers can also be passed in. On Solaris, this is equivalent to using "SC_ARG_MAX":
    num_args = os.sysconf(1)
(Ignoring the portability and readability issues, ha!) There are three separate tables used for this: one for confstr(), one for sysconf(), and one shared by fpathconf() and pathconf(). The names used to build the tables come from Linux and Solaris; we can add other names as needed. To add names, I'd need the names to add and how to test for their existence at compile time (#ifdef, etc.). -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From fdrake at acm.org Mon Dec 13 19:35:49 1999 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 13 Dec 1999 13:35:49 -0500 (EST) Subject: [Python-Dev] CVS: python/dist/src/Modules posixmodule.c,2.116,2.117 In-Reply-To: References: <199912131637.LAA17318@weyr.cnri.reston.va.us> Message-ID: <14421.15493.28263.387680@weyr.cnri.reston.va.us> Greg Stein writes: > I'm not very familiar with these APIs, but should you let go of the > interpreter lock when you call them? > (and for the other new funcs) None of these should be doing any I/O as far as I can determine. Whenever I get to getlogin() (which AMK & I decided should be included, based on the specs that /F pointed us to), I will release the interpreter lock for the getlogin_r() variant. I'm not sure I should release it for the non-reentrant getlogin(), however; the specification for getlogin*() pretty much requires that it read from utmp. ;( -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From gstein at lyra.org Mon Dec 13 21:31:22 1999 From: gstein at lyra.org (Greg Stein) Date: Mon, 13 Dec 1999 12:31:22 -0800 (PST) Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <385507A3.9F6AAF0F@interet.com> Message-ID: On Mon, 13 Dec 1999, James C. Ahlstrom wrote: >... > OK, I now have a new module "zipfile" which reads and > writes ZIP files. It is written in Python and has been tested > on Windows and Linux. I tested it with WinZip and found that > the files it creates are read OK with WinZip, and WinZip > files are read OK with zipfile. So I am withdrawing my > Python archive file format, and re-writing all my stuff > using zipfile. It should all be done in a week. Can you post zipfile.py so that people can start reviewing that? >... > Python seems to lack a CRC-32 function, so I wrote one > in Python. It is slow. We need to add a CRC-32 function > to some Python built-in module that is always present, like > md5 or binascii. The zlib module is not necessarily present.
See zlib.crc32() This is interesting, of course, because we have previously stated that zlib (and its compression) is optional. But if we need the CRC-32 function... hehe... Cheers, -g -- Greg Stein, http://www.lyra.org/ From tim_one at email.msn.com Mon Dec 13 23:11:33 1999 From: tim_one at email.msn.com (Tim Peters) Date: Mon, 13 Dec 1999 17:11:33 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <385507A3.9F6AAF0F@interet.com> Message-ID: <000401bf45b7$04edfaa0$96a2143f@tim> [James C. Ahlstrom] > ... > Python seems to lack a CRC-32 function, so I wrote one > in Python. It is slow. We need to add a CRC-32 function > to some Python built-in module that is always present, like > md5 or binascii. The zlib module is not necessarily present. Unfortunately, there are many different CRC functions in common use. None belong in md5; if the intent is to support just zip's version, adding a (say) zipcrc32 function to binascii would be ok; if we expect to support others as well, a new parameterized crc module would be in order. > I can't seem to get WinZip to record a partial path. That is, > I want the ./Lib/test package to have these ZIP paths: > test/__init__.pyc > test/testall.pyc > ... > but WinZip creates files with either no path at all or the > fully specified path. Am I missing something? Do all > other ZIP tools do this too? No, it's a clumsiness unique to WinZip (damn GUIs <0.9 wink>). In the Add dialog box, you need to cd to the *Lib* directory, check the "Save extra folder info" box, and then, e.g.,
1. Put test\*.pyc in the Add Files line, and click Add With Wildcards. Then all test\*.pyc files will be added, with paths test/__init__.pyc etc.
or
2. Put "test\__init__.pyc" "test\testall.pyc" (including the quotes!) in the Add Files line, and click Add.
Since #2 can be unbearable, other useful strategies include:
3. Use #1 (e.g. with dir\*.*) then delete the files you didn't really want.
4. Use #1 repeatedly, cleverly using a number of wildcard patterns that cover the files of interest.
5. Mixtures of #3 and #4.
6. Use a command-line zip tool instead (e.g., pkzip; I think WinZip has an "experimental" cmdline add-on too, but haven't tried it).
From jim at interet.com Tue Dec 14 14:13:03 1999 From: jim at interet.com (James C. Ahlstrom) Date: Tue, 14 Dec 1999 08:13:03 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: Message-ID: <3856425F.8C5E7A42@interet.com> Greg Stein wrote: > > Can you post zipfile.py so that people can start reviewing that? Yes, it will be available by next Monday. I just want to get it really working and pretty, and with documentation. JimA From jim at interet.com Tue Dec 14 14:26:50 1999 From: jim at interet.com (James C. Ahlstrom) Date: Tue, 14 Dec 1999 08:26:50 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000401bf45b7$04edfaa0$96a2143f@tim> Message-ID: <3856459A.BF5A798A@interet.com> Tim Peters wrote: > > [James C. Ahlstrom] > > ... > > Python seems to lack a CRC-32 function, so I wrote one > > Unfortunately, there are many different CRC functions in common use. None > belong in md5; if the intent is to support just zip's version, adding a > (say) zipcrc32 function to binascii would be ok; if we expect to support > others as well, a new parameterized crc module would be in order. OK, a CRC-32 in binascii it is. The CRC-32 I have comes with these comments which seem to indicate it is a more "official standard" CRC-32 than average: # * Crc - 32 BIT ANSI X3.66 CRC checksum files #*********************************************************************\ #* *| #* Demonstration program to compute the 32-bit CRC used as the frame *| #* check sequence in ADCCP (ANSI X3.66, also known as FIPS PUB 71 *| #* and FED-STD-1003, the U.S. versions of CCITT's X.25 link-level *| #* protocol).
The 32-bit FCS was added via the Federal Register, *| #* 1 June 1982, p.23798. I presume but don't know for certain that *| #* this polynomial is or will be included in CCITT V.41, which *| #* defines the 16-bit CRC (often called CRC-CCITT) polynomial. FIPS *| #* PUB 78 says that the 32-bit FCS reduces otherwise undetected *| #* errors by a factor of 10^-5 over 16-bit FCS. *| #* *| #********************************************************************* #* Copyright (C) 1986 Gary S. Brown. You may use this program, or #* code or tables extracted from it, as desired without restriction. I can submit this as a patch to binascii, or if the Copyright bothers anyone, maybe it is better for Guido to use his CRC-32 from his ZIP code. Preference? > > I can't seem to get WinZip to record a partial path. That is, > > dialog box, you need to cd to the *Lib* directory, check the "Save extra > folder info" box, and then, e.g., Thanks. I knew there had to be some magic incantation to do it. > 6. Use a comand-line zip tool instead (e.g., pkzip; I think WinZip has > an "experimental" cmdline add-on too, but haven't tried it). Actually pkzip 2.04g doesn't work because it writes names in upper case and is limited to 8.3 names (I think). My zipfile.py can be used as a basis for a command line tool. Actually I use makefiles with imbedded Python programs and find this easier than command line tools. JimA From guido at CNRI.Reston.VA.US Tue Dec 14 15:53:04 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 09:53:04 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Tue, 14 Dec 1999 08:26:50 EST." <3856459A.BF5A798A@interet.com> References: <000401bf45b7$04edfaa0$96a2143f@tim> <3856459A.BF5A798A@interet.com> Message-ID: <199912141453.JAA23429@eric.cnri.reston.va.us> > OK, a CRC-32 in binascii it is. 
The CRC-32 I > have comes with these comments which seem to indicate it is a > more "official standard" CRC-32 than average: > > # * Crc - 32 BIT ANSI X3.66 CRC checksum files > #*********************************************************************\ > #* *| > #* Demonstration program to compute the 32-bit CRC used as the frame *| > #* check sequence in ADCCP (ANSI X3.66, also known as FIPS PUB 71 *| > #* and FED-STD-1003, the U.S. versions of CCITT's X.25 link-level *| > #* protocol). The 32-bit FCS was added via the Federal Register, *| > #* 1 June 1982, p.23798. I presume but don't know for certain that *| > #* this polynomial is or will be included in CCITT V.41, which *| > #* defines the 16-bit CRC (often called CRC-CCITT) polynomial. FIPS *| > #* PUB 78 says that the 32-bit FCS reduces otherwise undetected *| > #* errors by a factor of 10^-5 over 16-bit FCS. *| > #* *| > #********************************************************************* > #* Copyright (C) 1986 Gary S. Brown. You may use this program, or > #* code or tables extracted from it, as desired without restriction. > > I can submit this as a patch to binascii, or if the Copyright bothers > anyone, maybe it is better for Guido to use his CRC-32 from his ZIP > code. Preference? I looked, but "my" crc32 in the zlib module (which was actually contributed by Andrew Kuchling) is just a wrapper around the crc32 function in zlib, which is copyrighted by Mark Adler and follows the zlib rules. I propose to use Gary Brown's code. I'll defend this to CNRI's lawyers if need be. Jim, have you checked that this is the right CRC to use for zip's CRC? (This in the light of Tim's assertion that there are many CRCs around.) --Guido van Rossum (home page: http://www.python.org/~guido/) From jim at interet.com Tue Dec 14 16:22:56 1999 From: jim at interet.com (James C. 
Ahlstrom) Date: Tue, 14 Dec 1999 10:22:56 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000401bf45b7$04edfaa0$96a2143f@tim> <3856459A.BF5A798A@interet.com> <199912141453.JAA23429@eric.cnri.reston.va.us> Message-ID: <385660D0.C6C0C7B9@interet.com> Guido van Rossum wrote: > I propose to use Gary Brown's code. I'll defend this to CNRI's > lawyers if need be. > > Jim, have you checked that this is the right CRC to use for zip's CRC? > (This in the light of Tim's assertion that there are many CRCs around.) The CRC it calculates agrees with the CRC of WinZip for all files I have tried. The original Gary Brown code was much longer and included file reading. Here is the shortened version: JimA # * Crc - 32 BIT ANSI X3.66 CRC checksum files #*********************************************************************\ #* *| #* Demonstration program to compute the 32-bit CRC used as the frame *| #* check sequence in ADCCP (ANSI X3.66, also known as FIPS PUB 71 *| #* and FED-STD-1003, the U.S. versions of CCITT's X.25 link-level *| #* protocol). The 32-bit FCS was added via the Federal Register, *| #* 1 June 1982, p.23798. I presume but don't know for certain that *| #* this polynomial is or will be included in CCITT V.41, which *| #* defines the 16-bit CRC (often called CRC-CCITT) polynomial. FIPS *| #* PUB 78 says that the 32-bit FCS reduces otherwise undetected *| #* errors by a factor of 10^-5 over 16-bit FCS. *| #* *| #********************************************************************* # #* Copyright (C) 1986 Gary S. Brown. You may use this program, or #* code or tables extracted from it, as desired without restriction. # First, the polynomial itself and its table of feedback terms. The # polynomial is # X^32+X^26+X^23+X^22+X^16+X^12+X^11+X^10+X^8+X^7+X^5+X^4+X^2+X^1+X^0 # Note that we take it "backwards" and put the highest-order term in # the lowest-order bit. The X^32 term is "implied"; the LSB is the # X^31 term, etc. 
The X^0 term (usually shown as "+1") results in
# the MSB being 1.

# Note that the usual hardware shift register implementation, which
# is what we're using (we're merely optimizing it by doing eight-bit
# chunks at a time) shifts bits into the lowest-order term. In our
# implementation, that means shifting towards the right. Why do we
# do it this way? Because the calculated CRC must be transmitted in
# order from highest-order term to lowest-order term. UARTs transmit
# characters in order from LSB to MSB. By storing the CRC this way,
# we hand it to the UART in the order low-byte to high-byte; the UART
# sends each low-bit to high-bit; and the result is transmission bit
# by bit from highest- to lowest-order term without requiring any bit
# shuffling on our part. Reception works similarly.

# The feedback terms table consists of 256, 32-bit entries. Notes:
#
#  1. The table can be generated at runtime if desired; code to do so
#     is shown later. It might not be obvious, but the feedback
#     terms simply represent the results of eight shift/xor
#     operations for all combinations of data and CRC register values.
#
#  2. The CRC accumulation logic is the same for all CRC polynomials,
#     be they sixteen or thirty-two bits wide. You simply choose the
#     appropriate table. Alternatively, because the table can be
#     generated at runtime, you can start by generating the table for
#     the polynomial in question and use exactly the same "updcrc",
#     if your application needn't simultaneously handle two CRC
#     polynomials. (Note, however, that XMODEM is strange.)
#
#  3. For 16-bit CRCs, the table entries need be only 16 bits wide;
#     of course, 32-bit entries work OK if the high 16 bits are zero.
#
#  4. The values must be right-shifted by eight bits by the "updcrc"
#     logic; the shift must be unsigned (bring in zeroes). On some
#     hardware you could probably optimize the shift in assembler by
#     using byte-swap instructions.

# Converted to Python by James C.
Ahlstrom crc_32_tab = [ # CRC polynomial 0xedb88320 0x00000000, 0x77073096, 0xee0e612c, 0x990951ba, 0x076dc419, 0x706af48f, 0xe963a535, 0x9e6495a3, 0x0edb8832, 0x79dcb8a4, 0xe0d5e91e, 0x97d2d988, 0x09b64c2b, 0x7eb17cbd, 0xe7b82d07, 0x90bf1d91, 0x1db71064, 0x6ab020f2, 0xf3b97148, 0x84be41de, 0x1adad47d, 0x6ddde4eb, 0xf4d4b551, 0x83d385c7, 0x136c9856, 0x646ba8c0, 0xfd62f97a, 0x8a65c9ec, 0x14015c4f, 0x63066cd9, 0xfa0f3d63, 0x8d080df5, 0x3b6e20c8, 0x4c69105e, 0xd56041e4, 0xa2677172, 0x3c03e4d1, 0x4b04d447, 0xd20d85fd, 0xa50ab56b, 0x35b5a8fa, 0x42b2986c, 0xdbbbc9d6, 0xacbcf940, 0x32d86ce3, 0x45df5c75, 0xdcd60dcf, 0xabd13d59, 0x26d930ac, 0x51de003a, 0xc8d75180, 0xbfd06116, 0x21b4f4b5, 0x56b3c423, 0xcfba9599, 0xb8bda50f, 0x2802b89e, 0x5f058808, 0xc60cd9b2, 0xb10be924, 0x2f6f7c87, 0x58684c11, 0xc1611dab, 0xb6662d3d, 0x76dc4190, 0x01db7106, 0x98d220bc, 0xefd5102a, 0x71b18589, 0x06b6b51f, 0x9fbfe4a5, 0xe8b8d433, 0x7807c9a2, 0x0f00f934, 0x9609a88e, 0xe10e9818, 0x7f6a0dbb, 0x086d3d2d, 0x91646c97, 0xe6635c01, 0x6b6b51f4, 0x1c6c6162, 0x856530d8, 0xf262004e, 0x6c0695ed, 0x1b01a57b, 0x8208f4c1, 0xf50fc457, 0x65b0d9c6, 0x12b7e950, 0x8bbeb8ea, 0xfcb9887c, 0x62dd1ddf, 0x15da2d49, 0x8cd37cf3, 0xfbd44c65, 0x4db26158, 0x3ab551ce, 0xa3bc0074, 0xd4bb30e2, 0x4adfa541, 0x3dd895d7, 0xa4d1c46d, 0xd3d6f4fb, 0x4369e96a, 0x346ed9fc, 0xad678846, 0xda60b8d0, 0x44042d73, 0x33031de5, 0xaa0a4c5f, 0xdd0d7cc9, 0x5005713c, 0x270241aa, 0xbe0b1010, 0xc90c2086, 0x5768b525, 0x206f85b3, 0xb966d409, 0xce61e49f, 0x5edef90e, 0x29d9c998, 0xb0d09822, 0xc7d7a8b4, 0x59b33d17, 0x2eb40d81, 0xb7bd5c3b, 0xc0ba6cad, 0xedb88320, 0x9abfb3b6, 0x03b6e20c, 0x74b1d29a, 0xead54739, 0x9dd277af, 0x04db2615, 0x73dc1683, 0xe3630b12, 0x94643b84, 0x0d6d6a3e, 0x7a6a5aa8, 0xe40ecf0b, 0x9309ff9d, 0x0a00ae27, 0x7d079eb1, 0xf00f9344, 0x8708a3d2, 0x1e01f268, 0x6906c2fe, 0xf762575d, 0x806567cb, 0x196c3671, 0x6e6b06e7, 0xfed41b76, 0x89d32be0, 0x10da7a5a, 0x67dd4acc, 0xf9b9df6f, 0x8ebeeff9, 0x17b7be43, 0x60b08ed5, 0xd6d6a3e8, 0xa1d1937e, 
0x38d8c2c4, 0x4fdff252, 0xd1bb67f1, 0xa6bc5767, 0x3fb506dd, 0x48b2364b, 0xd80d2bda, 0xaf0a1b4c, 0x36034af6, 0x41047a60, 0xdf60efc3, 0xa867df55, 0x316e8eef, 0x4669be79, 0xcb61b38c, 0xbc66831a, 0x256fd2a0, 0x5268e236, 0xcc0c7795, 0xbb0b4703, 0x220216b9, 0x5505262f, 0xc5ba3bbe, 0xb2bd0b28, 0x2bb45a92, 0x5cb36a04, 0xc2d7ffa7, 0xb5d0cf31, 0x2cd99e8b, 0x5bdeae1d, 0x9b64c2b0, 0xec63f226, 0x756aa39c, 0x026d930a, 0x9c0906a9, 0xeb0e363f, 0x72076785, 0x05005713, 0x95bf4a82, 0xe2b87a14, 0x7bb12bae, 0x0cb61b38, 0x92d28e9b, 0xe5d5be0d, 0x7cdcefb7, 0x0bdbdf21, 0x86d3d2d4, 0xf1d4e242, 0x68ddb3f8, 0x1fda836e, 0x81be16cd, 0xf6b9265b, 0x6fb077e1, 0x18b74777, 0x88085ae6, 0xff0f6a70, 0x66063bca, 0x11010b5c, 0x8f659eff, 0xf862ae69, 0x616bffd3, 0x166ccf45, 0xa00ae278, 0xd70dd2ee, 0x4e048354, 0x3903b3c2, 0xa7672661, 0xd06016f7, 0x4969474d, 0x3e6e77db, 0xaed16a4a, 0xd9d65adc, 0x40df0b66, 0x37d83bf0, 0xa9bcae53, 0xdebb9ec5, 0x47b2cf7f, 0x30b5ffe9, 0xbdbdf21c, 0xcabac28a, 0x53b39330, 0x24b4a3a6, 0xbad03605, 0xcdd70693, 0x54de5729, 0x23d967bf, 0xb3667a2e, 0xc4614ab8, 0x5d681b02, 0x2a6f2b94, 0xb40bbe37, 0xc30c8ea1, 0x5a05df1b, 0x2d02ef8d ] def crc32(string): crc = 0xFFFFFFFF for ch in string: crc = crc_32_tab[((crc) ^ ord(ch)) & 0xff] ^ (((crc) >> 8) & 0xFFFFFF) return ~crc From tim_one at email.msn.com Tue Dec 14 18:06:36 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 14 Dec 1999 12:06:36 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <199912141453.JAA23429@eric.cnri.reston.va.us> Message-ID: <000101bf4655$94e40840$3a2d153f@tim> [Guido] > I propose to use Gary Brown's code. I'll defend this to CNRI's > lawyers if need be. If there's a hassle, I can do a clean-room implementation easily enough -- although I'd rather not. > Jim, have you checked that this is the right CRC to use for zip's CRC? If WinZip unzips Jim's files without griping, the odds that he's got the wrong CRC are about 1 in 2**36 . 
> (This in the light of Tim's assertion that there are many CRCs > around.) There are, and several others are hiding in assorted communications stds (e.g., Ethernet uses a different 32-bit CRC); but the zip CRC is the one you'll find most commonly described on the Web. All the same, once Jim releases his code, I'll do an anal verification that it's the right one. From jim at interet.com Tue Dec 14 18:54:35 1999 From: jim at interet.com (James C. Ahlstrom) Date: Tue, 14 Dec 1999 12:54:35 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000101bf4655$94e40840$3a2d153f@tim> Message-ID: <3856845B.6C3C7330@interet.com> Tim Peters wrote: > If WinZip unzips Jim's files without griping, the odds that he's got the > wrong CRC are about 1 in 2**36 . You mean 2**32, right? Oh, sorry, you must be using a DEC-10 . JimA From gstein at lyra.org Tue Dec 14 20:23:36 1999 From: gstein at lyra.org (Greg Stein) Date: Tue, 14 Dec 1999 11:23:36 -0800 (PST) Subject: [Python-Dev] zipfile.py In-Reply-To: <3856425F.8C5E7A42@interet.com> Message-ID: On Tue, 14 Dec 1999, James C. Ahlstrom wrote: > Greg Stein wrote: > > > > > Can you post zipfile.py so that people can starting reviewing that? > > Yes, it will be available by next Monday. I just want to > get it really working and pretty, and with documentation. My point was that people could possibly use it *before* then. Not everybody needs it to be pretty, needs doc, or needs it fully working. Maybe people would like to provide feedback on the API. Maybe they'd like to start their own modules that use your library. This goes back to my years-old statement: release it now rather than later -- people can always use it now, and there might not be a later. Release early. Release often. :-) People are too hesitant to release code. Why? Just send it out there. When you update it, send out another. It doesn't hurt anybody to have more than one release. 
Cheers, -g -- Greg Stein, http://www.lyra.org/ From tim_one at email.msn.com Wed Dec 15 05:20:25 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 14 Dec 1999 23:20:25 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <3856845B.6C3C7330@interet.com> Message-ID: <000501bf46b3$b6184f40$05a0143f@tim> [Tim] > If WinZip unzips Jim's files without griping, the odds that he's > got the wrong CRC are about 1 in 2**36 . [JimA] > You mean 2**32, right? Nope! For each of the 2**32 polynomials you may have pulled out of thin air, there are about a dozen common variations in the details of CRC algorithms. For example, a CRC used for hashing usually initializes "the register" to 0, but a CRC used to protect against transmission errors usually initializes to a block of 1 bits (since leading zeroes don't affect the result, and a common transmission error is dropping a prefix of the msg). Similarly, algorithms vary in the order they scan the data; in whether they use the raw data or its complement; and in whether they return the actual remainder, the complement of the remainder, or a checksum cleverly computed so that "the other end" always sees a fixed remainder other than 0 (or ~0). > Oh, sorry, you must be using a DEC-10 . I used a Univac 1108 in college, back when ASCII was in its infancy. They couldn't decide on the natural size for a character, so the 36-bit 1108 could be configured to treat each word as either 6 6-bit bytes or 4 9-bit ones. If they had been thinking ahead, they would have defined it as two Unicode characters plus a 4-bit tag field for the Python implementation to play with . 
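[A minimal sketch of the CRC variations described above; it is not code
from this thread. It assumes Python 3 and the standard zlib module, and
the names make_table, CRC_TABLE, init and xor_out are illustrative only.
The table is generated at runtime from the bit-reversed polynomial
0xedb88320, which Gary Brown's note 1 says is possible, and the starting
register value and final complement are explicit parameters so the zip
variant can be compared with a hashing-style one.]

```python
import zlib


def make_table(poly=0xEDB88320):
    """Build the 256-entry feedback table for a bit-reversed polynomial."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            # Shift right; xor in the polynomial when a 1 bit falls off.
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table


CRC_TABLE = make_table()


def crc32(data, init=0xFFFFFFFF, xor_out=0xFFFFFFFF):
    """Table-driven CRC over a bytes object.

    The defaults (all-ones start, complemented result) are the zip
    variant; init=0, xor_out=0 is the plain-remainder variant.
    """
    crc = init
    for byte in data:
        crc = CRC_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    # Xor with an explicit 32-bit mask instead of using ~crc, so the
    # result stays in 0..2**32-1 whatever the native integer width.
    return crc ^ xor_out


if __name__ == "__main__":
    sample = b"123456789"
    print(hex(crc32(sample)), hex(zlib.crc32(sample)))
    # With a zero start value, leading zero bytes leave the register
    # untouched; with the all-ones start they change the result.
    print(crc32(b"\0\0abc", 0, 0) == crc32(b"abc", 0, 0))
    print(crc32(b"\0\0abc") == crc32(b"abc"))
```

With the defaults this should agree with zlib.crc32() on any byte
string. Passing init=0 and xor_out=0 gives the variant that cannot see a
dropped prefix of zero bytes, which is the reason given above for
starting the register at all ones when guarding against transmission
errors.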
now-they-make-their-living-suing-.gif-bandits-ly y'rs - tim

From tim_one at email.msn.com Wed Dec 15 08:40:11 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 15 Dec 1999 02:40:11 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <385660D0.C6C0C7B9@interet.com>
Message-ID: <000b01bf46cf$9ebe27e0$05a0143f@tim>

[JimA posts his Python rendering of Gary Brown's code]

Yup! That's the zip algorithm, right down to the absurdly bit-reversed
polynomial.

> def crc32(string):
>     crc = 0xFFFFFFFF
>     for ch in string:
>         crc = crc_32_tab[((crc) ^ ord(ch)) & 0xff] ^ (((crc) >> 8) & 0xFFFFFF)
>     return ~crc

Note that the last line is better (whether in Python or C!) as

    return crc ^ 0xffffffff

Else you'll get a surprising result in a 64-bit Python, and in some 64-bit
C implementations.

it's-a-32-bit-algorithm-not-an-"int"-or-"long"-one-ly y'rs - tim

From fredrik at pythonware.com Wed Dec 15 10:31:29 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 15 Dec 1999 10:31:29 +0100
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000101bf4655$94e40840$3a2d153f@tim>
Message-ID: <002601bf46e0$06e25ca0$f29b12c2@secret.pythonware.com>

> [Guido]
> > I propose to use Gary Brown's code. I'll defend this to CNRI's
> > lawyers if need be.
>
> If there's a hassle, I can do a clean-room implementation easily enough --
> although I'd rather not.

or you can grab the code from PIL, which already
comes with a Python compatible license...

(it's based on ISO 3307, but judging from the table
James posted, it's the same thing...)

From fredrik at pythonware.com Wed Dec 15 10:39:19 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 15 Dec 1999 10:39:19 +0100
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000b01bf46cf$9ebe27e0$05a0143f@tim>
Message-ID: <003001bf46e0$43860b20$f29b12c2@secret.pythonware.com>

Tim Peters wrote:
> Yup!
That's the zip algorithm, right down to the absurdly bit-reversed
> polynomial.

also known as ISO 3307, according to some strange
comments in PIL's sources...

From jim at interet.com Wed Dec 15 16:53:34 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Wed, 15 Dec 1999 10:53:34 -0500
Subject: [Python-Dev] zipfile.py
References:
Message-ID: <3857B97E.3684224F@interet.com>

Greg Stein wrote:
> Release early. Release often. :-)

You are right of course. OK, the zipfile.py code and docs are at:

   ftp://ftp.interet.com/pub/pylib.html

Despite the ftp URL, clicking on it should display the html.

Please don't panic if it seems to be slow. It uses a Python
CRC-32 which is slow. You may want to hack it to use zlib.crc32()
if you have it.

I am testing with WinZip. If you have another zip tool, it
would be interesting to see how compatible it is.

JimA

From guido at CNRI.Reston.VA.US Wed Dec 15 17:38:47 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 15 Dec 1999 11:38:47 -0500
Subject: [Python-Dev] Writers wanted for Linux Journal Python special issue
Message-ID: <199912151638.LAA02522@eric.cnri.reston.va.us>

Linux Journal is preparing a special issue devoted to Python (actually
more like a pullout section or whatever I think). They are looking
for writers, e.g. to write a piece about Python's history and/or an
introduction. And probably anything else Python related.

If you're interested, please write to Marjorie Richardson, who is
coordinating. Also direct any questions to her.

This is for the June issue which will be on newsstands mid-May and
mailed to subscribers even earlier, I believe. The deadline is
February 1st (magazine production takes forever!).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From akuchlin at mems-exchange.org Wed Dec 15 19:17:53 1999
From: akuchlin at mems-exchange.org (Andrew M. Kuchling)
Date: Wed, 15 Dec 1999 13:17:53 -0500 (EST)
Subject: [Python-Dev] fwd.
from Paul Prescod
Message-ID: <14423.56145.877163.395736@amarok.cnri.reston.va.us>

This is a forwarded e-mail from the XML-SIG mailing list, in which
Paul makes some good points. Some context: I've been arguing against
adding more XML stuff to the base Python distribution, because 1) it's
bloat for those people who don't care about XML, and 2) the Distutils
is supposed to fix this by making installing things easier.

Paul's response, below, has shaken my conviction a bit (*only* a bit,
though). If it's deemed valuable, perhaps the XML-SIG could
concentrate on the minimal set of parser + SAX + DOM that could be
included in 1.6. Please join the XML-SIG to follow the specifics of
this thread further, as it relates only to XML.

As a more general philosophical question for python-dev: do we want to
add things to 1.6 following the "batteries included" philosophy? Or
should we wave in the direction of the distutils and say they'll fix
the problem? (In which case they should be given high priority, as in
"1.6 doesn't ship until they're done".)

--
A.M. Kuchling http://starship.python.net/crew/amk/
And after all, why should I go to bed every night? Sleep is only a habit.
    -- Cornelius Van Horne

Paul Prescod writes:
>"Andrew M. Kuchling" wrote:
>>
>> Huh? There's obviously a good deal of stuff in there, some of it
>> perhaps too esoteric, but I don't see where there's overlap.
>
>Well, there are several parsers and parser wrappers. How is a user
>supposed to choose? And there is PyDOM, Minidom and qp_dom.
>
>> Or are
>> you talking about Python tools in general, where there are 3 DOM
>> implementations? (PyDOM, 4DOM, and ZDOM hiding inside Zope.)
>
>That too.
>
>> I lean against shoveling more stuff into 1.6; better to get the
>> Distutils widely used, which makes it easier to install *all* Python
>> extensions.
>
>I don't think that XML is any more of an "add-on" to a modern scripting
>language than URL support or regular expression support.
I'm in the >"batteries included" camp for this and several other reasons: > > * standard Python libraries may soon need XML support. If WebDAV takes >off then there should be a libWebDAV right alongside libftp and libhttp. >And libWebDAV will require XML > > * there is a difference between theory and practice. In theory, >distutils will be done soon and everything will be easy. In practice, it >is the end of 1999 and at every conference I have to install the XML sig >package on the machines of several people who haven't been able to get >it going themselves. In practice, we can't wait for distutils because >people are choosing their XML tools now. > >> >Ideally we would have one (or at most two!) implementation of each of >> >the major specs: >> >XML >SAX >Unicode >XPath >XPointer >XSLT >DOM >> >> Do you mean "one implementation of each in a single package", or "one >> implementation existing for Python, distributed separately"? > >With the possible exception of XSLT, one implementation of each *in >Python 1.6*. > >> We need to come up with a position paper for developer's day, stating >> what needs to be discussed. Suggestions? I'd propose focusing on >> getting the XML-SIG package to 1.0, but that's just an idea. > >I don't see how the XML-SIG package can ever get to 1.0. Anybody can >contribute code at anytime and thus far we've been totally flexible >about putting it in. I think that's great. It just won't ever lead to a >stable, carefully maintained, tightly interoperable package. Some of the >maintainers of the individual pieces have probably lost interest and >there is probably nobody that understands it all enough to integrate it >nicely. > >-- > Paul Prescod - ISOGEN Consulting Engineer speaking for himself > From fdrake at acm.org Wed Dec 15 20:47:01 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Wed, 15 Dec 1999 14:47:01 -0500 (EST)
Subject: [Python-Dev] posix module
Message-ID: <14423.61493.90107.433664@weyr.cnri.reston.va.us>

Ok, I think I'm done with the posix module updates, modulo bugs and
additional symbols for the *conf*() tables. That leaves us with the
following status for interfaces that Andrew brought up in the message
that started this spate of additions:

Worth adding?
=============
    opendir(), readdir(), closedir() -- not added
        The only thing these give us that os.listdir() doesn't is
        the inode numbers. Unless someone actually wants those,
        it's not worth having.

Worth adding:
=============
    abort() -- added
    ctermid(), ctermid_r() -- added
    fpathconf(fd, name) -- added
    getlogin() -- added
    getgroups(gidsetsize, grouplist) -- added
    pathconf(path, name) -- added
    sysconf(int name) -- added; also added confstr(int name)

Not worth adding:
=================
    clearerr() -- not added
    cuserid() -- not added
    difftime -- not added
    tmpfile(), tmpnam() -- added, also tempnam()
    mblen(), mbstowcs(), mbtowc(), wcstombs(), wctomb() -- not added

-Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From jeremy at cnri.reston.va.us Wed Dec 15 20:58:16 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Wed, 15 Dec 1999 14:58:16 -0500 (EST)
Subject: [Python-Dev] zipfile.py
In-Reply-To:
References: <3856425F.8C5E7A42@interet.com>
Message-ID: <14423.62168.576273.719577@goon.cnri.reston.va.us>

>>>>> "GS" == Greg Stein writes:

  GS> On Tue, 14 Dec 1999, James C. Ahlstrom wrote:
  >> Greg Stein wrote: >
  >> >
  >> > Can you post zipfile.py so that people can start reviewing
  >> that?
  >>
  >> Yes, it will be available by next Monday. I just want to get it
  >> really working and pretty, and with documentation.

  GS> My point was that people could possibly use it *before*
  GS> then. Not everybody needs it to be pretty, needs doc, or needs
  GS> it fully working. Maybe people would like to provide feedback
  GS> on the API.
Maybe they'd like to start their own modules that GS> use your library. GS> This goes back to my years-old statement: release it now rather GS> than later -- people can always use it now, and there might not GS> be a later. Ok. I think we need some kind of zip file support in the core so that it can be used as a standard distribution format. I'd be happy if Jim's zipfile module ended up being it. We've got some zip code that we developed at CNRI; it's a bit of a mess, but it might be helpful to see what we did. Our code is at ftp://www.python.org/pub/tmp/zip.zip Jeremy From jim at interet.com Thu Dec 16 16:41:56 1999 From: jim at interet.com (James C. Ahlstrom) Date: Thu, 16 Dec 1999 10:41:56 -0500 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> Message-ID: <38590844.769C3025@interet.com> Did anyone look at this yet? ftp://ftp.interet.com/pub/pylib.html ftp://ftp.interet.com/pub/zipfile.py JimA From skip at mojam.com Thu Dec 16 16:46:28 1999 From: skip at mojam.com (Skip Montanaro) Date: Thu, 16 Dec 1999 09:46:28 -0600 (CST) Subject: [Python-Dev] zipfile.py In-Reply-To: <38590844.769C3025@interet.com> References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> Message-ID: <14425.2388.529932.61119@dolphin.mojam.com> JA> Did anyone look at this yet? JA> ftp://ftp.interet.com/pub/pylib.html JA> ftp://ftp.interet.com/pub/zipfile.py I thought it wasn't supposed to be out until Monday? You're looking for, perhaps, a time machine? ;-) (More seriously, it won't have any effect on my "gotta have this done yesterday" list, so I will let others comment...) Skip From jim at interet.com Thu Dec 16 18:16:21 1999 From: jim at interet.com (James C. Ahlstrom) Date: Thu, 16 Dec 1999 12:16:21 -0500 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> Message-ID: <38591E65.4885A39D@interet.com> "James C. 
Ahlstrom" wrote:
> ftp://ftp.interet.com/pub/pylib.html

I just changed zipfile.py so that regular zip compression
works. And if zlib is available, its crc32() is used instead
of the Python version.

I should mention that the current code rejects zip files which
have an archive comment added to the end. Accepting them would
require a search, and I am not sure it is worth it.

JimA

From fdrake at acm.org Thu Dec 16 18:19:23 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 16 Dec 1999 12:19:23 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121
In-Reply-To:
References: <199912151831.NAA02685@weyr.cnri.reston.va.us>
Message-ID: <14425.7963.347400.763562@weyr.cnri.reston.va.us>

[Note that Greg's message went to python-checkins since he responded
to a checkin message, but I suspect he meant to change the header to
point to python-dev. ;) If not, too bad!]

Greg Stein writes:
> But this means that your tables no longer reside in "const" space. Yet More
> Per-Process Memory...
>
> It would be nice to have those tables marked as "const".

Perhaps; as Guido points out, there haven't been a lot of complaints
about this issue.
I will note that only the tables aren't constant; the strings that
are pointed to are still constant. I'm inclined to let the
compiler/linker care about this, and not change the code without a
really clear need to do so.
Here are the sizes of those tables and the strings they point to
(including terminating null bytes for the strings):

    pathconf_names:   14 entries, 112 bytes,  176 string bytes
    confstr_names:    25 entries, 200 bytes,  576 string bytes
    sysconf_names:   108 entries, 864 bytes, 1774 string bytes

Figures are for Solaris7.

-Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives From gstein at lyra.org Thu Dec 16 19:10:14 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 10:10:14 -0800 (PST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121 In-Reply-To: <14425.7963.347400.763562@weyr.cnri.reston.va.us> Message-ID: On Thu, 16 Dec 1999, Fred L. Drake, Jr. wrote: > [Note that Greg's message went to python-checkins since he responded > to a checkin message, but I suspect he meant to change the header to > point to python-dev. ;) If not, too bad!] I didn't really care too much where it went. I would actually suggest that the Reply-To: on the checkin list is set to python-dev if that is where replies are Supposed To Go. [ I do this with mod_dav checkins; replies to dav-checkins mail goes to dav-dev. ] > Greg Stein writes: > > But this means that your tables no long reside in "const" space. Yet More > > Per-Process Memory... > > > > It would be nice to have those tables marked as "const". > > Perhaps; as Guido points out, there haven't been a lot of complaints > about this issue. > I will note that only the tables aren't constant; the strings that > are pointed to are still constant. I'm inclined to let the compiler/ > linker care about this, and not change the code without a really clear > need to do so. > Here are the sizes of those tables and the strings they point to > (including terminating null bytes for the strings): > > pathconf_names: 14 entries, 112 bytes, 176 string bytes > confstr_names: 25 entries, 200 bytes, 576 string bytes > sysconf_names: 108 entries, 864 bytes, 1774 string bytes > > Figures are for Solaris7. Ah. I just replied to that. Guess that one went to python-checkins :-) True, this is a small amount of memory. But they start to add up. non-const globals also pain me when I start to work on free-threading stuff (each must be examined to see if synchronization is needed), so reducing the number there is important. 
Regarding the memory itself: as I mentioned in the other note, I just want to ensure that Python's working set remains low (reasons given in that email). Cheers, -g -- Greg Stein, http://www.lyra.org/ From skip at mojam.com Thu Dec 16 19:09:11 1999 From: skip at mojam.com (Skip Montanaro) Date: Thu, 16 Dec 1999 12:09:11 -0600 (CST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121 In-Reply-To: References: <199912161553.KAA08428@eric.cnri.reston.va.us> Message-ID: <14425.10951.169751.843764@dolphin.mojam.com> >>>>> "Greg" == Greg Stein writes: Greg> On Thu, 16 Dec 1999, Guido van Rossum wrote: >> I don't think there's much of a need to worry about this. Why are >> you always bringing up this subject? No-one else that I know has >> ever had this concern... Greg> Somebody has to :-) Greg> Keeping the working set low is more efficient from a system Greg> standpoint. Not to mention the not-all-that-occasional-anymore requests to have Python on various itty-bitty things like Palm Pilots and WinCE devices. It's one thing to add size to modules people can live without for many applications, but I think the posix module and its other platform-specific relations are fairly heavily used. (I realize this specific example isn't likely to apply to PP/WinCE.) Skip Montanaro | http://www.mojam.com/ skip at mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From gstein at lyra.org Thu Dec 16 19:21:54 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 10:21:54 -0800 (PST) Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released In-Reply-To: <199912161527.KAA08308@eric.cnri.reston.va.us> Message-ID: On Thu, 16 Dec 1999, Guido van Rossum wrote: >... > I realize it's just a rant. In this case (distutils) your advice is > correct. (I usually paraphrase it as "release early, release often".) True. 
I prefer that phrase, too, but I used it on JimA earlier in the day or the previous day. I didn't want to sound like a broken record :-). But that is why I moved into mode... it seems like the mindset was spreading :-) I've railed at AMK for it, too :-), when he was talking about 0.5.1pre1 or whatever, rather than just releasing 0.5.1 and doing an 0.5.2 if there was a problem. > However there are other situations, like core Python itself, where > it's really useful to have stable releases -- if only for those users > who won't touch anything with "beta" in its name. I still hear from > people who haven't upgraded to 1.5.2. But this doesn't explain why there isn't a 1.5.3b1, 1.5.3b2, etc. Or 1.6.0a1 or whatever (maybe "d" or "r" for dev release, as opposed to alpha). There are some people would like the releases rather than using CVS. Some people can't even use CVS because of firewall issues. Of course, an alternative is snapshot-tarballs of the CVS repository. But a snapshot could *really* be broken; something like 1.6.0d1 says "well, it's a development release, but I've hit a good point between some changes." > I wonder if perhaps for those cases (where there's a demand for stable > releases) some other strategy could be used? Such as labeling > releases "stable" after the fact? Or what Linus seems to do with the > Linux kernel (even = stable, odd = development; or was it the other > way around?). Yes: even are stable (e.g. 1.0, 1.2, 2.0, 2.2). The odd numbers are for development. Linus is currently working 2.3.x, but declared in the past couple days that things will be wrapping up to move towards 2.4. Once he thinks it is ready, he'll start off with 2.4.0pre1, pre2, pre3... At some point the "pre" suffix will drop and 2.4.0 will be released. You might have a bit of problem using that mechanism since the current stable release is 1.5 :-). Once 1.6 hits the street, then you could start doing 1.9 releases (dev) and shift to 2.0 once it is "stable". 
Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul at prescod.net Thu Dec 16 19:02:55 1999 From: paul at prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 10:02:55 -0800 Subject: [Python-Dev] Re: [XML-SIG] Developer's Day References: <199912132354.SAA10101@amarok.cnri.reston.va.us> <3856A77C.3A4D9F00@prescod.net> <14423.49044.143333.790752@amarok.cnri.reston.va.us> <3857CEB0.C29C5F24@prescod.net> <14423.57778.131798.776845@amarok.cnri.reston.va.us> Message-ID: <3859294F.138FF398@prescod.net> "Andrew M. Kuchling" wrote: > > * Python revisions come out slowly, once every year or two. XML > standards have been revolving faster , and we don't want to wait > until 1.7 for SAX2, or DOM Level2, or other new revisions. > Keeping the modules out of the core lets them be updated at their > own pace. A counterargument is that the XML specs are slowing > down -- add namespace support to SAX, and finalize DOM > Level 2, and I don't think any other standards are very important > to basic XML programming. I agree with your counterargument. :) Anyhow, isn't there a logical fallacy in your original argument? Why can't we offer a DOM 3 module or extension after Python ships with DOM 2? > * We really want a C-based parser to be commonly available. > sgmlop is the only reasonable choice for this, because I'd be > against including Expat. To replay some arguments I made against > including the zlib library in 1.6, what if a C extension requires > a newer version of the library? Symbol conflicts if you're lucky, > hard-to-debug problems if you're not. I don't understand this issue. Why would a C extension build on sgmlop which is designed to make XML information available to *Python* programmers? > * We can drop various marginal bits of the CVS tree; the xmlarch > support is probably not of very wide interest, for example. How about "expat", "mac", "pyexpat", "utils", "windows". There is just too much stuff there! 
And I daresay that a lot of it has not been "quality controlled" to the level that we would expect if it were a part of the real Python library. In other words, there is no single place to go to get only XML-processing software that works well and works together. > I think I'm on the record as saying that Python's major problems now > aren't language-related, but are with the development environment. > Language changes (from minor, like 'for i in 1..9', to major, like > fixing the type/class dichotomy or adding static types) aren't going > to bring in piles of new users, useful though they might be to > experienced Pythoneers, large projects, or some other specific > application. (irrelevant aside: I agree 100% that making things easier to install will actually improve newbies' experience more than (e.g.) static type checking but I do not agree that it is a better "sales tool". Most people are sold based on the language and its libraries before they start trying to install extensions.) > If installing things is a problem, then we need to > buckle down and finish the distutils. So, overall, I'd still vote > against inclusion in 1.6. So are you saying that Python 2 might have only five packages and everything else must be downloaded? No httplib, no pickle, no random or math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec? When people download Python and go to the library documentation that impressive array of BUILT-IN-FEATURES is part of what sells them on Python. Hell, I can download all of that stuff for Scheme but what makes Python beautiful is that I don't have to download it for Python. It's just there. But if an XML person comes to Python after hearing us rant about how great it is for processing XML and all they find is xmllib...they will be underwhelmed. > No, it's *got* to reach 1.0.
The point of the package is that it's > exactly *one* thing to install that gives basic XML tools; you don't > need to chase down the SAX modules from Lars' page, PyExpat from > ftp.cwi.nl, sgmlop from pythonware.com, and so forth. If the > Distutils made it as easy as: > > python fetchpackage.py SAX PyExpat DOM sgmlop > > > > etc... > > then much of the need for a single package goes away, but, as you > point out, that isn't currently the case. I'm a little lost here. We need xmllib to continue because distutils doesn't do what we need yet but we don't need to put the stuff in the Python library because distutils will work well enough soon. But there is an important issue that distutils will not solve. One of the beautiful things about the Python library is that everything is at the same version level. When you install it you know that everything works together or else it WILL in the next patch level if you report the incompatibility. When the xml package gets versioned incompatibly with the Python library you don't have that safe feeling. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From akuchlin at mems-exchange.org Thu Dec 16 19:50:48 1999 From: akuchlin at mems-exchange.org (Andrew M. Kuchling) Date: Thu, 16 Dec 1999 13:50:48 -0500 (EST) Subject: [Python-Dev] Re: [XML-SIG] Developer's Day In-Reply-To: <3859294F.138FF398@prescod.net> References: <199912132354.SAA10101@amarok.cnri.reston.va.us> <3856A77C.3A4D9F00@prescod.net> <14423.49044.143333.790752@amarok.cnri.reston.va.us> <3857CEB0.C29C5F24@prescod.net> <14423.57778.131798.776845@amarok.cnri.reston.va.us> <3859294F.138FF398@prescod.net> Message-ID: <14425.13448.737831.460241@amarok.cnri.reston.va.us> (Responding to the python-dev related portion of this...) Paul Prescod writes: >I don't understand this issue.
Why would a C extension build on sgmlop >which is designed to make XML information available to *Python* >programmers? No, no; I'm arguing against shipping with Expat; sgmlop good! Consider this scenario: * Python includes Expat 1.0 * Some C library (for DAV or whatever) uses Expat 1.1 * Someone writes a Python interface to this C library and attempts to compile it statically. * Two versions of Expat in the same binary; symbol conflicts and core dumps, oh my! >So are you saying that Python 2 might have only five packages and >everything else must be downloaded? No httplib, no pickle, no random or >math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec? I'm not arguing for dropping existing packages; I'm against adding many more of them. Existing library modules can stay where they are. But I wouldn't mind a minimalist Python too much, if it came with a script fetch-basic-packages: python fetch-packages.py httplib python fetch-packages.py imaplib ... 200 more lines ... >I'm a little lost here. We need xmllib to continue because distutils >doesn't do what we need yet but we don't need to put the stuff in the >Python library because distutils will work well enough soon. Basically, yes. -- A.M. Kuchling http://starship.python.net/crew/amk/ And now let us hasten to the station. I have commanded the rain to fall at exactly one-fifteen and I would hate to get my shoes wet. -- Lord Lavender, in SEBASTIAN O #2 From bwarsaw at cnri.reston.va.us Thu Dec 16 19:50:49 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 16 Dec 1999 13:50:49 -0500 (EST) Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released References: <199912161527.KAA08308@eric.cnri.reston.va.us> Message-ID: <14425.13449.954026.960703@anthem.cnri.reston.va.us> >> I wonder if perhaps for those cases (where there's a demand for >> stable releases) some other strategy could be used? Such as >> labeling releases "stable" after the fact?
Or what Linus seems >> to do with the Linux kernel (even = stable, odd = development; >> or was it the other way around?). I really dislike the odd/even distinction for exactly this reason. -Barry From guido at CNRI.Reston.VA.US Thu Dec 16 20:02:16 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 16 Dec 1999 14:02:16 -0500 Subject: [Python-Dev] Batteries Included? Message-ID: <199912161902.OAA11345@eric.cnri.reston.va.us> I like the batteries included approach, but I also feel resistance against including stuff I cannot maintain. The XML code base is a case in point; I don't understand enough about XML. (I just read that xmllib.py is "illegal". Jeez! What happened? Did Congress pass a law against it?) I think it may be time for separate Python distributions, like Linux -- I can concentrate on the core, and keep it really small; others can make all-encompassing distributions. There are currently some drawbacks to this approach: non-core modules have less status; and the documentation process is fundamentally different for core and non-core modules. There's also the version dependency stuff, but I think resolving that is the responsibility of the distribution makers. I think the status problem will be gone once there is a respected distribution -- then you derive status from being in that distribution, rather than from being in the core distribution. (Well, you would still derive status from being in the core, but it would be much harder to obtain, since I can set a much higher standard.) The documentation problem is the one that's left. I think the doc-sig may be on its way as we speak to solve this, though. Fred? This isn't rocket science. Red Hat Python? I'm all for it!
:-) --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at mojam.com Thu Dec 16 20:05:05 1999 From: skip at mojam.com (Skip Montanaro) Date: Thu, 16 Dec 1999 13:05:05 -0600 (CST) Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released In-Reply-To: <14425.13449.954026.960703@anthem.cnri.reston.va.us> References: <199912161527.KAA08308@eric.cnri.reston.va.us> <14425.13449.954026.960703@anthem.cnri.reston.va.us> Message-ID: <14425.14305.907618.978628@dolphin.mojam.com> >>> Or what Linus seems to do with the Linux kernel (even = stable, odd >>> = development; or was it the other way around?). BAW> I really dislike the odd/even distinction for exactly this reason. Its one saving grace is that it is a uniform format. There are no "optional" tokens like "pre", "alpha", "beta", etc for the most part. To remember which way it is, I find it useful to execute "uname -r", check the second digit, then look down at my shirt for a pocket protector. The two pieces of information together work for me. I currently get "2.2.13-4mdk" from uname. I don't even have a pocket, let alone a pocket protector, so even numbers must be stable releases... ;-) Skip Montanaro | http://www.mojam.com/ skip at mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From fdrake at acm.org Thu Dec 16 20:05:22 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu, 16 Dec 1999 14:05:22 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121 In-Reply-To: <14425.10951.169751.843764@dolphin.mojam.com> References: <199912161553.KAA08428@eric.cnri.reston.va.us> <14425.10951.169751.843764@dolphin.mojam.com> Message-ID: <14425.14322.355507.500813@weyr.cnri.reston.va.us> Skip Montanaro writes: > fairly heavily used. (I realize this specific example isn't likely to apply > to PP/WinCE.) Or any version of Windows, I suspect; perhaps Mark Hammond can elaborate.
Apparently none of the pathconf() constants are defined on that platform, at least not as #define constants. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From jcw at equi4.com Thu Dec 16 20:09:42 1999 From: jcw at equi4.com (Jean-Claude Wippler) Date: Thu, 16 Dec 1999 20:09:42 +0100 Subject: [Python-Dev] Re: [XML-SIG] Developer's Day References: <199912132354.SAA10101@amarok.cnri.reston.va.us> <3856A77C.3A4D9F00@prescod.net> <14423.49044.143333.790752@amarok.cnri.reston.va.us> <3857CEB0.C29C5F24@prescod.net> <14423.57778.131798.776845@amarok.cnri.reston.va.us> <3859294F.138FF398@prescod.net> Message-ID: <385938F6.C4164756@equi4.com> Paul Prescod wrote: [...] > (irrelevant aside: [...] Most people are sold based on the language > and its libraries before they start trying to install extensions.) > > [AMK] > > If installing things is a problem, then we need to > > buckle down and finish the distutils. So, overall, I'd still vote > > against inclusion in 1.6. > > So are you saying that Python 2 might have only five packages and > everything else must be downloaded? No httplib, no pickle, no random > or math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec? > > When people download Python and go to the library documentation that > impressive array of BUILT-IN-FEATURES is part of what sells them on > Python. Hell, I can download all of that stuff for Scheme but what > makes Python beautiful is that I don't have to download it for Python. > It's just there. But if an XML person comes to Python after hearing us > rant about how great it is for processing XML and all they find is > xmllib...they will be underwhelmed. (Nodding in agreement) Could this perhaps be solved with a large batteries-included standard distribution, plus a real easy/effective way to strip Python down and wrap things up for deployment?
In other words, aim for two very distinct goals: everything within easy reach for development + fully signed-sealed-delivered products. The first goal can evolve to do fancy net-borne distribution, even if it is a brittle process, because this is for Python developers. They want it all, so open the floodgate to give it all to them. The second becomes a matter of pruning down and wrapping up. All the way down to a single installation-less executable, if possible. I may well be wrong (and I'm not tracking distutils), but might it not be simpler to focus on 1) power users + 2) production-grade deployment, instead of trying to streamline a tangled-web-of-module-dependencies into a distribution system which tries to meet a wide range of needs? > [...] One of the beautiful things about the Python library is that > everything is at the same version level. When you install it you know > that everything works together or else it WILL in the next patch level > if you report the incompatibility. [...] More nods. So why not allow the Python distribution to become very large - with every release moving to a better-tuned combination of all the different parts (occasional mishaps can quickly be fixed)? Plus some tools to dist(ut)il(l) a turnkey solution from this big soup. Sort-of-from-violin-to-quartet-all-the-way-to-symphony-orchestra... -- Jean-Claude From gstein at lyra.org Thu Dec 16 21:02:46 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 12:02:46 -0800 (PST) Subject: [Python-Dev] zipfile.py In-Reply-To: <38590844.769C3025@interet.com> Message-ID: On Thu, 16 Dec 1999, James C. Ahlstrom wrote: > Did anyone look at this yet? > > ftp://ftp.interet.com/pub/pylib.html > > ftp://ftp.interet.com/pub/zipfile.py I went to look for it, but I think that was before you put zipfile up. Looking at it now... The writepy() as a method is questionable, I think. I think it should open the file at instantiation time. I don't see a reason to allow that to be deferred.
Especially given that some of the methods fail if open() hasn't been called. It would be good to have symbolic names for the 0 and 8 compression constants, and to fail if 8 is passed and zlib is not available (otherwise, it doesn't fail until read/write time, and with a NameError). There should probably be a __del__ that calls close(). Oh, and a "closed" attribute that can be checked and an error raised if an operation is done after the file has been closed. I think dir() should return the contents, rather than print them. read() and write() ought to fail if the mode is incorrect. Oh, some symbolic constants for things like "PK\005\006" would be nice. Do you have a ZipImporter written? Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein at lyra.org Thu Dec 16 21:12:30 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 12:12:30 -0800 (PST) Subject: [Python-Dev] Re: [XML-SIG] Developer's Day In-Reply-To: <14425.13448.737831.460241@amarok.cnri.reston.va.us> Message-ID: On Thu, 16 Dec 1999, Andrew M. Kuchling wrote: > Paul Prescod writes: > >I don't understand this issue. Why would a C extension build on sgmlop > >which is designed to make XML information available to *Python* > >programmers? > > No, no; I'm arguing against shipping with Expat; sgmlop good! > Consider this scenario: > > * Python includes Expat 1.0 > * Some C library (for DAV or whatever) uses Expat 1.1 > * Someone writes a Python interface to this C library and > attempts to compile it statically. > * Two versions of Expat in the same binary; symbol conflicts > and core dumps, oh my! We should ship pyexpat, not Expat. (IMO) > >So are you saying that Python 2 might have only five packages and > >everything else must be downloaded? No httplib, no pickle, no random or > >math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec? > > I'm not arguing for dropping existing packages; I'm against adding > many more of them. Existing library modules can stay where they are. 
> But I wouldn't mind a minimalist Python too much, if it came with a > script fetch-basic-packages: > > python fetch-packages.py httplib > python fetch-packages.py imaplib > ... 200 more lines ... Considering that it would probably use HTTP to fetch the packages, I think you wouldn't be fetching httplib :-) But yes: I agree with the basic sentiment. Cheers, -g -- Greg Stein, http://www.lyra.org/ From petrilli at amber.org Thu Dec 16 21:55:16 1999 From: petrilli at amber.org (Christopher Petrilli) Date: Thu, 16 Dec 1999 15:55:16 -0500 Subject: [Python-Dev] Batteries Included? In-Reply-To: <199912161902.OAA11345@eric.cnri.reston.va.us>; from guido@CNRI.Reston.VA.US on Thu, Dec 16, 1999 at 02:02:16PM -0500 References: <199912161902.OAA11345@eric.cnri.reston.va.us> Message-ID: <19991216155516.A28037@trump.amber.org> Guido van Rossum [guido at CNRI.Reston.VA.US] wrote: > I think it may be time for separate Python distributions, like Linux > -- I can concentrate on the core, and keep it really small; others can > make all-encompassing distributions. My fear is what we face in the Zope world---different distributions break in totally different ways, and sometimes we have to ask 30 questions to figure out what might be going wrong :/ The nice thing is that if someone installs Python from the source, we know what's going to happen. I don't know if this is solvable, honestly. > This isn't rocket science. Red Hat Python? I'm all for it! :-) I think Guido just wants to IPO and retire :-) Chris -- | Christopher Petrilli | petrilli at amber.org From gward at cnri.reston.va.us Thu Dec 16 22:03:26 1999 From: gward at cnri.reston.va.us (Greg Ward) Date: Thu, 16 Dec 1999 16:03:26 -0500 Subject: [Python-Dev] distutils-sig/python-dev crosstalk Message-ID: <19991216160325.H4289@cnri.reston.va.us> Most recent threads on distutils-sig seem to have migrated to python-dev pretty quickly.
This means that a) there are python-dev people on distutils-sig (duh), b) they think what goes on there is important enough to interest the other core developers (good!), and c) they assume there are people on python-dev who are not also on distutils-sig. Is this last assumption true? If you read python-dev, are interested in distutils issues, but do *not* read distutils-sig, please drop me a note. If no one says anything, I will (politely, tentatively) propose that we keep the distutils threads on distutils-sig and leave python-dev for, well, core Python development. If you think that the two are inextricably linked and I might as well just cross-post everything on distutils-sig to python-dev, let me know about that too. ;-) Greg -- Greg Ward - software developer gward at cnri.reston.va.us Corporation for National Research Initiatives 1895 Preston White Drive voice: +1-703-620-8990 Reston, Virginia, USA 20191-5434 fax: +1-703-620-0913 From gstein at lyra.org Thu Dec 16 22:18:50 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 13:18:50 -0800 (PST) Subject: [Python-Dev] distutils-sig/python-dev crosstalk In-Reply-To: <19991216160325.H4289@cnri.reston.va.us> Message-ID: On Thu, 16 Dec 1999, Greg Ward wrote: >... > If you think that the two are inextricably linked and I might as well > just cross-post everything on distutils-sig to python-dev, let me know > about that too. ;-) :-) I think distutils is about the mechanics. And it is a large and sophisticated problem (which is why it has a SIG :-). You could almost view it as a spinoff of the python-dev grand problem set. When we get into the question of "what does Python ship with?", then I think it belongs in python-dev, as that is a discussion of what constitutes Python itself.
Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein at lyra.org Thu Dec 16 22:21:12 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 13:21:12 -0800 (PST) Subject: [Python-Dev] distutils-sig/python-dev crosstalk In-Reply-To: <19991216160325.H4289@cnri.reston.va.us> Message-ID: On Thu, 16 Dec 1999, Greg Ward wrote: > Most recent threads on distutils-sig seem to have migrated to python-dev > pretty quickly. This means that a) there are python-dev people on > distutils-sig (duh), b) they think what goes on there is important > enough to interest the other core developers (good!), and c) they assume > there are people on python-dev who are not also on distutils-sig. Oh. One more thing. Actually, what I am somewhat worried about is whether there was relevant discussion on python-dev that should have been visible to the distutils people. Not sure if there was, but that is always a potential problem. Same with the recent xml-sig / python-dev crosstalk. Specifically, Paul Prescod is not on python-dev, so he may have missed a response or two. Cheers, -g -- Greg Stein, http://www.lyra.org/ From mal at lemburg.com Thu Dec 16 22:23:30 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 16 Dec 1999 22:23:30 +0100 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> Message-ID: <38595852.E8054741@lemburg.com> "James C. Ahlstrom" wrote: > > "James C. Ahlstrom" wrote: > > > ftp://ftp.interet.com/pub/pylib.html > > I just changed zipfile.py so that regular zip compression > works. And if zlib is available, > its crc32() is used instead of the Python version. > > I should mention that the current code rejects zip files which have > an archive comment added to the end. Accepting them would require > a search, and I am not sure it is worth it. I don't think it is needed for our purposes, but maybe a subclass could provide it ? 
FYI, I've tested the module against mxStack-0.3.0.zip which you can find on my Python Pages. It was created using Info-ZIP's zip 2.2 on Linux. Unfortunately, I always get the following traceback when trying to print the directory: >>> z.open('../projects/distribution/mxStack-0.3.0.zip','rb') >>> z.dir() File Name Modified Size Stack/mxStack/mxStack.h 1999-04-16 10:50:06 4368 Stack/mxStack/mxstdlib.h 1999-04-13 15:37:52 5433 Traceback (innermost last): File "", line 1, in ? File "/home/lemburg/lib/zipfile.py", line 120, in dir bytes = self.read(name) # Just to check CRC-32 File "/home/lemburg/lib/zipfile.py", line 133, in read bytes = zlib.decompress(bytes, -15) zlib.error: Error -5 while decompressing data Some notes on the API: ---------------------- * I would find it more convenient if the filename and mode would be constructor parameters, e.g. zfile = zipfile('myfile.zip','rb') with compression defaulting to 8 rather than 0 (most zip files will be deflated since this is the ZIP default). * Also, I would like a method much like the os.listdir() which returns a list of filenames rather than print it to stdout. * .is_zipfile() should probably be a separate function: it doesn't use any of the class' features. More wishes to come ;-) So far: Great Work ! Aside: I found that you are using undocumented arguments to zlib.compressobj() ... are these extra arguments left out of the documentation on purpose or by simple oversight ? I couldn't find them in the HTML docs and neither in the docstrings. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 15 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From gstein at lyra.org Thu Dec 16 22:32:09 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 13:32:09 -0800 (PST) Subject: [Python-Dev] zipfile.py In-Reply-To: <38595852.E8054741@lemburg.com> Message-ID: On Thu, 16 Dec 1999, M.-A. Lemburg wrote: >... 
> Some notes on the API: > ---------------------- > * I would find it more convenient if the filename and mode > would be constructor parameters, e.g. > > zfile = zipfile('myfile.zip','rb') > > with compression defaulting to 8 rather than 0 (most zip files > will be deflated since this is the ZIP default). > > * Also, I would like a method much like the os.listdir() > which returns a list of filenames rather than print it > to stdout. The above two items were in my ramble, just not as clear as MAL :-) > * .is_zipfile() should probably be a separate function: it > doesn't use any of the class' features. Ah! Good call. It is even more important to shift it out if the constructor now opens a file. Cheers, -g -- Greg Stein, http://www.lyra.org/ From fdrake at acm.org Thu Dec 16 22:33:36 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu, 16 Dec 1999 16:33:36 -0500 (EST) Subject: [Python-Dev] zipfile.py In-Reply-To: <38595852.E8054741@lemburg.com> References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> Message-ID: <14425.23216.636687.704436@weyr.cnri.reston.va.us> M.-A. Lemburg writes: > Aside: I found that you are using undocumented arguments to > zlib.compressobj() ... are these extra arguments left out of > the documentation on purpose or by simple oversight ? I couldn't > find them in the HTML docs and neither in the docstrings. The documentation is way out of date and Jeremy Hylton and Andrew Kuchling haven't updated it. I'm not sure which of them changed the signatures for that module, but I've pestered Jeremy about it a few times. If anyone would like to update the documentation, I'd certainly appreciate it. I don't know the details of those interfaces, and this is somewhere where the details are pretty critical. -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives From bwarsaw at cnri.reston.va.us Fri Dec 17 00:10:11 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 16 Dec 1999 18:10:11 -0500 (EST) Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released References: <199912161527.KAA08308@eric.cnri.reston.va.us> <14425.13449.954026.960703@anthem.cnri.reston.va.us> <14425.14305.907618.978628@dolphin.mojam.com> Message-ID: <14425.29011.429867.485070@anthem.cnri.reston.va.us> >>>>> "SM" == Skip Montanaro writes: SM> To remember which way it is, I find it useful to execute SM> "uname -r", check the second digit, then look down at my shirt SM> for a pocket protector. The two pieces of information SM> together work for me. I currently get "2.2.13-4mdk" from SM> uname. I don't even have a pocket, let alone a pocket SM> protector, so even numbers must be stable releases... What do you do if it's the second Thursday after the full moon, and the local hockey team has just skated to a 3-3 tie? -Barry From mal at lemburg.com Thu Dec 16 22:53:36 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 16 Dec 1999 22:53:36 +0100 Subject: [Python-Dev] Batteries Included? References: <199912161902.OAA11345@eric.cnri.reston.va.us> Message-ID: <38595F60.7C1B34FF@lemburg.com> Guido van Rossum wrote: > > I like the batteries included approach, but I also feel resistence > against including stuff I cannot maintain. > ... > This isn't rocket science. Red Hat Python? I'm all for it! :-) I think we should wait for distutils to get up and running perfectly for everyone before taking such a step. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 15 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From gstein at lyra.org Fri Dec 17 09:31:38 1999 From: gstein at lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 00:31:38 -0800 (PST) Subject: [Python-Dev] Batteries Included? 
In-Reply-To: <38595F60.7C1B34FF@lemburg.com> Message-ID: On Thu, 16 Dec 1999, M.-A. Lemburg wrote: > Guido van Rossum wrote: > > I like the batteries included approach, but I also feel resistence > > against including stuff I cannot maintain. This is an interesting comment, and is similar to the Apache sentiment. Nothing gets added to the standard distribution unless somebody in the Group is willing to maintain it. It provides a good mechanism for keeping the module set to a reasonable size and a set that can/will actually be maintained. > > ... > > This isn't rocket science. Red Hat Python? I'm all for it! :-) > > I think we should wait for distutils to get up and running > perfectly for everyone before taking such a step. You can also operate on the assumption that it will be done by the time 1.6 is ready to be released. In other words: do the work (distutils and minimizing the release) in parallel, rather than in sequence. I would also think that a large distro isn't going to be assembled with distutils. Somebody will sit down, pull all the components together, and make a big release. However, I do see the distutils as being needed for the people who grab the minimal distro. They need it to grab add'l packages. Cheers, -g -- Greg Stein, http://www.lyra.org/ From fredrik at pythonware.com Fri Dec 17 10:06:20 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 17 Dec 1999 10:06:20 +0100 Subject: [Python-Dev] zipfile.py References: Message-ID: <022e01bf486d$fcc901d0$f29b12c2@secret.pythonware.com> James C. Ahlstrom wrote: > > Did anyone look at this yet? > > > > ftp://ftp.interet.com/pub/pylib.html > > > > ftp://ftp.interet.com/pub/zipfile.py > > I went to look for it, but I think that was before you put zipfile up. just a few comments (from reading the docs): -- it would be great if "open" could take an open file object as well as a file name. 
(in this case, you also need to document what you expect from the underlying file object: read, write, seek, tell should be enough, right? haven't looked at the code -- assuming it works, I'm only interested in the interface) -- or you could nuke "open" and pass those arguments to the constructor instead. -- I assume "open" adds "b" to the given mode argument. -- "dir" looks a bit strange. and hey, there's no "listdir" in there. I'd prefer a recursive "listdir" method, which takes an optional "depth" argument (e.g. 0=this dir, 1=this dir and first subdir, None=infinity, i.e. the full tree). that's all for now. From fredrik at pythonware.com Fri Dec 17 13:21:03 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 17 Dec 1999 13:21:03 +0100 Subject: [Python-Dev] posix module References: <14423.61493.90107.433664@weyr.cnri.reston.va.us> Message-ID: <036d01bf4889$30b63470$f29b12c2@secret.pythonware.com> > Ok, I think I'm done with the posix module updates, modulo bugs and > additional symbols for the *conf*() tables. gcc -g -O2 -I./../Include -I.. -DHAVE_CONFIG_H -c ./posixmodule.c ./posixmodule.c:3789: `_SC_AIO_LIST_MAX' undeclared here (not in a function) ./posixmodule.c:3789: initializer element for `posix_constants_sysconf[10].value' is not constant make[1]: *** [posixmodule.o] Error 1 make[1]: Leaving directory `/data/repository/BleedingEdge/python/dist/src/Modules' (current CVS stuff, on Red Hat 5.2) From jim at interet.com Fri Dec 17 15:33:31 1999 From: jim at interet.com (James C. Ahlstrom) Date: Fri, 17 Dec 1999 09:33:31 -0500 Subject: [Python-Dev] zipfile.py References: Message-ID: <385A49BB.4D064240@interet.com> Greg Stein wrote: > > On Thu, 16 Dec 1999, James C. Ahlstrom wrote: > > Did anyone look at this yet? > > > > ftp://ftp.interet.com/pub/pylib.html > > > > ftp://ftp.interet.com/pub/zipfile.py > > Looking at it now... The writepy() as a method is questionable, I think. > I think it should open the file at instantiation time. 
I don't see a > reason to allow that to be deferred. Especially given that some of the > methods fail if open() hasn't been called. I eliminated open and added its args to the constructor. > It would be good to have > symbolic names for the 0 and 8 compression constants, and to fail if 8 is > passed and zlib is not available (otherwise, it doesn't fail until > read/write time, and with a NameError). There should probably be a > __del__ that calls close(). Oh, and a "closed" attribute that can be > checked and an error raised if an operation is done after the file has > been closed. All done. > I think dir() should return the contents, rather than print > them. I added listdir() and documented self.TOC. I kept printdir() as example code. > read() and write() ought to fail if the mode is incorrect. Oh, some > symbolic constants for things like "PK\005\006" would be nice. All done. JimA From guido at CNRI.Reston.VA.US Fri Dec 17 15:43:23 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 17 Dec 1999 09:43:23 -0500 Subject: [Python-Dev] Batteries Included? In-Reply-To: Your message of "Thu, 16 Dec 1999 22:53:36 +0100." <38595F60.7C1B34FF@lemburg.com> References: <199912161902.OAA11345@eric.cnri.reston.va.us> <38595F60.7C1B34FF@lemburg.com> Message-ID: <199912171443.JAA12414@eric.cnri.reston.va.us> > Guido van Rossum wrote: > > > > I like the batteries included approach, but I also feel resistence > > against including stuff I cannot maintain. > > ... > > This isn't rocket science. Red Hat Python? I'm all for it! :-) MAL: > I think we should wait for distutils to get up and running > perfectly for everyone before taking such a step. Fair enough -- but in the mean time, no more pushing for new modules in the core distribution (distutils excluded). 
--Guido van Rossum (home page: http://www.python.org/~guido/) From gward at cnri.reston.va.us Fri Dec 17 15:59:09 1999 From: gward at cnri.reston.va.us (Greg Ward) Date: Fri, 17 Dec 1999 09:59:09 -0500 Subject: [Python-Dev] Batteries Included? In-Reply-To: <199912171443.JAA12414@eric.cnri.reston.va.us>; from guido@cnri.reston.va.us on Fri, Dec 17, 1999 at 09:43:23AM -0500 References: <199912161902.OAA11345@eric.cnri.reston.va.us> <38595F60.7C1B34FF@lemburg.com> <199912171443.JAA12414@eric.cnri.reston.va.us> Message-ID: <19991217095908.B8799@cnri.reston.va.us> On 17 December 1999, Guido van Rossum said: > Fair enough -- but in the mean time, no more pushing for new modules > in the core distribution (distutils excluded). So anyone who wants a new module snuck into the core just has to convince me to add it the distutils package, right? >snicker< Greg From jeremy at cnri.reston.va.us Fri Dec 17 19:30:37 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Fri, 17 Dec 1999 13:30:37 -0500 (EST) Subject: [Python-Dev] Batteries Included? In-Reply-To: <199912171443.JAA12414@eric.cnri.reston.va.us> References: <199912161902.OAA11345@eric.cnri.reston.va.us> <38595F60.7C1B34FF@lemburg.com> <199912171443.JAA12414@eric.cnri.reston.va.us> Message-ID: <14426.33101.757523.853781@goon.cnri.reston.va.us> >>>>> "GvR" == Guido van Rossum writes: >> Guido van Rossum wrote: I like the batteries included >> approach, but I also feel resistence against including stuff I >> cannot maintain. ... This isn't rocket science. Red Hat >> Python? I'm all for it! :-) >> MAL wrote: >> I think we should wait for distutils to get up and running >> perfectly for everyone before taking such a step. GvR> Fair enough -- but in the mean time, no more pushing for new GvR> modules in the core distribution (distutils excluded). 
Perhaps the right long-term solution (post-distutils) is to split Python into a core architected by Guido and a bazaar-style standard library maintained in a more apache-style. Jeremy From jim at interet.com Fri Dec 17 16:25:10 1999 From: jim at interet.com (James C. Ahlstrom) Date: Fri, 17 Dec 1999 10:25:10 -0500 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> Message-ID: <385A55D6.A8A05EB9@interet.com> "M.-A. Lemburg" wrote: > Unfortunately, I always get the following traceback when trying > to print the directory: OK, I changed the decompress code (10:23 AM), please re-try. > with compression defaulting to 8 rather than 0 (most zip files > will be deflated since this is the ZIP default). The compress mode only applies to writing. On read, the method recorded in the file controls. JimA From jim at interet.com Fri Dec 17 15:49:20 1999 From: jim at interet.com (James C. Ahlstrom) Date: Fri, 17 Dec 1999 09:49:20 -0500 Subject: [Python-Dev] zipfile.py References: <022e01bf486d$fcc901d0$f29b12c2@secret.pythonware.com> Message-ID: <385A4D70.A162C584@interet.com> Fredrik Lundh wrote: > > James C. Ahlstrom wrote: > > > > > > ftp://ftp.interet.com/pub/pylib.html > -- it would be great if "open" could take an open file > object as well as a file name. I put these arguments into the constructor now. > (in this case, you also need to document what you > expect from the underlying file object: read, write, > seek, tell should be enough, right? haven't looked > at the code -- assuming it works, I'm only interested > in the interface) OK, docs updated. > -- I assume "open" adds "b" to the given mode argument. Correct. The mode can be either "w" or "wb" etc., and it works. > -- "dir" looks a bit strange. and hey, there's no "listdir" > in there. I'd prefer a recursive "listdir" method, which > takes an optional "depth" argument (e.g. 
0=this dir, > 1=this dir and first subdir, None=infinity, i.e. the full > tree). I added a plain listdir() and changed dir() to printdir(). I also documented self.TOC which gets you the values too. JimA From jim at interet.com Fri Dec 17 15:39:51 1999 From: jim at interet.com (James C. Ahlstrom) Date: Fri, 17 Dec 1999 09:39:51 -0500 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> Message-ID: <385A4B37.333B9443@interet.com> "M.-A. Lemburg" wrote: > > "James C. Ahlstrom" wrote: > > > ftp://ftp.interet.com/pub/pylib.html > > > Unfortunately, I always get the following traceback when trying > to print the directory: Yes, compression isn't there yet. I am looking into it. > Some notes on the API: > ---------------------- > * I would find it more convenient if the filename and mode > would be constructor parameters, e.g. > > zfile = zipfile('myfile.zip','rb') OK, done. > with compression defaulting to 8 rather than 0 (most zip files > will be deflated since this is the ZIP default). Until compression works, and zlib ships with Python I would rather default to no compression (method 0). Otherwise this is not useful as a Python import archive. > * Also, I would like a method much like the os.listdir() > which returns a list of filenames rather than print it > to stdout. OK, done. > * .is_zipfile() should probably be a separate function: it > doesn't use any of the class' features. OK, done. > Aside: I found that you are using undocumented arguments to > zlib.compressobj() ... are these extra arguments left out of > the documentation on purpose or by simple oversight ? I couldn't > find them in the HTML docs and neither in the docstrings. I am following the CNRI code blindly here. I don't have docs either. 
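A minimal sketch of the interface changes agreed in this exchange, written against the zipfile module in today's standard library. That choice is an assumption: the draft under discussion exposed listdir(), whereas the shipped module calls it namelist(), and the member name demo.txt is purely illustrative.

```python
import io
import zipfile

# Symbolic names for the two compression methods discussed above.
assert zipfile.ZIP_STORED == 0      # no compression
assert zipfile.ZIP_DEFLATED == 8    # deflate, requires zlib

# File (name or open file-like object) and mode go to the constructor.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as archive:
    archive.writestr("demo.txt", "hello")   # hypothetical member name

# is_zipfile() is a module-level function rather than a method.
buffer.seek(0)
looks_like_zip = zipfile.is_zipfile(buffer)

# namelist() returns the member names instead of printing them.
with zipfile.ZipFile(buffer, "r") as archive:
    names = archive.namelist()
    payload = archive.read("demo.txt")
```

Defaulting to ZIP_STORED here mirrors the argument that an archive should stay readable on installations without zlib.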
JimA From jack at oratrix.nl Fri Dec 17 23:54:03 1999 From: jack at oratrix.nl (Jack Jansen) Date: Fri, 17 Dec 1999 23:54:03 +0100 Subject: [Python-Dev] Batteries Included? In-Reply-To: Message by Jeremy Hylton , Fri, 17 Dec 1999 13:30:37 -0500 (EST) , <14426.33101.757523.853781@goon.cnri.reston.va.us> Message-ID: <19991217225408.D9AF3EDD20@oratrix.oratrix.nl> Recently, Jeremy Hylton said: > Perhaps the right long-term solution (post-distutils) is to split > Python into a core architected by Guido and a bazaar-style standard > library maintained in a more apache-style. I can't help feeling uncomfortable with this. I've had quite some work to get an Apache with SSL up and running, even though someone gave me quite precise instructions. With Perl I fared even worse, despite their distutils-like package, when I wanted to try a PalmPilot package for Unix that needed Perl. I finally had to give up after quite some effort because the addon installers kept finding the older version of Perl that the system mgr had installed instead of my newer version. I think distutils will be wonderful for us, the Python community, but something more RedHattish is needed for the general world who just want Python plus a certain set of extensions because some application needs it, so they can just download a fresh copy of ParrotPython 3.4.4 and know the application will work, without interfering with another application that happens to use Inquisition 1a5 and lives elsewhere on the disk. And maybe the answer is a much simpler freezing process, like MacPython BuildApplication where any Python user can drop a script on it and end up with a fully self-contained app guaranteed (well.... No reports to the contrary have been heard so far, at least:-) to contain everything needed and not interfere with an existing MacPython installation (or be interfered with by it).
Then a popular app will have prebuilt binaries available for all platforms quickly, made by the Python community, and the enduser interested in the app but not in Python can simply download that. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From mal at lemburg.com Sat Dec 18 14:17:52 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Sat, 18 Dec 1999 14:17:52 +0100 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A4B37.333B9443@interet.com> Message-ID: <385B8980.11CDE9AC@lemburg.com> "James C. Ahlstrom" wrote: > > "M.-A. Lemburg" wrote: > > > > "James C. Ahlstrom" wrote: > > > > ftp://ftp.interet.com/pub/pylib.html > > > > > > Unfortunately, I always get the following traceback when trying > > to print the directory: > > Yes, compression isn't there yet. I am looking into it. Great :-) > > Some notes on the API: > > ---------------------- > > * I would find it more convenient if the filename and mode > > would be constructor parameters, e.g. > > > > zfile = zipfile('myfile.zip','rb') > > OK, done. > > > with compression defaulting to 8 rather than 0 (most zip files > > will be deflated since this is the ZIP default). > > Until compression works, and zlib ships with Python I > would rather default to no compression (method 0). Otherwise > this is not useful as a Python import archive. Point taken. Perhaps it would be even better to not have a default at all: that way people will have to think about the issue *before* implementing it, rather than debug code that produces tracebacks. > > * Also, I would like a method much like the os.listdir() > > which returns a list of filenames rather than print it > > to stdout. > > OK, done. 
> > > * .is_zipfile() should probably be a separate function: it > > doesn't use any of the class' features. > > OK, done. Thanks, -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 13 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal at lemburg.com Sat Dec 18 16:16:44 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Sat, 18 Dec 1999 16:16:44 +0100 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> Message-ID: <385BA55C.9DFCA88D@lemburg.com> "James C. Ahlstrom" wrote: > > "M.-A. Lemburg" wrote: > > > Unfortunately, I always get the following traceback when trying > > to print the directory: > > OK, I changed the decompress code (10:23 AM), please re-try. Everything is fine now... it's really impressive how easily you can manipulate ZIP files with it. One thing I'd suggest is to include some way to delete and update contents, e.g. the write() method should overwrite any existing entry in the archive (if it does not already -- I haven't tested it, just read the code and it seems to raise an exception), plus maybe a .remove() method which deletes an entry. > > with compression defaulting to 8 rather than 0 (most zip files > > will be deflated since this is the ZIP default). > > The compress mode only applies to writing. On read, the > method recorded in the file controls. True. How about making the compression argument mandatory for files opened in 'wb' mode only ?
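The update-and-remove behaviour suggested above can be approximated by rewriting the archive, sketched here with today's standard-library zipfile module. The helper name rewrite_without and the member names are hypothetical; this illustrates the idea and is not something the draft module is known to provide.

```python
import io
import zipfile


def rewrite_without(source, skip_name):
    """Copy every member except skip_name into a new in-memory archive.

    Hypothetical helper: takes archive bytes and returns archive bytes.
    """
    target = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(source), "r") as old, \
            zipfile.ZipFile(target, "w", compression=zipfile.ZIP_STORED) as new:
        for info in old.infolist():
            if info.filename != skip_name:
                # Reuse the original ZipInfo so metadata is preserved.
                new.writestr(info, old.read(info.filename))
    return target.getvalue()


# Build a small archive with two members (names are illustrative).
original = io.BytesIO()
with zipfile.ZipFile(original, "w") as archive:
    archive.writestr("a.txt", "first")
    archive.writestr("b.txt", "second")

# "Remove" a.txt by rewriting, then append its replacement.
updated = io.BytesIO(rewrite_without(original.getvalue(), "a.txt"))
with zipfile.ZipFile(updated, "a") as archive:
    archive.writestr("a.txt", "replaced")

with zipfile.ZipFile(updated, "r") as archive:
    remaining = sorted(archive.namelist())
    new_text = archive.read("a.txt")
```

Rewriting costs a full copy but leaves no dead space in the file, which is the trade-off against an in-place .remove().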
-- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 13 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From da at ski.org Sat Dec 18 18:35:00 1999 From: da at ski.org (David Ascher) Date: Sat, 18 Dec 1999 09:35:00 -0800 Subject: [Python-Dev] Year 2000 O'Reilly Python Conference Message-ID: <003501bf497e$368f6f60$e655cfc0@ski.org> I just got off the phone with someone at O'Reilly, who is starting to plan the next O'Reilly Open Source Convention. I've agreed to be the chair of the Python conference, just so that there are no delays in getting the conference organized. If someone feels that I should not be chair, speak now and we can figure out who takes the 'job'. There are short-term and long-term issues to discuss: Short term: - We need a program committee -- If you're interested in being on said committee or know someone who should be, let me know. I'd like to get representatives from various subconstituencies on there (web types, zope types, business types, scientist types, linux types, hackers, etc.) - The call for papers is going on the O'Reilly website soon. I will try and get them to pass things by me first, but if we want to emphasize specific kinds of paper submissions, we need to decide that soon. - Greg or Barry, is it possible for one of you to setup a mailman mailing list which will be used by the program committee? eGroups is easy for me to setup, but lots of people hated it last year. I don't want to pollute python-dev with conference discussions. Longer term: - The schedule for the conference is (supposedly) going to be the same as last year. conference-wide keynotes at the beginning of both days, and 4x90minute segments. - We have two parallel tracks - We have 4 half-day tutorial slots - All of the paper materials have to be 'in' by March 1. We need to decide how much time we need to go through the review/revision process ourselves. 
In other words, the deadline for submissions is up to us, but we don't have that much time. --david ascher From jeremy at cnri.reston.va.us Sat Dec 18 23:39:58 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Sat, 18 Dec 1999 17:39:58 -0500 (EST) Subject: [Python-Dev] zipfile.py In-Reply-To: <385A4B37.333B9443@interet.com> References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A4B37.333B9443@interet.com> Message-ID: <14428.3390.671438.663889@bitdiddle.cnri.reston.va.us> >>>>> "JCA" == James C Ahlstrom writes: >> Aside: I found that you are using undocumented arguments to >> zlib.compressobj() ... are these extra arguments left out of the >> documentation on purpose or by simple oversight ? I couldn't find >> them in the HTML docs and neither in the docstrings. JCA> I am following the CNRI code blindly here. I don't have docs JCA> either. The docs for the zlib module are quite out of date, although I think the docstrings may be better (not necessarily completely up-to-date though :-). The specific parameters to pass to zlib don't seem to be documented anywhere either; IIRC I dug them out of some example C code somewhere that used zlib to read Zip files. Jeremy From gstein at lyra.org Sun Dec 19 00:14:02 1999 From: gstein at lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 15:14:02 -0800 (PST) Subject: [Python-Dev] Year 2000 O'Reilly Python Conference In-Reply-To: <003501bf497e$368f6f60$e655cfc0@ski.org> Message-ID: On Sat, 18 Dec 1999, David Ascher wrote: >... > - Greg or Barry, is it possible for one of you to setup a mailman mailing > list which will be used by the program committee? eGroups is easy for me to > setup, but lots of people hated it last year. I don't want to pollute > python-dev with conference discussions. Done. ora-pc at pythonpros.com.
http://mailman.pythonpros.com/mailman/listinfo/ora-pc I also removed the old monterey-speakers mailing list :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From da at ski.org Sun Dec 19 08:24:51 1999 From: da at ski.org (David Ascher) Date: Sat, 18 Dec 1999 23:24:51 -0800 Subject: [Python-Dev] Year 2000 O'Reilly Python Conference References: Message-ID: <013301bf49f2$243946f0$df55cfc0@ski.org> From: Greg Stein > On Sat, 18 Dec 1999, David Ascher wrote: > >... > > - Greg or Barry, is it possible for one of you to setup a mailman mailing > > list which will be used by the program committee? > Done. ora-pc at pythonpros.com. > http://mailman.pythonpros.com/mailman/listinfo/ora-pc Thanks, Greg. Now, folks, please consider joining the program committee. We need a few volunteers - not too many, but somewhere between 5 and 10 would be good. You don't even have to commit to making it to the conference, if that's a concern. -- david From jim at interet.com Mon Dec 20 15:18:17 1999 From: jim at interet.com (James C. Ahlstrom) Date: Mon, 20 Dec 1999 09:18:17 -0500 Subject: [Python-Dev] zipfile.py References: Message-ID: <385E3AA9.162BE568@interet.com> Greg Stein wrote: > Do you have a ZipImporter written? Yes, it is ftp://ftp.interet.com/pub/importer.py JimA From jim at interet.com Mon Dec 20 15:35:58 1999 From: jim at interet.com (James C. Ahlstrom) Date: Mon, 20 Dec 1999 09:35:58 -0500 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> Message-ID: <385E3ECE.F8DCDE28@interet.com> "M.-A. Lemburg" wrote: > One thing I'd suugest is to include some way to delete and > update contents, e.g. 
the write() method should overwrite > any existing entry in the archive (if it not already does -- > I haven't tested it, just read the code and it seems to raise > an exception), plus maybe a .remove() method which deletes > an entry. Currently, adding a file requires the "a" append mode, while the "w" mode re-writes the file. Adding a duplicate file name produces an error message. I can change this, but removing a file would either waste space, or else the file contents must be copied over the old file and all the offsets updated. I don't like this because it is complicated, and I think it is fast enough to just re-write the archive. But it could be added if people want. > True. How about making the compression argument mandatory > for file opened in 'wb' mode only ? The default of zero provides a little guidance that you should use zero. I added a warning message if 8 is used which should discourage people from using 8. Or I could disallow 8. Is that OK? JimA From jim at interet.com Mon Dec 20 16:34:02 1999 From: jim at interet.com (James C. Ahlstrom) Date: Mon, 20 Dec 1999 10:34:02 -0500 Subject: [Python-Dev] Batteries Included? References: <19991217225408.D9AF3EDD20@oratrix.oratrix.nl> Message-ID: <385E4C6A.BEC0F728@interet.com> Jack Jansen wrote: > And maybe the answer is a much simpler freezing process, like > MacPython BuildApplication where any Python user can drop a script on > it and end up with a fully self-contained app guaranteed (well.... No > reports to the contrary have been heard so far, at least:-) to contain > everything needed and not interfere with an existing MacPython > installation (or be interfered with by it). Then a popular app will > have prebuilt binaries available for all platforms quickly, made by > the Python community, and the enduser interested in the app but not in > Python can simply download that. IMHO the "much simpler freezing process" is archive files. 
A simple script can build them, imputil can import them, and the only remaining problem is to find them. Please see: ftp://ftp.interet.com/pub/bootmodule.html ftp://ftp.interet.com/pub/pylib.html JimA From jack at oratrix.nl Mon Dec 20 17:50:32 1999 From: jack at oratrix.nl (Jack Jansen) Date: Mon, 20 Dec 1999 17:50:32 +0100 Subject: [Python-Dev] Batteries Included? In-Reply-To: Message by "James C. Ahlstrom" , Mon, 20 Dec 1999 10:34:02 -0500 , <385E4C6A.BEC0F728@interet.com> Message-ID: <19991220165032.B9E8B370CF2@snelboot.oratrix.nl> > IMHO the "much simpler freezing process" is archive files. A simple > script can build them, imputil can import them, and the only > remaining problem is to find them. Please see: Archive files solves the problem for Python modules. But that leaves the problem of dynamically loaded modules. And resources for dialogs and such, if you use native GUI stuff on Mac or Windows. And most serious applications that I've seen (GRiNS and Zope, to name two, Mailman is the only exception I can think of) depend on non-standard plugin modules. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From mal at lemburg.com Mon Dec 20 15:44:42 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Mon, 20 Dec 1999 15:44:42 +0100 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com> Message-ID: <385E40DA.37AD704F@lemburg.com> "James C. Ahlstrom" wrote: > > "M.-A. Lemburg" wrote: > > > One thing I'd suugest is to include some way to delete and > > update contents, e.g. 
the write() method should overwrite > > any existing entry in the archive (if it not already does -- > > I haven't tested it, just read the code and it seems to raise > > an exception), plus maybe a .remove() method which deletes > > an entry. > > Currently, adding a file requires the "a" append mode, while > the "w" mode re-writes the file. Adding a duplicate file name > produces an error message. I can change this, > but removing a file would either waste space, or else the file > contents must be copied over the old file and all the offsets > updated. I don't like this because it is complicated, and I think > it is fast enough to just re-write the archive. But it > could be added if people want. I guess it would be ok to waste space. You could provide a .cleanup() or .rewrite() method that takes care of reorganizing the file to fill up the gaps. > > True. How about making the compression argument mandatory > > for file opened in 'wb' mode only ? > > The default of zero provides a little guidance that you should > use zero. I added a warning message if 8 is used which should > discourage people from using 8. Or I could disallow 8. > Is that OK? Well the module seems to work just fine with compression on, so disallowing it or issuing a warning would reduce its value, IMHO. How about making compression a boolean value and then converting any true value to 8 ? -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 11 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From fdrake at acm.org Mon Dec 20 19:52:41 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Mon, 20 Dec 1999 13:52:41 -0500 (EST) Subject: [Python-Dev] posix module In-Reply-To: <036d01bf4889$30b63470$f29b12c2@secret.pythonware.com> References: <14423.61493.90107.433664@weyr.cnri.reston.va.us> <036d01bf4889$30b63470$f29b12c2@secret.pythonware.com> Message-ID: <14430.31481.402469.896400@weyr.cnri.reston.va.us> Fredrik Lundh writes: > (current CVS stuff, on Red Hat 5.2) Ok, Guido figured it out; this is a typo in the header /usr/include/confname.h; the enum and the #define don't have the same name. Do you know a way to detect the Linux kernel version using preprocessor macros? (Seems very fragile.) Would it be reasonable to only add that table entry for kernel versions >= 2.2? -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From jim at interet.com Mon Dec 20 20:25:27 1999 From: jim at interet.com (James C. Ahlstrom) Date: Mon, 20 Dec 1999 14:25:27 -0500 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com> <385E40DA.37AD704F@lemburg.com> <385E82A7.72345807@interet.com> "M.-A. Lemburg" wrote: > I guess it would be ok to waste space. You could provide > a .cleanup() or .rewrite() method that takes care of > reorganizing the file to fill up the gaps. OK, adding a duplicate name replaces the old file. > Well the module seems to work just fine with compression > on, so disallowing it or issuing a warning would reduce its value, > IMHO. Yes compression works, but 90% of Python installations don't have zlib, so it is an ERROR to create archives with compression when these archives are distributed to other sites. > How about making compression a boolean value and then > converting any true value to 8 ? It would close the door to future or other compression methods.
Currently the method must be 0 or 8 or a traceback will result. JimA From jim at interet.com Mon Dec 20 20:33:11 1999 From: jim at interet.com (James C. Ahlstrom) Date: Mon, 20 Dec 1999 14:33:11 -0500 Subject: [Python-Dev] Batteries Included? References: <19991220165032.B9E8B370CF2@snelboot.oratrix.nl> Message-ID: <385E8477.F727E0F8@interet.com> Jack Jansen wrote: > Archive files solves the problem for Python modules. But that leaves the > problem of dynamically loaded modules. And resources for dialogs and such, if > you use native GUI stuff on Mac or Windows. Point taken. For dynamically loaded modules, I believe in following the native system's DLL path, and not adding eccentric Python logic. But many disagreed a couple week's ago when I raised this. For resources, I think the archive file can accommodate this, although it seems highly system dependent. Anyway, any file at all can live in the archive and the import mechanism for *.pyc will not be damaged nor unduly slowed down by its presence. JimA From gstein at lyra.org Mon Dec 20 21:11:50 1999 From: gstein at lyra.org (Greg Stein) Date: Mon, 20 Dec 1999 12:11:50 -0800 (PST) Subject: [Python-Dev] zipfile.py In-Reply-To: <385E82A7.72345807@interet.com> Message-ID: On Mon, 20 Dec 1999, James C. Ahlstrom wrote: > "M.-A. Lemburg" wrote: > > I guess it would be ok to waste space. You could provide > > a .cleanup() or .rewrite() method that takes care of > > reorganizing the file to fill up the gaps. > > OK, adding a duplicate name replaces the old file. But it shouldn't print a warning(!). If an application wants to replace a file, then stuff shouldn't appear on stdout as a result. > > Well the module seems to work just fine with compression > > on, so disallowing it or issuing a warning would reduce its value, > > IMHO. > > Yes compression works, but 90% of Python installations don't have > zlib, so it is an ERROR to create archives with compression when > these archives are distributed to other sites. 
While it may be problem to distribute them to other sites, that is not up to the library. If I want compression, then I should get compression. A library module should not determine application-level policy. The warning that __init__ prints shouldn't be there. Really: there should not be a single "print" in the library (well, printdir() is fine... that's what it is supposed to do; printing in the test code would be fine). In normal, or even exceptional(!), operation there should never be a print. > > How about making compression a boolean value and then > > converting any true value to 8 ? > > It would close the door to future or other compression methods. > Currently the method must be 0 or 8 or a traceback will result. I definitely agree with JimA here. For example, maybe we want bzip compression in there. Sure, non-portable, but that's my problem :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From jim at interet.com Mon Dec 20 21:50:46 1999 From: jim at interet.com (James C. Ahlstrom) Date: Mon, 20 Dec 1999 15:50:46 -0500 Subject: [Python-Dev] zipfile.py References: Message-ID: <385E96A6.40CCF285@interet.com> Greg Stein wrote: > > On Mon, 20 Dec 1999, James C. Ahlstrom wrote: > > "M.-A. Lemburg" wrote: > But it shouldn't print a warning(!). If an application wants to replace a > file, then stuff shouldn't appear on stdout as a result. OK, no warning. > The warning that __init__ prints shouldn't be there. OK, it is gone. > Really: there should not be a single "print" in the library (well, No print unless _debug > 0 JimA From mal at lemburg.com Mon Dec 20 22:16:39 1999 From: mal at lemburg.com (M.-A. 
Lemburg) Date: Mon, 20 Dec 1999 22:16:39 +0100 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com> <385E40DA.37AD704F@lemburg.com> <385E82A7.72345807@interet.com> Message-ID: <385E9CB7.5DE4848A@lemburg.com> "James C. Ahlstrom" wrote: > > "M.-A. Lemburg" wrote: > > > I guess it would be ok to waste space. You could provide > > a .cleanup() or .rewrite() method that takes care of > > reorganizing the file to fill up the gaps. > > OK, adding a duplicate name replaces the old file. Cool. > > Well the module seems to work just fine with compression > > on, so disallowing it or issuing a warning would reduce its value, > > IMHO. > > Yes compression works, but 90% of Python installations don't have > zlib, so it is an ERROR to create archives with compression when > these archives are distributed to other sites. Sure, for the sake of creating Python code archives, but your module is much more versatile: e.g. I could automatically create ZIP archives of log files or sets of other files and then have Python email them to someone who uses these archives through standard tools such as WinZip -- the target doesn't always have to be a Python process :-) > > How about making compression a boolean value and then > > converting any true value to 8 ? > > It would close the door to future or other compression methods. > Currently the method must be 0 or 8 or a traceback will result. Ok. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 11 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From jim at interet.com Mon Dec 20 22:37:20 1999 From: jim at interet.com (James C. 
Ahlstrom) Date: Mon, 20 Dec 1999 16:37:20 -0500 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com> <385E40DA.37AD704F@lemburg.com> <385E82A7.72345807@interet.com> <385E9CB7.5DE4848A@lemburg.com> Message-ID: <385EA190.6AF511BD@interet.com> "M.-A. Lemburg" wrote: > > Sure, for the sake of creating Python code archives, but > your module is much more versatile: e.g. I could automatically > create ZIP archives of log files or sets of other files and OK, zipfile.py no longer complains about compression != 0 JimA From fdrake at acm.org Tue Dec 21 23:42:26 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue, 21 Dec 1999 17:42:26 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <199912212238.RAA13660@eric.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> Message-ID: <14432.594.33416.600794@weyr.cnri.reston.va.us> Guido van Rossum writes: > + > + class GetoptError(Exception): > + opt = '' > + msg = '' > + def __init__(self, *args): > + self.args = args > + if len(args) == 1: > + self.msg = args[0] > + elif len(args) == 2: > + self.msg = args[0] > + self.opt = args[1] > + > + def __str__(self): > + return self.msg > > ! error = GetoptError # backward compatibility This breaks as soon as the standard exceptions are strings; does this mean -X will be removed in the next release? (Please????) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From bwarsaw at cnri.reston.va.us Tue Dec 21 23:44:46 1999 From: bwarsaw at cnri.reston.va.us (Barry A. 
Warsaw) Date: Tue, 21 Dec 1999 17:44:46 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> Message-ID: <14432.734.155183.508785@anthem.cnri.reston.va.us> >>>>> "Fred" == Fred L Drake, Jr writes: Fred> This breaks as soon as the standard exceptions are Fred> strings; does this mean -X will be removed in the next Fred> release? (Please????) Pretty please? :) From guido at CNRI.Reston.VA.US Wed Dec 22 00:05:28 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 21 Dec 1999 18:05:28 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Your message of "Tue, 21 Dec 1999 17:42:26 EST." <14432.594.33416.600794@weyr.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> Message-ID: <199912212305.SAA13722@eric.cnri.reston.va.us> > Guido van Rossum writes: > > + > > + class GetoptError(Exception): > > + opt = '' > > + msg = '' > > + def __init__(self, *args): > > + self.args = args > > + if len(args) == 1: > > + self.msg = args[0] > > + elif len(args) == 2: > > + self.msg = args[0] > > + self.opt = args[1] > > + > > + def __str__(self): > > + return self.msg > > > > ! error = GetoptError # backward compatibility [Fred Drake] > This breaks as soon as the standard exceptions are strings; does > this mean -X will be removed in the next release? (Please????) Not a bad idea. Anybody got a reason why -X should stay? (The next step would be to outlaw raise with a string argument; I think I can't make that for 1.6. But it would be a good idea to scan the standard library for string exceptions and convert all of them.) --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw at cnri.reston.va.us Wed Dec 22 00:21:38 1999 From: bwarsaw at cnri.reston.va.us (Barry A. 
Warsaw) Date: Tue, 21 Dec 1999 18:21:38 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> Message-ID: <14432.2946.857539.898577@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> Anybody got a reason why -X should stay? Kill it. Guido> (The next step would be to outlaw raise with a string Guido> argument; I think I can't make that for 1.6. But it would Guido> be a good idea to scan the standard library for string Guido> exceptions and convert all of them.) Or require that exception classes be derived from exceptions.Exception :) -Barry From guido at CNRI.Reston.VA.US Wed Dec 22 00:23:29 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 21 Dec 1999 18:23:29 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Your message of "Tue, 21 Dec 1999 18:21:38 EST." <14432.2946.857539.898577@anthem.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> Message-ID: <199912212323.SAA13803@eric.cnri.reston.va.us> [Barry] > Guido> Anybody got a reason why -X should stay? > > Kill it. You already said that. Anybody else? > Guido> (The next step would be to outlaw raise with a string > Guido> argument; I think I can't make that for 1.6. But it would > Guido> be a good idea to scan the standard library for string > Guido> exceptions and convert all of them.) > > Or require that exception classes be derived from exceptions.Exception > :) That's hard to require. But it could easily be a requirement checked by one of the hypothetical typecheckers that are being discussed in the types-sig. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw at cnri.reston.va.us Wed Dec 22 00:27:31 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Tue, 21 Dec 1999 18:27:31 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> Message-ID: <14432.3299.404561.698836@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: BAW> Or require that exception classes be derived from BAW> exceptions.Exception :) Guido> That's hard to require. But it could easily be a Guido> requirement checked by one of the hypothetical typecheckers Guido> that are being discussed in the types-sig. Hmm, the raise could probably enforce this, but it might not be that useful. -Barry From guido at CNRI.Reston.VA.US Wed Dec 22 00:40:22 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 21 Dec 1999 18:40:22 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Your message of "Tue, 21 Dec 1999 18:27:31 EST." <14432.3299.404561.698836@anthem.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> Message-ID: <199912212340.SAA13851@eric.cnri.reston.va.us> > >>>>> "Guido" == Guido van Rossum writes: > > BAW> Or require that exception classes be derived from > BAW> exceptions.Exception :) > > Guido> That's hard to require. 
But it could easily be a > Guido> requirement checked by one of the hypothetical typecheckers > Guido> that are being discussed in the types-sig. > > Hmm, the raise could probably enforce this, but it might not be that > useful. > > -Barry The raise could easily enforce this, but it would break lots of existing code. I wish I had done it right from the start -- then exceptions would have been classes from the start and would have required inheritance from the Exception base class. Like in Java. (And in C++?) --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw at CNRI.Reston.VA.US Wed Dec 22 00:43:59 1999 From: bwarsaw at CNRI.Reston.VA.US (Barry A. Warsaw) Date: Tue, 21 Dec 1999 18:43:59 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> <199912212340.SAA13851@eric.cnri.reston.va.us> Message-ID: <14432.4287.543786.308468@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> The raise could easily enforce this, but it would break Guido> lots of existing code. Maybe not (I'm not sure). All the standard exceptions inherit from Exception, and of course there'd be nothing to enforce for existing user-defined string based exceptions. How pervasive are user-defined class based exceptions that don't inherit from Exception? (I don't know, and I haven't grepped, but I think we've been making that recommendation from day 1 of class-based standard exceptions, and I try to follow this recommendation in my own code). 
Guido> I wish I had done it right from the start -- then Guido> exceptions would have been classes from the start and would Guido> have required inheritance from the Exception base class. Guido> Like in Java. (And in C++?) All Hail, Python 2.0, our Savior and Redeemer! :) -Barry From guido at CNRI.Reston.VA.US Wed Dec 22 00:49:09 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 21 Dec 1999 18:49:09 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Your message of "Tue, 21 Dec 1999 18:43:59 EST." <14432.4287.543786.308468@anthem.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> <199912212340.SAA13851@eric.cnri.reston.va.us> <14432.4287.543786.308468@anthem.cnri.reston.va.us> Message-ID: <199912212349.SAA13892@eric.cnri.reston.va.us> > From: "Barry A. Warsaw" > >>>>> "Guido" == Guido van Rossum writes: > > Guido> The raise could easily enforce this, but it would break > Guido> lots of existing code. > > Maybe not (I'm not sure). All the standard exceptions inherit from > Exception, and of course there'd be nothing to enforce for existing > user-defined string based exceptions. How pervasive are user-defined > class based exceptions that don't inherit from Exception? (I don't > know, and I haven't grepped, but I think we've been making that > recommendation from day 1 of class-based standard exceptions, and I > try to follow this recommendation in my own code). Yes, but class-based user exceptions existed many Python versions before class-based standard exceptions! Two examples in the standard library: ConfigParser.py and xdrlib.py. > All Hail, Python 2.0, our Savior and Redeemer! 
:) Or, the perfect excuse for procrastination :) (But yes, 2.0 will enforce this.) --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein at lyra.org Wed Dec 22 00:53:50 1999 From: gstein at lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 15:53:50 -0800 (PST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <199912212305.SAA13722@eric.cnri.reston.va.us> Message-ID: On Tue, 21 Dec 1999, Guido van Rossum wrote: >... > [Fred Drake] > > This breaks as soon as the standard exceptions are strings; does > > this mean -X will be removed in the next release? (Please????) > > Not a bad idea. > > Anybody got a reason why -X should stay? Kill it. > (The next step would be to outlaw raise with a string argument; I > think I can't make that for 1.6. But it would be a good idea to scan > the standard library for string exceptions and convert all of them.) Keep string exceptions. I think there is probably a lot of code that still uses them. I know I do :-) We can issue warnings about string exceptions via the type-checking tool. Cheers, -g -- Greg Stein, http://www.lyra.org/ From bwarsaw at CNRI.Reston.VA.US Wed Dec 22 00:54:04 1999 From: bwarsaw at CNRI.Reston.VA.US (Barry A.
Warsaw) Date: Tue, 21 Dec 1999 18:54:04 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> <199912212340.SAA13851@eric.cnri.reston.va.us> <14432.4287.543786.308468@anthem.cnri.reston.va.us> <199912212349.SAA13892@eric.cnri.reston.va.us> Message-ID: <14432.4892.908107.421149@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> Yes, but class-based user exceptions existed many Python Guido> versions before class-based standard exceptions! True, but I suspect that legacy class-based user exceptions are rare. I might be wrong, but you're absolutely right that these would all be broken. Guido> Two examples in the standard library: ConfigParser.py and Guido> xdrlib.py. Fortunately these are fixed with two 11 character patches :) I'm not necessarily arguing for or against tightening this. -Barry From gmcm at hypernet.com Wed Dec 22 00:55:07 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Tue, 21 Dec 1999 18:55:07 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <199912212340.SAA13851@eric.cnri.reston.va.us> References: Your message of "Tue, 21 Dec 1999 18:27:31 EST." <14432.3299.404561.698836@anthem.cnri.reston.va.us> Message-ID: <1266302877-22249299@hypernet.com> [Guido] > I wish I had done it right from the start -- then exceptions > would have been classes from the start and would have required > inheritance from the Exception base class. Like in Java. (And > in C++?) In C++ you can throw anything at all. Strings, ints, that Warsaw blockhead... 
off-topic-ly y'rs - Gordon From tismer at appliedbiometrics.com Wed Dec 22 01:57:27 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Wed, 22 Dec 1999 01:57:27 +0100 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> Message-ID: <386021F7.4F94C458@appliedbiometrics.com> Guido van Rossum wrote: > > [Barry] > > Guido> Anybody got a reason why -X should stay? > > > > Kill it. > > You already said that. > > Anybody else? I'd say kill -X, but keep allowing string exceptions if it doesn't cost too much. I think of C++, like Gordon said. Also I'd take the chance and move the exceptions Python module back into the core, as a frozen module or whatever. Reason: At the moment, the CVS version of the Python library is incompatible with 1.5.2, which makes testing against the standard dist quite inconvenient. A compiled CVS Python does not run under PythonWin when I put it into my standard installation. Or is there an easy way to switch all settings to a completely different path? Anyway, I'm most probably off until Y2K. See ya all then, provided we survive - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break!
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From guido at CNRI.Reston.VA.US Wed Dec 22 02:01:16 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 21 Dec 1999 20:01:16 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Your message of "Wed, 22 Dec 1999 01:57:27 +0100." <386021F7.4F94C458@appliedbiometrics.com> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <386021F7.4F94C458@appliedbiometrics.com> Message-ID: <199912220101.UAA14109@eric.cnri.reston.va.us> > I'd say kill -X, but keep allowing string exceptions if > it doesn't cost too much. I think of C++, like Gordon said. Agreed. > Also I'd take the chance and move the exceptions Python > module back into the core, as a frozen module or whatever. > > Reason: At the moment, the CVS version of the Python library > is incompatible with 1.5.2, which makes testing against the > standard dist quite inconvenient. A compiled CVS Python > does not run under PythonWin when I put it into my standard > installation. Or is there an easy way to switch all settings > to a completely different path? Point the PYTHONHOME variable to the top of your install directory. (On Windows you may have to kill the registry settings -- this is a bug.) > Anyway, I'm most probably off until Y2K. Ditto.
> See ya all then, provided we survive - chris Best wishes to all, --Guido van Rossum (home page: http://www.python.org/~guido/) From jim at digicool.com Wed Dec 22 14:54:41 1999 From: jim at digicool.com (Jim Fulton) Date: Wed, 22 Dec 1999 08:54:41 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> Message-ID: <3860D821.576B3146@digicool.com> Guido van Rossum wrote: > > (The next step would be to outlaw raise with a string argument; I > think I can't make that for 1.6. But it would be a good idea to scan > the standard library for string exceptions and convert all of them.) This would be waaaaay too big a change for Python 1.x. There are a lot of Python modules outside the standard distribution that use string exceptions. This would be a huge backward incompatibility. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list without my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From fdrake at acm.org Wed Dec 22 15:23:29 1999 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 22 Dec 1999 09:23:29 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <199912212305.SAA13722@eric.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> Message-ID: <14432.57057.535205.558@weyr.cnri.reston.va.us> Guido van Rossum writes: > (The next step would be to outlaw raise with a string argument; I > think I can't make that for 1.6. But it would be a good idea to scan > the standard library for string exceptions and convert all of them.) I don't know if requiring class-based exceptions will make the runtime any simpler, but that seems the only reason to do it. The only reason to remove -X, and possibly the string exception fallback code, is to ensure that we *can* subclass Exception and friends without having to catch TypeError and do something different. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From fdrake at acm.org Wed Dec 22 15:25:33 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed, 22 Dec 1999 09:25:33 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <14432.2946.857539.898577@anthem.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> Message-ID: <14432.57181.944364.427093@weyr.cnri.reston.va.us> Barry A. Warsaw writes: > Or require that exception classes be derived from exceptions.Exception > :) Ok, it's early, and maybe I haven't had enough coffee(!). But is this serious? Does JPython gain some benefit from this, is it your preference, or are you just yanking on my leg? ("Pulling my arm" as my 5-year-old says!) -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives From guido at CNRI.Reston.VA.US Wed Dec 22 15:40:39 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 22 Dec 1999 09:40:39 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Your message of "Wed, 22 Dec 1999 09:23:29 EST." <14432.57057.535205.558@weyr.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.57057.535205.558@weyr.cnri.reston.va.us> Message-ID: <199912221440.JAA16198@eric.cnri.reston.va.us> > From: "Fred L. Drake, Jr." > > Guido van Rossum writes: > > (The next step would be to outlaw raise with a string argument; I > > think I can't make that for 1.6. But it would be a good idea to scan > > the standard library for string exceptions and convert all of them.) > > I don't know if requiring class-based exceptions will make the > runtime any simpler, but that seems the only reason to do it. Do what? *Require* class exceptions? You're probably right, and I think the gain is minimal. There's another reason to scan the std library though -- not to set a bad example. I want to eventually (in 2.0) move to a class-derived-from-Exception-only scheme. > The only reason to remove -X, and possibly the string exception > fallback code, is to ensure that we *can* subclass Exception and > friends without having to catch TypeError and do something different. And that's a very good reason indeed. Let me repeat my plans for 1.6. - Remove -X; the standard exceptions are always class-based. - Change all standard library and other example code to use class-based exceptions with a standard exception as base class, to set an example. - Still allow string exceptions in user code. - Still allow class exceptions that don't use a standard exception base class in user code. 
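[Sketch of the plan above, not part of the original message.] A minimal illustration of the second and third bullets, written in present-day Python 3 syntax rather than the 1.5.2 of this thread. The names OptionError, error and parse_flag are hypothetical; the shape loosely follows the GetoptError patch quoted earlier in the thread, with a standard exception as base class and a module-level alias kept for old callers.

```python
class OptionError(Exception):
    """Hypothetical library exception with a standard exception as its base."""

    def __init__(self, msg, opt=""):
        # Pass both values to Exception so that .args stays populated.
        super().__init__(msg, opt)
        self.msg = msg
        self.opt = opt

    def __str__(self):
        return self.msg


# Old code that wrote "except module.error:" keeps working through this alias.
error = OptionError


def parse_flag(arg):
    """Hypothetical helper: strip the leading dash from a one-dash option."""
    if not arg.startswith("-") or len(arg) < 2:
        raise OptionError("not an option: " + arg, arg)
    return arg[1:]


try:
    parse_flag("verbose")
except error as exc:
    # Catching through the alias still yields the class instance,
    # so the extra attribute is available.
    print(exc, "| offending argument:", exc.opt)
```

Because the alias and the class are the same object, callers can catch either name, and subclasses of OptionError are caught by both.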
--Guido van Rossum (home page: http://www.python.org/~guido/) From marangoz at python.inrialpes.fr Wed Dec 22 19:09:47 1999 From: marangoz at python.inrialpes.fr (Vladimir Marangozov) Date: Wed, 22 Dec 1999 19:09:47 +0100 (CET) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <199912221440.JAA16198@eric.cnri.reston.va.us> from "Guido van Rossum" at Dec 22, 1999 09:40:39 AM Message-ID: <199912221809.TAA25322@python.inrialpes.fr> Guido van Rossum wrote: > > [Fred Drake] > > I don't know if requiring class-based exceptions will make the > > runtime any simpler, but that seems the only reason to do it. > > Do what? *Require* class exceptions? You're probably right, and I > think the gain is minimal. Yes. Besides, I still think that string-based exceptions are just convenient for quick & dirty, throw-away test scripts. > > Let me repeat my plans for 1.6. > > - Remove -X; the standard exceptions are always class-based. > > - Change all standard library and other example code to use > class-based exceptions with a standard exception as base class, to set > an example. > > - Still allow string exceptions in user code. > > - Still allow class exceptions that don't use a standard exception > base class in user code. Sounds okay. --- PS: I'm particularly happy today :-) because I've finally published the new version of our Web site http://www.inrialpes.fr. Two things I'd like to mention: (1) it shouldn't have been possible without quick Python scripts ;) (2) I'll find the time to reinvoke some of the topics discussed here instead of being mute as a fish. That said, Merry Christmas and a Happy New Year to all of you! 
-- Vladimir MARANGOZOV | Vladimir.Marangozov at inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From guido at CNRI.Reston.VA.US Wed Dec 22 19:23:45 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 22 Dec 1999 13:23:45 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Your message of "Wed, 22 Dec 1999 19:09:47 +0100." <199912221809.TAA25322@python.inrialpes.fr> References: <199912221809.TAA25322@python.inrialpes.fr> Message-ID: <199912221823.NAA16517@eric.cnri.reston.va.us> Vladimir.Marangozov at inrialpes.fr: > Yes. Besides, I still think that string-based exceptions are just > convenient for quick & dirty, throw-away test scripts. They have a hard-to-understand quirk though: the id() of the string is used to check rather than its value, so that except "foo" doesn't necessarily catch raise "foo"; but due to various optimizations, this usually works, and people get bent out of shape when it doesn't. Since you have to give your exception a name, how hard is it to say class MyError(Exception): pass rather than MyError = "MyError" ? --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein at lyra.org Wed Dec 22 19:33:19 1999 From: gstein at lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 10:33:19 -0800 (PST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <199912221823.NAA16517@eric.cnri.reston.va.us> Message-ID: On Wed, 22 Dec 1999, Guido van Rossum wrote: > Vladimir.Marangozov at inrialpes.fr: > > Yes. Besides, I still think that string-based exceptions are just > > convenient for quick & dirty, throw-away test scripts.
> > They have a hard-to-understand quirk though: the id() of the string is > used to check rather than its value, so that except "foo" doesn't > necessarily catch raise "foo"; but due to various optimizations, this > usually works, and people get bent out of shape when it doesn't. > Since you have to give your exception a name, how hard is it to say > > class MyError(Exception): pass > > rather than > > MyError = "MyError" > > ? It is very hard. My fingers do the typing for me, and they fill in strings. I'm trying to teach them otherwise, but they insist. You're also assuming that MyError gets defined. Sometimes, my little fingers like typing: try: foo except: raise "foo broke for some reason" Quick and dirty, indeed! :-) Happy Holidays, -g -- Greg Stein, http://www.lyra.org/ From fdrake at acm.org Wed Dec 22 20:59:55 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed, 22 Dec 1999 14:59:55 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <199912212340.SAA13851@eric.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> <199912212340.SAA13851@eric.cnri.reston.va.us> Message-ID: <14433.11707.607533.698901@weyr.cnri.reston.va.us> Guido van Rossum writes: > I wish I had done it right from the start -- then exceptions would > have been classes from the start and would have required inheritance > from the Exception base class. Like in Java. (And in C++?) I've seen this said or hinted at in a couple of places (the specific requirement that exception derive from Exception), but I've seen nothing that indicates any reason or derived value for this. Could someone please clarify? -Fred -- Fred L. Drake, Jr.
Corporation for National Research Initiatives From guido at CNRI.Reston.VA.US Wed Dec 22 21:05:52 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 22 Dec 1999 15:05:52 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Your message of "Wed, 22 Dec 1999 14:59:55 EST." <14433.11707.607533.698901@weyr.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> <199912212340.SAA13851@eric.cnri.reston.va.us> <14433.11707.607533.698901@weyr.cnri.reston.va.us> Message-ID: <199912222005.PAA17291@eric.cnri.reston.va.us> > From: "Fred L. Drake, Jr." > Guido van Rossum writes: > > I wish I had done it right from the start -- then exceptions would > > have been classes from the start and would have required inheritance > > from the Exception base class. Like in Java. (And in C++?) > > I've seen this said or hinted at in a couple of places (the specific > requirement that exception derive from Exception), but I've seen > nothing that indicates any reason or derived value for this. Could > someone please clarify? It's simply an extra bit of checking that your program is reasonable -- if you accidentally raise a non-exception class, there's probably something wrong with your program, and it gives the reader a hint about the intended use of the class. Other languages (e.g. Modula-3) have a specific exception type that can be used only for that one purpose. However it's useful to allow methods and subclassing of exceptions, so they might as well be classes. So, all exceptions are classes. But not all classes are exceptions.
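[Sketch of the point above, not part of the original message.] The "extra bit of checking" can be observed in present-day Python 3, which rejects a raise of any class that does not derive from BaseException. This is not the 1.5.2 behaviour discussed in the thread, where such a raise was still legal. The names ConfigError, Settings and outcome_of_raising are hypothetical.

```python
class ConfigError(Exception):
    """Hypothetical exception class; inherits from the standard base class."""


class Settings:
    """Hypothetical ordinary class; never meant to be raised."""


def outcome_of_raising(cls):
    """Raise an instance of cls and report the name of what was actually caught."""
    try:
        raise cls()
    except Exception as exc:
        # For a non-exception class, Python 3 raises TypeError instead.
        return type(exc).__name__


print(outcome_of_raising(ConfigError))  # an exception class is raised as itself
print(outcome_of_raising(Settings))     # the accidental raise surfaces as TypeError
```

The accidental raise of Settings is reported at the raise statement itself, which is the kind of hint about intended use that the message describes.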
--Guido van Rossum (home page: http://www.python.org/~guido/) From gstein at lyra.org Wed Dec 22 21:11:43 1999 From: gstein at lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 12:11:43 -0800 (PST) Subject: [Python-Dev] Please test new dynamic load behavior Message-ID: Hi all, I reorganized Python's dynamic load/import code over the past few days. Guido provided some feedback, I did some more mods, and now it is checked into CVS. The new loading behavior has been tested on Linux, IRIX, and Solaris (and probably Windows by now). For people with CVS access, I'd like to ask that you grab an updated copy and shake out the new code. There have been updates to the "configure" process, so you'll need to run configure again. Make sure that you alter your Modules/Setup to build some shared modules, and then try it out. Here are some of the platforms that I believe need specific testing: - NetBSD, FreeBSD, OpenBSD, ... - AIX - HP/UX - BeOS - NeXT - Mac - OS/2 - Win16 I believe it should work for most people, but we may be looking for the wrong "init" symbol on some platforms. We might even be selecting the wrong import mechanism (or missing it altogether!) on some platforms. If you get a chance to test this, then please drop me a note with your platform and whether it succeeded or failed (and how it failed). Thanx! -g p.s. you can tell if dynamic loading is missing by watching for DYNLOADFILE in the configure process and seeing if it used dynload_stub. alternatively, you can import the "imp" module and see if "load_dynamic" is missing. -- Greg Stein, http://www.lyra.org/ From gvwilson at nevex.com Thu Dec 23 04:43:40 1999 From: gvwilson at nevex.com (gvwilson at nevex.com) Date: Wed, 22 Dec 1999 22:43:40 -0500 (EST) Subject: [Python-Dev] re: Open Source design competition / Python / software tools Message-ID: Hi, folks.
I hope you don't mind another mail out of the blue, but I got notice on Saturday that the Department of Energy is giving me $860K over two years to support development of easier-to-use software engineering tools. All of the work will be Open Source, and will be done in Python, with a strong emphasis on design, testing, and documentation. The project's long-term objective is to encourage scientists and engineers to treat programs in the same way as they do other experiments, i.e. to calibrate, test, peer review, and so on. To kick-start things, we're going to be holding a two-round design competition. Anyone (individual or team, professional or student) can submit a short entry for the first round; the judges will pick four candidates to go forward in each of four categories, and those individuals or teams will be asked to submit full entries. The four categories are: * an issue tracking system to replace Gnats and Bugzilla; * a build system to replace make; * a platform inspection and configuration system to replace autoconf; and * a testing framework to replace XUnit, Expect, and DejaGnu. Would you be interested in participating in any way---judging, entering a design, critiquing things from the point of view of end users, or anything else? I realize that you're probably up past your eyeballs with work, and that the money on offer is nothing special, but I think this could be a lot of fun, and could help to shift the emphasis of the Open Source community from hacking to design (both by drawing attention to, and rewarding, design, and by creating a corpus of examples and commentary for programmers to refer to). It could also make life a lot easier for computational scientists and engineers... Please let me know if you'd like to be involved, or if you'd like more information than is contained in the FAQ (attached).
Timescales are a bit tight---I'd like to be able to make an announcement on January 14---but I'll be reading email at this address several times a day during the holiday.

I look forward to hearing from you,
Greg Wilson

p.s. please note that the attached FAQ is a first draft; I'd be grateful if you could show it to anyone you think might be interested, but I'd also be grateful if you wouldn't broadcast it until it's gone through one more editing pass.

-------------- next part --------------
Software Carpentry FAQ


General information

  1. What is the Software Carpentry project?
    The aim of the Software Carpentry project is to make it easier for programmers in general, and scientific programmers in particular, to adopt better software development practices. The project will achieve this by creating tools that are easier to learn and use, and by documenting those tools and the practices they embody.
  2. Where does the name come from?
    The name is a play on "software engineering", and is meant to indicate that this project is initially concerned with medium-sized teams (up to a dozen or two programmers) and medium-term timescales (a year or two).
  3. How did the project get started?
    The project has its origins in a series of articles that Greg Wilson organized for the Fall 1996 and Winter 1996 issues of IEEE Computational Science and Engineering. These articles outlined what their authors thought computer scientists should teach to physical scientists and engineers. Most authors recommended numerical methods or the standard Unix toolset, but Steve McConnell argued that better programming practices would have the greatest impact on productivity.
    As a result of that observation, Greg Wilson, Brent Gorda, and Steve McConnell put together a 3-day course on software engineering for scientists and engineers, which they taught several times at the Los Alamos National Laboratory. Feedback on the course was very positive, but many participants felt that the tools being taught---Perl, Make, CVS, and so on---were unnecessarily difficult to install, learn, and use. They were also frustrated by the scarcity of examples of design documents, testing plans, and all of the other things the course was trying to teach them.
  4. Why Open Source?
    There are three reasons why the Software Carpentry project is following the Open Source model:
    1. Leveraging existing knowledge.
      A closed project can only take advantage of a few minds. As Linux and other projects have shown, a well-run Open Source project can harness the experience and insight of thousands of people.
    2. Lowering barriers to adoption.
      Freely-available tools are more likely to be picked up than their commercial equivalents. This is particularly true when the tool in question does something novel (at least from the point of view of the person adopting it), and in academia (where budgets are limited).
    3. Encouraging peer review.
      Dan Gezelter's talk at the first Open Source/Open Science conference discussed how the scientific tradition of peer review fits with the philosophy of the Open Source movement. By designing and building these tools in the open, the Software Carpentry project will both encourage peer review of the tools themselves, and demonstrate how this ought to be done for scientific and commercial software.
  5. Where does the funding come from?
    The funding comes from the U.S. Department of Energy, through the Advanced Computing Laboratory at Los Alamos National Laboratory. The project is being administered by Code Sourcery. US$480,000 has been provided for 2000, and US$380,000 for 2001.
  6. Why would the Department of Energy fund something like this?
    The funding has been provided partly because the DoE would like scientists and engineers to be more productive, and partly because it would like to find out whether the Open Source model and community can meet the special needs of high-performance computational science. The last few years have seen most manufacturers of special-purpose supercomputers disappear or be bought out, and the rise of clusters based on commercial off-the-shelf (COTS) hardware, Linux, MPI, the GNU compiler toolset, and so on. There is a growing feeling that these machines could bring scalable supercomputing into the mainstream, but this will only happen if good tools and practices are accessible enough.
  7. I'm not a scientist or engineer---what's in it for me?
    The things that make many existing Open Source software development tools difficult to learn and use---obscure syntax, arbitrary or hard-to-follow behavior, and poor documentation---affect professional programmers and computer science students just as much as they do computational scientists and engineers. If the Open Source movement can build tools that are simple enough to be learned by people who have problems of their own to solve, and yet powerful enough to support distributed development of hundreds of thousands of lines of complex numerical and visualization code, then those tools will probably also help people who want to build Internet chat rooms and order-tracking systems.
    This project should also be interesting to the general programming community because it is going to place more emphasis on design and early feedback than most Open Source projects have to date. Instead of growing someone's pet project, Software Carpentry is going to organize---and pay for---a design competition. If this works, it could be an interesting model for other Open Source projects to adopt.
  8. I think [tool] is good enough already---why are you re-inventing the wheel?
    The short answer to this is Alan Cooper's:
    The phrase "computer literate user" really means the person has been hurt so many times that the scar tissue is thick enough so he no longer feels the pain.
    -- Alan Cooper, The Inmates are Running the Asylum
    The longer answer is that the "accidental complexity" of the standard Unix command-line toolset is a major barrier to its adoption by people who are not full-time programmers, or for whom programming is just something that has to be done in order to do something else. Many professional programmers---particularly those who enjoy programming enough to be involved in the Open Source movement---have been using these tools for so long that they simply don't remember how hard it is to configure Gnats, or pass variable bindings between recursive calls to Make.
    And let's face it: if Make or Autoconf were built from scratch today, they would be written as extensible, embeddable modules in a high-level scripting language. This would not only make them easier to use, it would also make them easier to learn, since they would employ one syntax for all purposes. Microsoft Visual Basic has shown just how useful it can be to have a single general-purpose "glue" language capable of binding disparate tools together; the aim of the first half of this project is to bring those benefits to the Open Source community.

Development

  1. What projects are currently under way?
    Software Carpentry will start by producing:
    1. a platform inspection tool similar to Autoconf;
    2. a build management tool similar to Make;
    3. an issue tracking system similar to Gnats or Bugzilla; and
    4. a unit and regression testing harness with the functionality of XUnit, Expect, and DejaGnu.
  2. Why were those tools chosen?
    These four tools were chosen as initial targets for several reasons. First, the working practices they support are essential to medium-scale software engineering. Second, the tools they are intended to replace are generally recognized as being outdated or flawed. This creates demand, and increases the odds that rational reimplementations will be adopted. Third, enough people have enough experience with the tools that are to be replaced to participate in the design competition described later.
  3. Why isn't [tool] on this list?
    There are several other tools that could have been on this list, and will be added if the first round of work goes well. A cross-platform version control system that corrects the many deficiencies in CVS, for example, is an obvious candidate, but is probably too large to be tackled initially, and any work done by Software Carpentry could well be superseded by BitKeeper. Similarly, the world needs a good Open Source project management tool with the functionality of Microsoft Project, but probably needs the four tools listed above more urgently.
  4. What languages and tools will be used?
    All development work will be done in Python.
  5. Why Python?
    This is actually three questions:
    1. Why mandate a language?
      Building everything in a single language will encourage projects to share code, which will both keep the total volume of code manageable and raise the quality of the implementations (since the shared code will be exercised, and tested, in many different ways). Using a single language will also improve the comprehensibility, and hence the maintainability and extensibility, of the tools. The varying syntax of Make, Autoconf, and other tools is a large practical barrier to their adoption by people who have better (or at least more pressing) things to do than learn yet another syntax. Microsoft's Visual Basic has shown how powerful it is to use a single, flexible language everywhere.
    2. Why use a scripting language?
      A lot of anecdotal evidence shows that "relaxed" high-level languages (like Python, Perl, and Visual Basic) are more productive vehicles for process management, text processing, and similar tasks than their "strict" equivalents (like C++ and Java).
    3. Why use Python?
      The four candidates considered were Visual Basic, Perl, Tcl, and Python.
      1. Visual Basic
        Visual Basic is proprietary, and there is no indication that a credible Open Source implementation will appear any time soon.
      2. Perl
        Perl was a strong contender, primarily because of the many libraries that have been developed for it, and because of the number of books that document it. However, our experience teaching at Los Alamos was that Perl's syntax is hard to learn, its behavior often arbitrary, and its size intimidating. While full-time professional programmers with several other languages under their belts might (and often do) say that it all makes sense once you know it, we want to make the learning curve as gentle as possible.
      3. Tcl
        Tcl is easier to learn and read than Perl, but is not as well documented, and doesn't come with as many libraries. Had Python not existed, Tcl would probably have been chosen for this project.
      4. Python
        Python provides the same functionality as Perl or Tcl, but has proved to be easier to learn, read, and remember. (For example, words like "except" and "unless" appear much less often in Python reference material than they do in Perl reference material.) Python is not yet as extensively documented as Perl, but the number of books is growing, as is the number of modules and libraries. Finally, the Python community is still small enough for a project like this one to attract the attention of a significant proportion of it.
  6. How will development be organized and coordinated?
    Everything the project produces---designs, critiques of those designs, test suites, and examples, as well as actual source code---will be available through the project's Web site at software-carpentry.codesourcery.com. Each project will have a coordinator, whose job it will be to moderate discussion, synchronize releases, track work items, and report on progress. The coordinator will also be responsible for collating and editing feedback from judges during the design competition.

Design competition

  1. Why a design competition?
    Most Open Source packages have their roots in someone's pet hobby project, which others have picked up, extended, and modified. This kind of organic growth has a lot of good features, but a well-documented design is not one of them. As a result, programmers often have to rely on folklore and reverse engineering if they want to add to, or fix, these tools. In addition, there is a dearth of examples of good design for new programmers to learn from.
    The Software Carpentry project hopes to address both problems by running a two-stage design competition. The best entries in both rounds will be published, along with commentary from the competition's judges. This material will serve both to inform and guide further development, and to show novices what experienced programmers think about before they start coding.
  2. Who can enter?
    Everyone: individuals and teams, students and professionals, from anywhere in the world.
  3. What are the rules?
    The full rules are available at:
    software-carpentry.codesourcery.com/design-competition/rules.html
    Basically, initial submissions must be written in English, and can be up to 10 pages long. Examples count against this limit, but diagrams and a Unix-style man page do not. Any person or team may submit only one entry in any given category, but can submit in as many of the four categories as desired.
    The best four entries in each category will be awarded US$2500, and asked to submit full designs. Participants will be strongly encouraged to pool their efforts for the second round. The best second-round submission will be awarded an additional US$7500, while the others will receive another US$2500 each. The real reward will be seeing the design implemented, and being in a good position to bid on the implementation work.
  4. What should first-round submissions contain?
    An example of what a submission should contain, and how it should be formatted, is available at:
    software-carpentry.codesourcery.com/design-competition/example.html
    First-round entries should focus primarily on what the tool will do, and how it will be used: command-line options, input and output file formats, sketches of Web and GUI interfaces (where appropriate), and so on. Second-round submissions will then be expected to describe how it's all going to be implemented.
  5. Who will the judges be?
    Need to firm up the list of judges ASAP.
  6. When are the deadlines?
    The deadline for first-round submissions is March 31, 2000. The four best proposals in each category will be announced on April 30, 2000. Full submissions are due on June 1, 2000, and winners will be announced on June 30, 2000.
  7. Won't prizes discourage co-operation?
    We don't know. On the one hand, people might want to hoard their best ideas; on the other hand, the best designs in both rounds are going to be published, along with the judges' commentary, and we will be encouraging participants to pool their efforts. Most of the money that will be paid out will go to fund implementation, testing, and documentation; we hope that people will collaborate in the early stages, and treat the prizes as recognition for their effort, rather than treating US$10,000 as their retirement fund.

Documentation

  1. What documentation will be produced?
    The Software Carpentry project will produce several different kinds of documentation:
    1. Design documentation.
      As stated above, the best designs in each category will be published, along with the judges' commentary. This material ought to play the role that music criticism has played in the development of music, by giving newcomers (and experienced programmers) better insight into how good designers think.
    2. User guides.
      The project will pay for the development of man pages, user guides, online help, and all the other documentation needed to turn a program into a product.
    3. Test suites.
      The project will also pay for the development of industrial-strength test suites for all four tools. These suites will be published, both to serve as a starting point for other projects and to demonstrate good practice.
    4. Case studies.
      It is often easier to show someone how to do something than to explain it to them. The Software Carpentry project will pay for case studies that describe how these tools, and (more importantly) the working practices they support, have been deployed in practice. Checklists, templates for forms, and other errata can be submitted.
  2. What format(s) will be used?
    The primary format for all documentation will be HTML. The project will migrate to XML when and as feasible.
  3. What restrictions are there on using the documentation?
    Only those that also apply to the software, under the terms of its Open Source license. You can copy and distribute the documentation in any form, but only if its author(s) and origin are clearly shown, and if you include a description of how readers can access the originals. In particular, the documentation can be reproduced in books, but only if the authors, origin, and location of the originals are printed clearly on each page.
From jack at oratrix.nl Thu Dec 23 11:24:26 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Thu, 23 Dec 1999 11:24:26 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Message by Guido van Rossum , Wed, 22 Dec 1999 13:23:45 -0500 , <199912221823.NAA16517@eric.cnri.reston.va.us>
Message-ID: <19991223102426.CCB75370CF2@snelboot.oratrix.nl>

> Vladimir.Marangozov at inrialpes.fr:
>
> > Yes. Besides, I still think that string-based exceptions are just
> > convenient for quick & dirty, throw-away test scripts.
>
> They have a hard-to-understand quirk though: the id() of the string is
> used to check rather than its value, so that except "foo" doesn't
> necessarily catch raise "foo"; but due to various optimization, this
> usually works, and people get bent out of shape when it doesn't.

I sort-of use this feature when I'm debugging: if I want to know what happens in an exception that is usually caught somewhere higher up in the call stack I simply put quotes around the exception name and the exception will happen uncaught. The same trick works for except: clauses.
--
Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm

From harri.pasanen at trema.com Thu Dec 23 12:44:04 1999
From: harri.pasanen at trema.com (Harri Pasanen)
Date: Thu, 23 Dec 1999 13:44:04 +0200
Subject: [Python-Dev] Re: [PSA MEMBERS] Please test new dynamic load behavior
References:
Message-ID: <38620B04.7CC64485@trema.com>

Greg Stein wrote:
>
> Hi all,
>
> I reorganized Python's dynamic load/import code over the past few days.
> Guido provided some feedback, I did some more mods, and now it is checked
> into CVS. The new loading behavior has been tested on Linux, IRIX, and
> Solaris (and probably Windows by now).
> ...

What was the motivation behind this modification?
Just curious,
-Harri

From marangoz at python.inrialpes.fr Thu Dec 23 13:12:40 1999
From: marangoz at python.inrialpes.fr (Vladimir Marangozov)
Date: Thu, 23 Dec 1999 13:12:40 +0100 (CET)
Subject: [Python-Dev] Please test new dynamic load behavior
In-Reply-To: from "Greg Stein" at Dec 22, 1999 12:11:43 PM
Message-ID: <199912231212.NAA26572@python.inrialpes.fr>

Greg Stein wrote:
>
> Hi all,
>
> I reorganized Python's dynamic load/import code over the past few days.
> Guido provided some feedback, I did some more mods, and now it is checked
> into CVS. The new loading behavior has been tested on Linux, IRIX, and
> Solaris (and probably Windows by now).
>

Great work Greg!

> Here are some of the platforms that I believe need specific testing:
>
> - NetBSD, FreeBSD, OpenBSD, ...
> - AIX
> - HP/UX
> - BeOS
> - NeXT
> - Mac
> - OS/2
> - Win16

AFAICT, the AIX version works perfectly okay.

--
Vladimir MARANGOZOV | Vladimir.Marangozov at inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252

From jim at digicool.com Thu Dec 23 15:41:23 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 09:41:23 -0500
Subject: [Python-Dev] str(1L) -> '1' ?
Message-ID: <38623493.E6BA6D6F@digicool.com>

In November there was an interesting discussion on comp.lang.python about the meaning of __str__ and __repr__. One tidbit that came out of this discussion was that __str__ for longs should drop the trailing 'L'. Was there a decision on this? I'd really like this to happen.

We do a lot of work with RDBMS systems and long integers seem to come up a lot with these systems (as do other fixed-decimal numbers, but that's another topic ;). For example, our latest Sybase and Oracle support in Zope returns long integers for RDBMS types like NUMBER(10,0). The trailing 'L' in the string representation is causing us some headaches. This seems also to be an issue when using the current standard ODBC interface with Oracle, as indicated in a DB-SIG post today.
Jim

--
Jim Fulton mailto:jim at digicool.com Python Powered!
Technical Director (888) 344-4332 http://www.python.org
Digital Creations http://www.digicool.com http://www.zope.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.

From guido at CNRI.Reston.VA.US Thu Dec 23 15:46:58 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 23 Dec 1999 09:46:58 -0500
Subject: [Python-Dev] str(1L) -> '1' ?
In-Reply-To: Your message of "Thu, 23 Dec 1999 09:41:23 EST." <38623493.E6BA6D6F@digicool.com>
References: <38623493.E6BA6D6F@digicool.com>
Message-ID: <199912231446.JAA22086@eric.cnri.reston.va.us>

[Jim F]
> In November there was an interesting discussion on comp.lang.python
> about the meaning of __str__ and __repr__. One tidbit that came out
> of this discussion was that __str__ for longs should drop the trailing
> 'L'. Was there a decision on this? I'd really like this to happen.

Yes, I'd like it to happen. I'd also like repr() of a float to return the full precision (using the "%.17g" sprintf format). I haven't done it for lack of time -- feel free to send a patch (don't forget the disclaimer from http://www.python.org/1.5/bugrelease.html).

We haven't decided yet what to do with the greater topic of that discussion (or was it a different one?) -- whether the values printed by typing a bare expression in interactive mode should use str(), repr(), or str-special-casing-the-snot-out-of-strings().
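
To illustrate the "%.17g" format mentioned above, here is a minimal sketch in present-day Python. It is not code from this thread, and the helper names `short_repr` and `full_repr` are made up for the example. Seventeen significant digits are enough for an IEEE-754 double to survive a round trip through its string form, whereas a shorter format such as "%.12g" can lose information:

```python
def short_repr(x):
    """Format a float with 12 significant digits (hypothetical helper)."""
    return "%.12g" % x


def full_repr(x):
    """Format a float with 17 significant digits, the "%.17g" format."""
    return "%.17g" % x


x = 0.1 + 0.2

print(short_repr(x))              # short form: the low-order bits are dropped
print(full_repr(x))               # long form: all 17 significant digits

# Converting the short form back to a float does not recover x ...
print(float(short_repr(x)) == x)  # False
# ... while the 17-digit form does.
print(float(full_repr(x)) == x)   # True
```

The trade-off is readability: the 17-digit form exposes digits such as the trailing `...04` that the shorter form hides, which bears on the open question of what the interactive prompt should print.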
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jim at digicool.com Thu Dec 23 15:51:14 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 09:51:14 -0500
Subject: [Python-Dev] Fixed-decimal types
Message-ID: <386236E2.F97109D3@digicool.com>

While on the subject of RDBMS systems, a common need is to be able to work with fixed-decimal data. I think a standard Python fixed-decimal type would help to make Python database interfaces a lot more robust. I even wonder if the Python long type might be hijacked for this purpose by adding a "scale" that indicates the number of digits to the right of the decimal point. For example, an expression like:

  1000000000.2500L

would create a fixed decimal number with a scale of 4.

People have built Python classes for fixed-decimal types, but when working with RDBMS data, one often deals with lots of data and efficiency matters. I also suspect that adding scale to longs wouldn't be that hard and would be a fairly natural extension.

In any case, a "standard" (being in the standard library would be sufficient) fixed-decimal type would probably lead to better database interfaces that (at least more) properly handled fixed-decimal data.

Jim

--
Jim Fulton mailto:jim at digicool.com Python Powered!
Technical Director (888) 344-4332 http://www.python.org
Digital Creations http://www.digicool.com http://www.zope.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.

From guido at CNRI.Reston.VA.US Thu Dec 23 15:56:33 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 23 Dec 1999 09:56:33 -0500
Subject: [Python-Dev] Fixed-decimal types
In-Reply-To: Your message of "Thu, 23 Dec 1999 09:51:14 EST."
<386236E2.F97109D3@digicool.com>
References: <386236E2.F97109D3@digicool.com>
Message-ID: <199912231456.JAA22134@eric.cnri.reston.va.us>

What would be the scale of the product of two fixed-decimal numbers? E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to 4.00L? There are arguments for either. Same question for division (harder, I think).

I like the idea of using the dd.ddL notation for this.

I have no time to implement it but would not be unwilling to accept patches. They would have to be accompanied with a wet signature, see http://www.python.org/1.5/wetsign.html.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From jim at digicool.com Thu Dec 23 16:00:25 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 23 Dec 1999 10:00:25 -0500
Subject: [Python-Dev] re: Open Source design competition / Python / software tools
References:
Message-ID: <38623909.CDF41014@digicool.com>

gvwilson at nevex.com wrote:
>
> Hi, folks. I hope you don't mind another mail out of the blue, but I got
> notice on Saturday that the Department of Energy is giving me $860K over
> two years to support development of easier-to-use software engineering
> tools. All of the work will be Open Source, and will be done in Python,
> with a strong emphasis on design, testing, and documentation. The
> project's long-term objective is to encourage scientists and engineers to
> treat programs in the same way as they do other experiments, i.e. to
> calibrate, test, peer review, and so on.
>
> To kick-start things, we're going to be holding a two-round design
> competition. Anyone (individual or team, professional or student) can
> submit a short entry for the first round; the judges will pick four
> candidates to go forward in each of four categories, and those
> individuals or teams will be asked to submit full entries.
> The four
> categories are:
>
> * an issue tracking system to replace Gnats and Bugzilla;
>
> * a build system to replace make;
>
> * a platform inspection and configuration system to replace autoconf;
> and
>
> * a testing framework to replace XUnit, Expect, and DejaGnu.
>
> Would you be interested in participating in any way

Are these categories fixed?

I see a very strong need for an open-source UML modeling tool. UML is extremely powerful, but current UML tools largely suck and are very expensive. We are contemplating launching an open-source development effort to build UML modeling tools using Zope or the Zope object database as a repository.

A contest like this could help to kick-start this effort, but tools to automate requirements and design seem to be missing. This is odd, considering that up-front activities like requirements and design have the largest impact on software-engineering project success.

Jim

--
Jim Fulton mailto:jim at digicool.com Python Powered!
Technical Director (888) 344-4332 http://www.python.org
Digital Creations http://www.digicool.com http://www.zope.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.

From captainrobbo at yahoo.com Thu Dec 23 16:13:22 1999
From: captainrobbo at yahoo.com (Andy Robinson)
Date: Thu, 23 Dec 1999 07:13:22 -0800 (PST)
Subject: [Python-Dev] Fixed-decimal types
Message-ID: <19991223151322.5698.qmail@web604.mail.yahoo.com>

--- Guido van Rossum wrote:
> What would be the scale of the product of two
> fixed-decimal numbers?
> E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to
> 4.00L? There are
> arguments for either. Same question for division
> (harder, I think).
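
The two candidate answers to the scale question quoted above can be tried out with a small sketch. It is only an illustration under stated assumptions: it uses Python's standard-library `decimal` module (added to Python well after this thread) as a stand-in for the proposed `dd.ddL` literals, and the helper `to_scale` is a made-up name. `Decimal` multiplication keeps the sum of the operands' scales (the 4.0000 answer); rounding to a fixed scale afterwards gives the 4.00 answer.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_scale(value, digits):
    """Round a Decimal to `digits` places after the point (hypothetical helper)."""
    quantum = Decimal(10) ** -digits  # digits=2 gives Decimal('0.01')
    return value.quantize(quantum, rounding=ROUND_HALF_UP)


a = Decimal("2.00")
b = Decimal("2.00")

# Multiplication: the result's scale is the sum of the operand scales.
product = a * b
print(product)                # 4.0000
print(to_scale(product, 2))   # 4.00

# Division has no natural scale; the result is computed to the context
# precision and then rounded to whatever scale the caller asks for.
quotient = Decimal("2.00") / Decimal("3.00")
print(to_scale(quotient, 2))  # 0.67
```

Keeping full precision in the intermediate result and rounding only when a target scale is known is one possible convention, not a settled answer to the question.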
Most commonly one is trying to avoid rounding errors when dealing with money - a few cents rounding error tends to result in a few billable hours with the accountants at the end of the year! SQL dialects and type-safe languages would make you specify the precision of the variable to be assigned, so the issue does not arise for other languages.

For the work I do, simply taking the precision of the most precise input (4.00L) would do the trick, but your answer (4.0000L) is purer. We should provide a rounding function, and in practice anyone using such a function would round (or floor, or ceiling) to get to the desired precision immediately.

I'm not sure on division either but I'm sure there are precedents to look at.

On the subject of adding new types to the standard library, what are the plans on dates and times? Would a cut-down mxDateTime ever be considered? It is fully Open Source (unlike mxODBC) and was designed for the DBAPI.

Regards,

Andy

=====
Andy Robinson
Robinson Analytics Ltd.
------------------
My opinions are the official policy of Robinson Analytics Ltd. They just vary from day to day.

From guido at CNRI.Reston.VA.US Thu Dec 23 16:23:43 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 23 Dec 1999 10:23:43 -0500
Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types)
In-Reply-To: Your message of "Thu, 23 Dec 1999 07:13:22 PST." <19991223151322.5698.qmail@web604.mail.yahoo.com>
References: <19991223151322.5698.qmail@web604.mail.yahoo.com>
Message-ID: <199912231523.KAA22232@eric.cnri.reston.va.us>

> On the subject of adding new types to the standard
> library, what are the plans on dates and times? Would
> a cut-down mxDateTime ever be considered? It is fully
> Open Source (unlike mxODBC) and was designed for the
> DBAPI.
I don't know much about date/time types, or about mxDateTime. My intuition is that there are too many ways to do it, and that being compatible with commercial databases may not be the right way to do it for core Python.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From fdrake at acm.org Thu Dec 23 16:27:59 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 23 Dec 1999 10:27:59 -0500 (EST)
Subject: [Python-Dev] str(1L) -> '1' ?
In-Reply-To: <38623493.E6BA6D6F@digicool.com>
References: <38623493.E6BA6D6F@digicool.com>
Message-ID: <14434.16255.58344.646524@weyr.cnri.reston.va.us>

Jim Fulton writes:
> In November there was an interesting discussion on comp.lang.python
> about the meaning of __str__ and __repr__. One tidbit that came out
> of this discussion was that __str__ for longs should drop the trailing
> 'L'. Was there a decision on this? I'd really like this to happen.

I liked that result as well, and thought about it just the other day. Luckily, you sent a note this morning and made me think about it again. I'll have something checked into CVS shortly. ;)

-Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From Mike.Da.Silva at uk.fid-intl.com Thu Dec 23 17:30:07 1999
From: Mike.Da.Silva at uk.fid-intl.com (Da Silva, Mike)
Date: Thu, 23 Dec 1999 16:30:07 -0000
Subject: [Python-Dev] Fixed Decimal types
Message-ID:

Andy Robinson wrote:

For the work I do, simply taking the precision of the most precise input (4.00L) would do the trick, but your answer (4.0000L) is purer. We should provide a rounding function, and in practice anyone using such a function would round (or floor, or ceiling) to get to the desired precision immediately. I'm not sure on division either but I'm sure there are precedents to look at.

The AS400 provides a useful example of the right way to do scaled decimals. In the RPG programming language, all internal calculations (i.e.
multiplication, division) are performed to the maximum precision of the intermediate result (in the multiplication example below), the intermediate result would be 4.0000L. When the intermediate result is assigned to the target scaled decimal number, the decimal precision is automatically extended or truncated to fit the target precision. One extra wrinkle in all of this is the option to "half-adjust" the intermediate value on assignment; that is to apply automatic 5/4 rounding to the precision of the target. So, if the target field is defined as numeric(4,2), the result will be 4.00L. These are probably the kind of semantics that a scaled decimal type would require in Python also; i.e. allow unlimited precision in intermediate calculations, with a sensible set of rules for assignment to a variable of different scale and precision. However, unlike RPG, we should probably ensure that attempts to overflow or underflow the scale result in NaN or Overflow conditions, rather than assuming the user is right and losing the significant digits. Regards, Mike da Silva From jim at digicool.com Thu Dec 23 17:37:10 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 23 Dec 1999 11:37:10 -0500 Subject: [Python-Dev] Fixed-decimal types References: <386236E2.F97109D3@digicool.com> <199912231456.JAA22134@eric.cnri.reston.va.us> Message-ID: <38624FB6.ED903F@digicool.com> Guido van Rossum wrote: > > What would be scale of the product of two fixed-decimal numbers? > E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to 4.00L? There are > arguments for either. Same question for division (harder, I think). I'd be inclined to start by doing some research to see if some standard (SQL?) defines this somewhere. It would be nice if someone has already done the requirements work for us. :) > I like the idea of using the dd.ddL notation for this. > > I have no time to implement Me neither. > it but would not be unwilling to accept patches. Cool. 
If no one else volunteers, then I'll try to find a way to get this done (not necessarily by me). I think it is pretty important. > They would have to be accompanied with a wet signature, see > http://www.python.org/1.5/wetsign.html. Yup. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From captainrobbo at yahoo.com Thu Dec 23 17:38:50 1999 From: captainrobbo at yahoo.com (=?iso-8859-1?q?Andy=20Robinson?=) Date: Thu, 23 Dec 1999 08:38:50 -0800 (PST) Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types) Message-ID: <19991223163850.15619.qmail@web604.mail.yahoo.com> Sorry, should have replied to the list... --- Andy Robinson wrote: > Date: Thu, 23 Dec 1999 08:37:18 -0800 (PST) > From: Andy Robinson > Reply-to: andy at robanal.demon.co.uk > Subject: Re: [Python-Dev] Date and timetypes (was: > Fixed-decimal types) > To: Guido van Rossum > > --- Guido van Rossum > wrote: > > I don't know much about date/time types, or about > > mxDateTime. > > My intuition is that there are too many ways to do > > it, and that being > > compatible with commercial databases may not be > the > > right way to do it > > for core Python. > > > > OK. Let me rephrase it. Say we form a consensus on > 'the right way'. Are you amenable to some solution > which goes back before 1970 and after 2038 going > into > the standard library? > > And does your answer change if it involves some > compiled code as well? > > I mention mxDateTime because it was agreed by a > Python > SIG, is mature and stable, and I find it very > useful. 
> And the core type is pretty small - much of the > helper > stuff in the package now could be kept separate from > the main Python distribution. > > - Andy > > > ===== > Andy Robinson > Robinson Analytics Ltd. > ------------------ > My opinions are the official policy of Robinson > Analytics Ltd. > They just vary from day to day. > ===== Andy Robinson Robinson Analytics Ltd. ------------------ My opinions are the official policy of Robinson Analytics Ltd. They just vary from day to day. From guido at CNRI.Reston.VA.US Thu Dec 23 17:42:33 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 23 Dec 1999 11:42:33 -0500 Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types) In-Reply-To: Your message of "Thu, 23 Dec 1999 08:38:50 PST." <19991223163850.15619.qmail@web604.mail.yahoo.com> References: <19991223163850.15619.qmail@web604.mail.yahoo.com> Message-ID: <199912231642.LAA22598@eric.cnri.reston.va.us> > > OK. Let me rephrase it. Say we form a consensus on 'the right > > way'. Are you amenable to some solution which goes back before > > 1970 and after 2038 going into the standard library? No problem. > > And does your answer change if it involves some > > compiled code as well? I'd rather not.
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip at mojam.com Thu Dec 23 18:05:52 1999 From: skip at mojam.com (Skip Montanaro) Date: Thu, 23 Dec 1999 11:05:52 -0600 (CST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <199912212305.SAA13722@eric.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> Message-ID: <14434.22128.639699.738932@dolphin.mojam.com> Guido> (The next step would be to outlaw raise with a string argument; I Guido> think I can't make that for 1.6. But it would be a good idea to Guido> scan the standard library for string exceptions and convert all Guido> of them.) Agreed. I know Zope uses (at least, my Zope-using code uses) stuff like raise 'Redirect', url to map names onto HTTP response codes. Makes it easier on people to remember names instead of numeric codes. I suspect it will take the Zopers awhile to convert to using class-based exceptions if they haven't already. (For all I know I may be using a deprecated feature.) Skip From gvwilson at nevex.com Thu Dec 23 18:24:05 1999 From: gvwilson at nevex.com (gvwilson at nevex.com) Date: Thu, 23 Dec 1999 12:24:05 -0500 (EST) Subject: [Python-Dev] re: Open Source design competition / Python / software tools In-Reply-To: <38623909.CDF41014@digicool.com> Message-ID: Hi, everyone. I'm sending my reply to Jim's message to the whole python-dev list; I'll send follow-ups to individuals if people would prefer. > > * an issue tracking system to replace Gnats and Bugzilla; > > > > * a build system to replace make; > > > > * a platform inspection and configuration system to replace autoconf; > > and > > > > * a testing framework to replace XUnit, Expect, and DejaGnu. > Jim Fulton asked: > Are these categories fixed? 
For the first round, yes --- I have to prove that this model can solve small problems before I'll be given the funding to tackle larger ones, and I think that a UML modeling tool is definitely "large" :-). I also have to demonstrate uptake, and I think more people will adopt a sane replacement for Autoconf in the next 18 months than would adopt a UML modeler. However, decent Open Source CASE tools are very (very) high on my personal list --- if this works, I'd like to tackle them (along with providing support for DDD, and a few other things like that). Greg From gstein at lyra.org Thu Dec 23 19:26:44 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 23 Dec 1999 10:26:44 -0800 (PST) Subject: [Python-Dev] Re: Please test new dynamic load behavior In-Reply-To: <38620B04.7CC64485@trema.com> Message-ID: On Thu, 23 Dec 1999, Harri Pasanen wrote: > Greg Stein wrote: > > Hi all, > > > > I reorganized Python's dynamic load/import code over the past few days. > > Guido provided some feedback, I did some more mods, and now it is checked > > into CVS. The new loading behavior has been tested on Linux, IRIX, and > > Solaris (and probably Windows by now). > > ... > > What was the motivation behind this modification? Harri - With the new code structure, it is much easier to maintain Python's loading code. Each platform has its own file (e.g. dynload_aix.c) rather than being all jammed together into importdl.c. This isn't a huge win by itself, but does increase readability/maintainability. The big improvement, however, is when you are adding support for new platforms or loading mechanisms. A new dynload_*.c can be written and one line added to configure.in, and you're done. No need to make importdl.c even uglier.
(actually, importdl.c no longer contains *any* platform specific code; it has all been moved to the dynload_*.c files) Cheers, -g -- Greg Stein, http://www.lyra.org/ From jim at digicool.com Thu Dec 23 20:39:37 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 23 Dec 1999 14:39:37 -0500 Subject: [Python-Dev] Fixed-decimal types References: <386236E2.F97109D3@digicool.com> <199912231456.JAA22134@eric.cnri.reston.va.us> <38624FB6.ED903F@digicool.com> Message-ID: <38627A79.BF379672@digicool.com> Jim Fulton wrote: > > Guido van Rossum wrote: > > > > What would be scale of the product of two fixed-decimal numbers? > > E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to 4.00L? There are > > arguments for either. Same question for division (harder, I think). > > I'd be inclined to start by doing some research to see if some standard > (SQL?) defines this somewhere. It would be nice if someone has already > done the requirements work for us. :) Here is what the book "SQL-99 Complete, Really" says that the SQL standard says: - for addition and subtraction of two "exact" (fixed-decimal) numbers, the result has the maximum of the scales. - for multiplication of two "exact" (fixed-decimal) numbers, the result has the sum of the scales. - punts on division - for addition, subtraction, multiplication or division between "exact" (fixed point) and "approximate" (floating point) yields an approximate result. This means that fixed-decimal coerces to float. I'm curious to see who else chips in with examples from other systems. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. 
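As a rough illustration of the two SQL rules quoted in the message above (maximum of the scales for addition, sum of the scales for multiplication), a scaled decimal can be modelled as an integer plus a count of decimal places. This is only a sketch, not a proposed implementation: the class name Fixed and its fields are hypothetical, division is left out just as the SQL summary leaves it out, and the code is written for present-day Python rather than the 1.5.2 of the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fixed:
    """Hypothetical fixed-decimal value: `units` scaled by 10**-scale."""
    units: int   # e.g. 200 with scale 2 represents 2.00
    scale: int   # number of digits after the decimal point

    def _rescaled(self, scale):
        # Widen to a larger (or equal) scale; this never drops digits.
        return self.units * 10 ** (scale - self.scale)

    def __add__(self, other):
        # SQL rule: the result takes the maximum of the two scales.
        scale = max(self.scale, other.scale)
        return Fixed(self._rescaled(scale) + other._rescaled(scale), scale)

    def __mul__(self, other):
        # SQL rule: the result takes the sum of the two scales.
        return Fixed(self.units * other.units, self.scale + other.scale)

    def __str__(self):
        sign = "-" if self.units < 0 else ""
        digits = str(abs(self.units)).rjust(self.scale + 1, "0")
        if self.scale == 0:
            return sign + digits
        return f"{sign}{digits[:-self.scale]}.{digits[-self.scale:]}"


two = Fixed(200, 2)                   # 2.00
print(two * two)                      # 4.0000
print(Fixed(31, 1) + Fixed(201, 2))   # 5.11
```

Under these rules 2.00 * 2.00 comes out as 4.0000, the "purer" of the two answers discussed earlier in the thread; getting 4.00 instead would take an explicit rounding step on top.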
From jim at digicool.com Thu Dec 23 20:43:41 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 23 Dec 1999 14:43:41 -0500 Subject: [Python-Dev] Fixed Decimal types References: Message-ID: <38627B6D.447A9553@digicool.com> "Da Silva, Mike" wrote: > > Andy Robinson wrote: > For the work I do, simply taking the precision of the > most precise input (4.00L)would do the trick, but your > answer (4.0000L) is purer. We should provide a > rounding function, and in practice anyone using such a > function would round (or floor, or ceiling) to get to > the desired precision immediately. > > I'm not sure on division either but I'm sure there are > precedents to look at. > > The AS400 provides a useful example of the right way to do scaled > decimals. > > In the RPG programming language, all internal calculations (i.e. > multiplication, division) are performed to the maximum precision of the > intermediate result (in the multiplication example below), the intermediate > result would be 4.0000L. When the intermediate result is assigned to the > target scaled decimal number, the decimal precision is automatically > extended or truncated to fit the target precision. One extra wrinkle in all > of this is the option to "half-adjust" the intermediate value on assignment; > that is to apply automatic 5/4 rounding to the precision of the target. Yee ha! This is great input. Anyone have any other examples of what any other systems do? Anyone got a PL/I manual handy. ;) > So, if the target field is defined as numeric(4,2), the result will > be 4.00L. Since Python doesn't have types values, this is not an issue internally, but would be an issue when binding to external databases. > These are probably the kind of semantics that a scaled decimal type > would require in Python also; i.e. allow unlimited precision in intermediate > calculations, with a sensible set of rules for assignment to a variable of > different scale and precision. 
> > However, unlike RPG, we should probably ensure that attempts to > overflow or underflow the scale result in NaN or Overflow conditions, rather > than assuming the user is right and losing the significant digits. Since this would be based on infinite-precision numbers, I don't think that this would be an issue. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From guido at CNRI.Reston.VA.US Thu Dec 23 20:44:36 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 23 Dec 1999 14:44:36 -0500 Subject: [Python-Dev] Fixed-decimal types In-Reply-To: Your message of "Thu, 23 Dec 1999 14:39:37 EST." <38627A79.BF379672@digicool.com> References: <386236E2.F97109D3@digicool.com> <199912231456.JAA22134@eric.cnri.reston.va.us> <38624FB6.ED903F@digicool.com> <38627A79.BF379672@digicool.com> Message-ID: <199912231944.OAA23337@eric.cnri.reston.va.us> Jim Fulton wrote: > - for addition and subtraction of two "exact" (fixed-decimal) > numbers, the result has the maximum of the scales. One could argue that this is incorrect: if "3.1" means that I know the value to one decimal of precision, and "2.01" means that I know that value to two decimals of precision, stating the result of their sum as "5.11" suggests that I know the result to two decimals of precision, which is of course false: because I only knew one decimal of precision for one of the operands, I only know (at most!) one decimal of precision for the result. Not arguing for this interpretation, just indicating that doing fixed precision arithmetic right is hard. 
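The two readings of 3.1 + 2.01 can be put side by side with the decimal module that present-day Python ships. That module did not exist when this thread was written, so this is purely an illustration and not what was being proposed; the helper name add_least_precise is hypothetical, and half-even rounding is an arbitrary assumption.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def add_least_precise(a, b):
    """Add, then keep only as many decimals as the less precise operand."""
    # as_tuple().exponent is -1 for "3.1" and -2 for "2.01";
    # the larger exponent belongs to the less precise operand.
    exponent = max(a.as_tuple().exponent, b.as_tuple().exponent)
    quantum = Decimal(1).scaleb(exponent)   # e.g. Decimal("0.1")
    return (a + b).quantize(quantum, rounding=ROUND_HALF_EVEN)


a, b = Decimal("3.1"), Decimal("2.01")
print(a + b)                     # 5.11 (maximum of the scales, as in SQL)
print(add_least_precise(a, b))   # 5.1  (the "known precision" reading)
```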
I'm waiting for Tim Peters' contribution, but he's on vacation so it may be a while. --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm at hypernet.com Thu Dec 23 21:48:56 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Thu, 23 Dec 1999 15:48:56 -0500 Subject: [Python-Dev] Fixed Decimal types In-Reply-To: <38627B6D.447A9553@digicool.com> Message-ID: <1266141247-31971518@hypernet.com> Jim Fulton wrote: > "Da Silva, Mike" wrote: [AS400 RPG rules...] > Yee ha! This is great input. Anyone have any other examples of > what any other systems do? Anyone got a PL/I manual handy. ;) From jim at digicool.com Thu Dec 23 23:18:37 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 23 Dec 1999 17:18:37 -0500 Subject: [Python-Dev] re: Open Source design competition / Python /software tools References: Message-ID: <38629FBD.3B8F47D4@digicool.com> gvwilson at nevex.com wrote: > > Hi, everyone. I'm sending my reply to Jim's message to the whole > python-dev list; I'll send follow-ups to individuals if people would > prefer. > > > > * an issue tracking system to replace Gnats and Bugzilla; > > > > > > * a build system to replace make; > > > > > > * a platform inspection and configuration system to replace autoconf; > > > and > > > > > > * a testing framework to replace XUnit, Expect, and DejaGnu. > > > Jim Fulton asked: > > Are these categories fixed? > > For the first round, yes OK. >--- I have to prove that this model can solve > small problems before I'll be given the funding to tackle larger ones, and > I think that a UML modeling tool is definitely "large" :-). Well, since you gave rational ..... :) Isn't the Open Source community especially good at large problems? Note that I'm thinking more in terms of an open source UML community of tools, based around an existing repository rather than on a single monolithic tool. I envision a community of diagramming and other small tools orbiting Zope or ZODB. 
The hardest part of a UML tool is the repository, and I think we've mostly got that. I think that what the Open Source community desperately needs are tools for managing and sharing the most important artifacts in the development process. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From gstein at lyra.org Fri Dec 24 01:09:29 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 23 Dec 1999 16:09:29 -0800 (PST) Subject: [Python-Dev] re: Open Source design competition / Python /software tools In-Reply-To: <38629FBD.3B8F47D4@digicool.com> Message-ID: On Thu, 23 Dec 1999, Jim Fulton wrote: > gvwilson at nevex.com wrote: >... > >--- I have to prove that this model can solve > > small problems before I'll be given the funding to tackle larger ones, and > > I think that a UML modeling tool is definitely "large" :-). > > Well, since you gave rational ..... :) > > > Isn't the Open Source community especially good at large problems? Very true, I agree, but part of Greg's problem is "proving" that to the DoE. Somebody has said those four problems are sufficient to do so, and (probably) because they are reasonably constrained to allow completion within a specified timeframe. > Note that I'm thinking more in terms of an open source UML community > of tools, based around an existing repository rather than on a single > monolithic tool. I envision a community of diagramming and other small > tools orbiting Zope or ZODB. The hardest part of a UML tool is the > repository, and I think we've mostly got that. Greg's proposal is quite specific. 
"A community" isn't, so it might not help to create a proof to the DoE (otherwise, they could look at the Zope community, or other communities!). Jim: there isn't anything stopping or impeding the creation of an Open Source community for UML modeling. This DoE competition won't affect that... Happy Holidays, -g -- Greg Stein, http://www.lyra.org/ From jim at digicool.com Fri Dec 24 01:27:53 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 23 Dec 1999 19:27:53 -0500 Subject: [Python-Dev] re: Open Source design competition / Python /softwaretools References: Message-ID: <3862BE09.9AF62090@digicool.com> Greg Stein wrote: > (snip) > Jim: there isn't anything stopping or impeding the creation of an Open > Source community for UML modeling. Of course not. > This DoE competition won't affect that... Perhaps it could help it. > Happy Holidays, You too. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From ping at lfw.org Fri Dec 24 09:55:28 1999 From: ping at lfw.org (Ka-Ping Yee) Date: Fri, 24 Dec 1999 00:55:28 -0800 (PST) Subject: [Python-Dev] re: Open Source design competition / Python / software tools In-Reply-To: Message-ID: On Wed, 22 Dec 1999 gvwilson at nevex.com wrote: > To kick-start things, we're going to be holding a two-round design > competition. Anyone (individual or team, professional or student) can > submit a short entry for the first round; the judges will pick four > candidates to go forward in each of four categories, and those > individuals or teams will be asked to submit full entries. 
The four > categories are: > > * an issue tracking system to replace Gnats and Bugzilla; Hi there. At ILM we've been using a system that i hacked up quickly in Python called "Roundup". It has a number of interesting properties that have made it really useful to us, and arguably better than any of the existing open-source bug-tracking things out there that i know of. It is not just a Web app; it lives between the Web and e-mail, because we do so much of our communication that way. For example, each request item gets its own virtual mailing list, updated on the fly without the need for explicit subscription (if you cc: somebody while discussing the bug, they get subscribed). Empirically i've discovered that unsubscription is actually unnecessary (!) because conversation will stop on a topic when it gets resolved or when it ceases to be interesting. These are fine-grained discussion lists on a per-topic level. This is just to let you know i'm interested. I'm currently asking for permission to open-source Roundup; if it can't be done, or doesn't happen quickly enough, i'll just have to take a weekend and rewrite the thing. There were a few things i wanted to fix anyway. -- ?!ng "You should either succeed gloriously or fail miserably. Just getting by is the worst thing you can do." -- Larry Smith From marangoz at python.inrialpes.fr Fri Dec 24 13:07:05 1999 From: marangoz at python.inrialpes.fr (Vladimir Marangozov) Date: Fri, 24 Dec 1999 13:07:05 +0100 (CET) Subject: [Python-Dev] Exceptions In-Reply-To: <199912221823.NAA16517@eric.cnri.reston.va.us> from "Guido van Rossum" at Dec 22, 1999 01:23:45 PM Message-ID: <199912241207.NAA18783@python.inrialpes.fr> Guido van Rossum wrote: > > Vladimir.Marangozov at inrialpes.fr: > > > Yes. Besides, I still think that string-based exceptions are just > > convenient for quick & dirty, throw-away test scripts. 
> > They have a hard-to-understand quirk though: the id() of the string is > used to check rather than its value, so that except "foo" doesn't > necessarily catch raise "foo"; but due to various optimizations, this > usually works, and people get bent out of shape when it doesn't. Which brings up two important questions: 1. In the long run, which one is better -- compare and check exceptions by reference (by name) or by value? (currently, this is done by reference on predefined object types: strings, classes or instances) I'd say, exceptions have to be compared (caught) by value, i.e. use "e1 == e2" instead of "e1 is e2". 2. Should we limit the exception "types"? I'd say, no. My Pythonic view of things says that we raise "objects", be they classes, instances, strings or, why not, ints. However, if one wants to put some order in the "unordered set" of exceptions s/he uses, then classes are the way to do it, because classes were given some nice properties, like inheritance, that allow us to group and logically organize the objects we throw and catch as exceptions (+ other bonus properties coming from classes). Note that conceptually, when we say "strings and ints", we have in mind "string instances and int instances", whose "classes" are written in C. Once there are String and Int classes of some sort as first class objects, we'll fall back to the terminology: Exceptions can be classes or instances. If point 1 and (optionally) point 2 are implemented, the hard-to-understand quirk wouldn't be an issue and string-based exceptions would have a legal reason to stay and live. > Since you have to give your exception a name, how hard is it to say > > class MyError(Exception): pass > > rather than > > MyError = "MyError" > > ? You know what I think about "names"...
I may have defined my exception conventions and be interested in catching an exception named 404, implying that "a 404 bobo" occurred deeply in my code ("deeply in my code" meaning for example: database 4, service 0, customer group 4, or just a standard HTTP "Code 404 - Not Found".) I am pushing this to the extreme to catapult your thoughts into the next millennium :) and to emphasize the importance of discussing and objectively answering questions 1) and 2) above. -- Vladimir MARANGOZOV | Vladimir.Marangozov at inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From mal at lemburg.com Fri Dec 24 12:03:37 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 24 Dec 1999 12:03:37 +0100 Subject: [Python-Dev] str(1L) -> '1' ? References: <38623493.E6BA6D6F@digicool.com> <199912231446.JAA22086@eric.cnri.reston.va.us> Message-ID: <38635309.2AEFF18D@lemburg.com> Guido van Rossum wrote: > > [Jim F] > > In November there was an interesting discussion on comp.lang.python > > about the meaning of __str__ and __repr__. One tidbit that came out > > of this discussion was that __str__ for longs should drop the trailing > > 'L'. Was there a decision on this? I'd really like this to happen. > > Yes, I'd like it to happen. I'd also like repr() of a float to return > the full precision (using the "%.17g" sprintf format). While we're at it: how about adding a PyLong_AsString() API to the C interface ? I currently use PyObject_Str() in mxODBC and then slice off the 'L' -- not very elegant. A PyLong_AsString() API would much better suit the task. Merry Christmas, -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 7 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal at lemburg.com Fri Dec 24 12:11:29 1999 From: mal at lemburg.com (M.-A.
Lemburg) Date: Fri, 24 Dec 1999 12:11:29 +0100 Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types) References: <19991223163850.15619.qmail@web604.mail.yahoo.com> <199912231642.LAA22598@eric.cnri.reston.va.us> Message-ID: <386354E1.DA560F42@lemburg.com> Guido van Rossum wrote: > > > > OK. Let me rephrase it. Say we form a consensus on 'the right > > > way'. Are you amenable to some solution which goes back before > > > 1970 and after 2038 going into the standard library? > > No problem. > > > > And does your answer change if it involves some > > > compiled code as well? > > I'd rather not. As far as mxDateTime goes, I'd rather not see it in the core distribution. Including the mx stuff in a separate PythonPowerTools distribution would be cool though. For a start in this direction see e.g.: http://startship.skyport.net/~lemburg/PPowerTools-0.2.zip Note that I'll wrap all my mx extensions into a new mx package which will come in several flavours next year. There will no longer be separate packages due to the various naming collisions and to enable intra-mx-package dependencies. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 7 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From captainrobbo at yahoo.com Fri Dec 24 13:22:29 1999 From: captainrobbo at yahoo.com (=?iso-8859-1?q?Andy=20Robinson?=) Date: Fri, 24 Dec 1999 04:22:29 -0800 (PST) Subject: [Python-Dev] Fixed Decimal types Message-ID: <19991224122229.23506.qmail@web606.mail.yahoo.com> > >> However, unlike RPG, we should probably ensure > >> that attempts to overflow or underflow the scale > >> result in NaN or Overflow conditions, rather > >> than assuming the user is right and losing > >> the significant digits. > > > Since this would be based on infinite-precision > numbers, I don't > > think that this would be an issue. 
Three very general observations before I disappear for Christmas: (1) I think there is great mileage in combining the fixed-decimal concept with Martin Fowler's Quantity pattern, so that a variable could be defined as not just two decimal places but also (say) "GBP" or "USD", and it would be an error to add the two. Same applies for adding metres, kilograms and other quantities. There has also been discussion that the 'type' of a quantity should determine what math should apply. (2) If Python is going to be used increasingly in eCommerce, it should be good at dealing with money - maybe not in the core language, but we should aim for one standard package. (3) We have a python-finance list (python-finance at egroups.com), recently generalized to cover business systems, which is a good place to discuss this if anyone wants to. There are people there who have time, would love to prototype something (indeed some work started in this area 3 months back), and would use it at work too. This would be an ideal first target for that group - or indeed for a finance-sig. I'll pursue this in the New Year. Merry Christmas, Andy ===== Andy Robinson Robinson Analytics Ltd. ------------------ My opinions are the official policy of Robinson Analytics Ltd. They just vary from day to day.
From jack at oratrix.nl Fri Dec 24 13:34:28 1999 From: jack at oratrix.nl (Jack Jansen) Date: Fri, 24 Dec 1999 13:34:28 +0100 Subject: [Python-Dev] Fixed Decimal types In-Reply-To: Message by =?iso-8859-1?q?Andy=20Robinson?= , Fri, 24 Dec 1999 04:22:29 -0800 (PST) , <19991224122229.23506.qmail@web606.mail.yahoo.com> Message-ID: <19991224123428.5BA9F370CF2@snelboot.oratrix.nl> > (1) I think there is great mileage in combining the > fixed-decimal concept with Martin Fowler's Quantity > pattern, so that a variable could be defined as not > just two decimal places but also (say) "GBP" or "USD", > and it would be an error to add the two. Same applies > for adding metres, kilograms and other quantities. > There has also been discussion that the 'type' of a > quantity should determine what math should apply. Isn't this something that is ideally suited for implementation in a Python module, based on a core implementation of fixed decimal numbers? -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From gstein at lyra.org Fri Dec 24 21:05:22 1999 From: gstein at lyra.org (Greg Stein) Date: Fri, 24 Dec 1999 12:05:22 -0800 (PST) Subject: [Python-Dev] Fixed Decimal types In-Reply-To: <19991224123428.5BA9F370CF2@snelboot.oratrix.nl> Message-ID: On Fri, 24 Dec 1999, Jack Jansen wrote: > > (1) I think there is great mileage in combining the > > fixed-decimal concept with Martin Fowler's Quantity > > pattern, so that a variable could be defined as not > > just two decimal places but also (say) "GBP" or "USD", > > and it would be an error to add the two. Same applies > > for adding metres, kilograms and other quantities. > > There has also been discussion that the 'type' of a > > quantity should determine what math should apply.
> > Isn't this something that is ideally suited for implementation in a Python > module, based on a core implementation of fixed decimal numbers? I'd agree with Jack here. The "simple" change of a scale for the Long values is nice. Starting to lump in features like this begins to get a little messier... Happy Holidays, -g -- Greg Stein, http://www.lyra.org/ From gstein at lyra.org Fri Dec 24 21:13:50 1999 From: gstein at lyra.org (Greg Stein) Date: Fri, 24 Dec 1999 12:13:50 -0800 (PST) Subject: [Python-Dev] str(1L) -> '1' ? In-Reply-To: <38635309.2AEFF18D@lemburg.com> Message-ID: On Fri, 24 Dec 1999, M.-A. Lemburg wrote: > Guido van Rossum wrote: > > [Jim F] > > > In November there was an interesting discussion on comp.lang.python > > > about the meaning of __str__ and __repr__. One tidbit that came out > > > of this discussion was that __str__ for longs should drop the trailing > > > 'L'. Was there a decision on this? I'd really like this to happen. > > > > Yes, I'd like it to happen. I'd also like repr() of a float to return > > the full precision (using the "%.17g" sprintf format). > > While we're at it: how about adding a PyLong_AsString() API > to the C interface ? I currently use PyObject_Str() in mxODBC > and then slice off the 'L' -- not very elegant. A PyLong_AsString() > API would much better suit the task. Fred just checked in a change yesterday. PyObject_Str() on a Long no longer includes the 'L'. You're going to need to update your code :-) [ I've got some here and there to fix, too, with the idiom: if type(v) is type(1L): return str(v)[:-1] ] Cheers, -g -- Greg Stein, http://www.lyra.org/ From mal at lemburg.com Sun Dec 26 23:29:28 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Sun, 26 Dec 1999 23:29:28 +0100 Subject: [Python-Dev] str(1L) -> '1' ? References: Message-ID: <386696C8.6EBBF428@lemburg.com> Greg Stein wrote: > > On Fri, 24 Dec 1999, M.-A. 
Lemburg wrote: > > While we're at it: how about adding a PyLong_AsString() API > > to the C interface ? I currently use PyObject_Str() in mxODBC > > and then slice off the 'L' -- not very elegant. A PyLong_AsString() > > API would much better suit the task. > > Fred just checked in a change yesterday. PyObject_Str() on a Long no > longer includes the 'L'. Ah, ok... scanning the patches: they don't provide an externed C interface... I would like to have such a beast if possible (basically, the new long_format() as PyLong_AsString()). > You're going to need to update your code :-) > [ I've got some here and there to fix, too, with the idiom: > if type(v) is type(1L): return str(v)[:-1] > ] Your above example will effectively divide the long value by 10 which will probably break things in very subtle ways... hmm, this change ought to be made *very* visible to people upgrading to 1.6, IMHO. I'll fix mxODBC to only truncate the string value iff the 'L' is present. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 5 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From andy at robanal.demon.co.uk Mon Dec 27 11:43:17 1999 From: andy at robanal.demon.co.uk (Andy Robinson) Date: Mon, 27 Dec 1999 10:43:17 GMT Subject: [Python-Dev] Fixed Decimal types In-Reply-To: References: Message-ID: <38674259.5377973@post.demon.co.uk> On Fri, 24 Dec 1999 12:05:22 -0800 (PST), you wrote: >On Fri, 24 Dec 1999, Jack Jansen wrote: >> > (1) I think there is great mileage in combining the >> > fixed-decimal concept with Martin Fowler's Quantity >> > pattern, so that a variable could be defined as not >> > just two decimal places but also (say) "GBP" or "USD", >> > and it would be an error to add the two. Same applies >> > for adding metres, kilograms and other quantities. >> > There has also been discussion that the 'type' of a >> > quantity should determine what math should apply. 
>> >> Isn't this something that is ideally suited for implementation in a Python >> module, based on a core implementation of fixed decimal numbers? > >I'd agree with Jack here. > Me too - I thought I said that in point 2, but in retrospect I didn't say it clearly enough :-) - Andy From gstein at lyra.org Mon Dec 27 12:31:29 1999 From: gstein at lyra.org (Greg Stein) Date: Mon, 27 Dec 1999 03:31:29 -0800 (PST) Subject: [Python-Dev] str(1L) -> '1' ? In-Reply-To: <386696C8.6EBBF428@lemburg.com> Message-ID: On Sun, 26 Dec 1999, M.-A. Lemburg wrote: > Greg Stein wrote: > > On Fri, 24 Dec 1999, M.-A. Lemburg wrote: > > > While we're at it: how about adding a PyLong_AsString() API > > > to the C interface ? I currently use PyObject_Str() in mxODBC > > > and then slice off the 'L' -- not very elegant. A PyLong_AsString() > > > API would much better suit the task. > > > > Fred just checked in a change yesterday. PyObject_Str() on a Long no > > longer includes the 'L'. > > Ah, ok... scanning the patches: they don't provide an externed > C interface... I would like to have such a beast if possible > (basically, the new long_format() as PyLong_AsString()). What's wrong with PyObject_Str()? I don't see a need for Yet Another Entry Point. > > You're going to need to update your code :-) > > [ I've got some here and there to fix, too, with the idiom: > > if type(v) is type(1L): return str(v)[:-1] > > ] > > Your above example will effectively divide the long value by 10 > which will probably break things in very subtle ways... hmm, this Yah :-( Not a lot of fun, but I think for the best. > change ought to be made *very* visible to people upgrading to > 1.6, IMHO. Yes. Cheers, -g -- Greg Stein, http://www.lyra.org/ From mal at lemburg.com Mon Dec 27 13:51:36 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Mon, 27 Dec 1999 13:51:36 +0100 Subject: [Python-Dev] str(1L) -> '1' ? 
References: Message-ID: <386760D8.E897FADF@lemburg.com> Greg Stein wrote: > > On Sun, 26 Dec 1999, M.-A. Lemburg wrote: > > Greg Stein wrote: > > > On Fri, 24 Dec 1999, M.-A. Lemburg wrote: > > > > While we're at it: how about adding a PyLong_AsString() API > > > > to the C interface ? I currently use PyObject_Str() in mxODBC > > > > and then slice off the 'L' -- not very elegant. A PyLong_AsString() > > > > API would much better suit the task. > > > > > > Fred just checked in a change yesterday. PyObject_Str() on a Long no > > > longer includes the 'L'. > > > > Ah, ok... scanning the patches: they don't provide an externed > > C interface... I would like to have such a beast if possible > > (basically, the new long_format() as PyLong_AsString()). > > What's wrong with PyObject_Str()? I don't see a need for Yet Another Entry > Point. What's wrong with a rich C API :-) ? The long_format function would be very useful for programs interacting with other software at C level. Making it external would give the programmer the ability to pass long string representations in any base to other programs, which is very useful for e.g. database interaction or crypto software. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 4 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From bkc at murkworks.com Mon Dec 27 23:04:25 1999 From: bkc at murkworks.com (Brad Clements) Date: Mon, 27 Dec 1999 17:04:25 -0500 Subject: [Python-Dev] Re: [PSA MEMBERS] Re: Please test new dynamic load behavior In-Reply-To: References: <38620B04.7CC64485@trema.com> Message-ID: <199912272204.RAA26173@anvil.murkworks.com> On 23 Dec 99, at 10:26, Greg Stein wrote: > > > I reorganized Python's dynamic load/import code over the past few days. > > > Gudio provided some feedback, I did some more mods, and now it is checked > > > into CVS. 
The new loading behavior has been tested on Linux, IRIX, and
> > > Solaris (and probably Windows by now).

FYI, I downloaded the import stuff from CVS and used it in my port of
Python to NetWare.

Good timing, as I was just tackling dynamic loading on NetWare when I
saw your message. The new scheme is much better, and works for me.

Though I do need to add some special "un-import" code similar to what
BEOS does.

Brad Clements, bkc at murkworks.com (315)268-1000
http://www.murkworks.com (315)268-9812 Fax
netmeeting: ils://ils.murkworks.com AOL-IM: BKClements

From skip at mojam.com  Tue Dec 28 22:41:33 1999
From: skip at mojam.com (Skip Montanaro)
Date: Tue, 28 Dec 1999 15:41:33 -0600
Subject: [Python-Dev] Better text processing support in py2k?
Message-ID: <199912282141.PAA31426@dolphin.mojam.com>

It just occurred to me as I was replying to a request on the main list, that
Python's text handling capabilities could be a bit better than they are.
This will probably not come as a revelation to many of you, but I finally
put it together with the standard argument against beefing things up

    One fix would be to add regular expressions to the language core and
    have special syntax for them, as Perl has done. However, I don't like
    this solution because Python is a general-purpose language, and regular
    expressions are used for the single application domain of text
    processing. For other application domains, regular expressions may be of
    no interest, and you might want to remove them to save memory and code
    size.

and the observation that Python does support some builtin objects and syntax
that are fairly specific to some much more restricted application domains
than text processing.

I stole the above quote from Andrew Kuchling's Python Warts page, which I
also happened to read earlier today.

What AMK says makes perfect sense until you examine some of the other things
that are in the language, like the Ellipsis object and complex numbers.
If I recall correctly both were added as a result of the NumPy package development. I have nothing against ellipses or complex numbers. They are fine first class objects that should remain in the language. But I have never used either one in my day-to-day work. On the other hand, I read files and manipulate them with regular expressions all the time. I rather suspect that more people use Python for some sort of text processing than any other single application domain. Python should be good at it. While I don't want to turn Python into Perl, I would like to see it do a better job of what most people probably use the language for. Here is a very short list of things I think need attention: 1. When using something like the simple file i/o idiom for line in f.readlines(): dofunstuff(line) the programmer should not have to care how big the file is. It should just work in a reasonably efficient manner without gobbling up all of memory. I realize this may require some change to the syntax of the common idiom. 2. The re module needs to be sped up, if not to catch up with Perl, then to catch up with the deprecated regex module. Depending how far people want to go with things, adding some language syntax to support regular expressions might be in order. I don't see that as compelling as adding complex numbers however. Another possibility, now that Barry Warsaw has opened the floodgates, is to add regular expression methods to strings. 3. I've not yet used it, but I am told the pattern matching in Marc-Andre Lemburg's mxTextTools (http://starship.python.net/crew/lemburg/) is both powerful and efficient (though it certainly appears complex). Perhaps it deserves consideration for incorporation into the core Python distribution. I'm sure other people will come up with other suggestions. Skip Montanaro | http://www.mojam.com/ skip at mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... 
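
Item 1 in the list above asks for a line-reading idiom that works without the programmer caring how big the file is. A minimal sketch of one way to get that, written for present-day Python rather than the 1.5-era interpreter under discussion: wrap the file in an iterable that pulls lines in batches with `readlines(sizehint)`. The class name `BufferedLines` and its default `sizehint` are hypothetical and do not come from the thread; this illustrates the idea only and is not something any poster proposed in this form.

```python
import io


class BufferedLines:
    """Hypothetical wrapper: iterate over a file's lines in batches.

    Each batch is fetched with readlines(sizehint), so memory use stays
    roughly bounded by sizehint instead of by the size of the file.
    """

    def __init__(self, f, sizehint=64 * 1024):
        self.f = f
        self.sizehint = sizehint  # approximate characters/bytes per batch

    def __iter__(self):
        while True:
            lines = self.f.readlines(self.sizehint)
            if not lines:       # an empty batch means end of file
                break
            for line in lines:
                yield line


if __name__ == "__main__":
    sample = io.StringIO("first\nsecond\nthird\n")
    for line in BufferedLines(sample, sizehint=8):
        print(line.rstrip())
```

The caller writes a plain `for` loop and never sees the batching, which is the property the request is after.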
From akuchlin at mems-exchange.org Tue Dec 28 23:00:11 1999 From: akuchlin at mems-exchange.org (Andrew M. Kuchling) Date: Tue, 28 Dec 1999 17:00:11 -0500 (EST) Subject: [Python-Dev] Better text processing support in py2k? In-Reply-To: <199912282141.PAA31426@dolphin.mojam.com> References: <199912282141.PAA31426@dolphin.mojam.com> Message-ID: <14441.13035.802146.730160@amarok.cnri.reston.va.us> Skip Montanaro writes: >What AMK says makes perfect sense until you examine some of the other things >that are in the language, like the Ellipsis object and complex numbers. If >I recall correctly both were added as a result of the NumPy package >development. True, but note that you can compile Python with WITHOUT_COMPLEX defined to remove complex numbers. > 1. When using something like the simple file i/o idiom > for line in f.readlines(): > dofunstuff(line) > the programmer should not have to care how big the file is. What about 'for line in fileinput.input()', which already exists? (Hmmm... if you have an already open file object, I don't think you can pass it to fileinput.input(); maybe that should be fixed.) On a vaguely related note, since there are many things like parser generators and XML stuff and mxTextTools, I've been speculating about a text processing topic guide. If you know of Python packages related to text processing, please send me a private e-mail with a link. -- A.M. Kuchling http://starship.python.net/crew/amk/ Constraints often boost creativity. -- Jim Hugunin, 11 Feb 1999 From skip at mojam.com Tue Dec 28 23:26:53 1999 From: skip at mojam.com (Skip Montanaro) Date: Tue, 28 Dec 1999 16:26:53 -0600 (CST) Subject: [Python-Dev] Better text processing support in py2k? 
In-Reply-To: <14441.13035.802146.730160@amarok.cnri.reston.va.us>
References: <199912282141.PAA31426@dolphin.mojam.com> <14441.13035.802146.730160@amarok.cnri.reston.va.us>
Message-ID: <14441.14637.682862.999776@dolphin.mojam.com>

    Andrew> True, but note that you can compile Python with WITHOUT_COMPLEX
    Andrew> defined to remove complex numbers.

That's true, but that wasn't my point. I'm not arguing for or against space
efficiency, just that the rather timeworn argument about not doing
anything special to support text processing because Python is a general
purpose language is a red herring.

    >> 1. When using something like the simple file i/o idiom
    >>        for line in f.readlines():
    >>            dofunstuff(line)
    >>    the programmer should not have to care how big the file is.

    Andrew> What about 'for line in fileinput.input()', which already
    Andrew> exists? (Hmmm... if you have an already open file object, I
    Andrew> don't think you can pass it to fileinput.input(); maybe that
    Andrew> should be fixed.)

Well, a couple reasons jump to mind:

    1. fileinput.FileInput isn't particularly efficient. At its heart, its
       __getitem__ method makes a simple readline() call instead of
       buffering some amount of readlines(sizehint) bytes. This can be
       fixed, but I'm not sure what would happen to its semantics.

    2. As you pointed out, it's not all that general. My point, not at all
       well stated, is that the programmer shouldn't have to worry (much?)
       about the conditions under which he does file i/o.

Right now, if I know the file is small(ish), I can do

    for line in f.readlines():
        dofunstuff(line)

but I have to know that the file won't be big, because readlines() will
behave badly (perhaps even generate a MemoryError exception) if the file is
large.
In that case, I have to fall back to the safer (and slower)

    line = f.readline()
    while line:
        dofunstuff(line)
        line = f.readline()

or the more efficient, but more cumbersome

    lines = f.readlines(sizehint)
    while lines:
        for line in lines:
            dofunstuff(line)
        lines = f.readlines(sizehint)

That's three separate idioms the programmer has to be aware of when writing
code to read a text file based upon the perceived need for speed, memory
usage and desired clarity:

    fast/memory-intensive/clear
    slow/memory-conserving/not-as-clear
    fast/memory-conserving/fairly-muddy

Any particular reason that the readline method can't return an iterator that
supports __getitem__ and buffers input? (Again, remember this is for py2k,
so the potential breakage such a change might cause is a consideration, but
not a showstopper.)

    Andrew> On a vaguely related note, since there are many things like
    Andrew> parser generators and XML stuff and mxTextTools, I've been
    Andrew> speculating about a text processing topic guide. If you know of
    Andrew> Python packages related to text processing, please send me a
    Andrew> private e-mail with a link.

This sounds like a good idea to me.

Skip Montanaro | http://www.mojam.com/
skip at mojam.com | http://www.musi-cal.com/
847-971-7098 | Python: Programming the way Guido indented...

From captainrobbo at yahoo.com  Wed Dec 29 09:34:43 1999
From: captainrobbo at yahoo.com (Andy Robinson)
Date: Wed, 29 Dec 1999 00:34:43 -0800 (PST)
Subject: [Python-Dev] Better text processing support in py2k?
Message-ID: <19991229083443.27817.qmail@web6005.mail.yahoo.com>

--- Skip Montanaro wrote:
> fast/memory-intensive/clear
> slow/memory-conserving/not-as-clear
> fast/memory-conserving/fairly-muddy
>
> Any particular reason that the readline method can't
> return an iterator that
> supports __getitem__ and buffers input? (Again,
> remember this is for py2k,
> so the potential breakage such a change might cause
> is a consideration, but
> not a showstopper.)

Why not generalize fileinput to do buffering instead?

More generally, Java has the notion of 'stackable
streams' - e.g. construct a 'BufferedFile' around a
'File', maybe construct a 'Line-oriented file' around
that etc. Each one takes a file-like object as an
argument to the constructor. Things you might want to
do:
- buffering
- international encoding conversions
- line delimiters other than CR/LF/CRLF
- read/write Python objects (i.e. use pickle/marshal)
- easy interfaces to parsers

This took me a couple of hours to get used to (and at
the time I thought 'Yuk!' when I first saw four
nested constructors), but gives you very precise
control and a lot of versatility when handling files.
It's an idiom Python does not use much but maybe it
should.

I'd argue that maybe some enhancements to fileinput.py
- adding some streams to provide building blocks for
these operations - would get us the power you want and
a lot more versatility besides.

=====
Andy Robinson
Robinson Analytics Ltd.
------------------
My opinions are the official policy of Robinson Analytics Ltd.
They just vary from day to day.

From mal at lemburg.com  Wed Dec 29 17:55:21 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 29 Dec 1999 17:55:21 +0100
Subject: [Python-Dev] Better text processing support in py2k?
References: <19991229083443.27817.qmail@web6005.mail.yahoo.com>
Message-ID: <386A3CF9.8AF0EA60@lemburg.com>

Andy Robinson wrote:
>
> --- Skip Montanaro wrote:
> > fast/memory-intensive/clear
> > slow/memory-conserving/not-as-clear
> > fast/memory-conserving/fairly-muddy
> >
> > Any particular reason that the readline method can't
> > return an iterator that
> > supports __getitem__ and buffers input? (Again,
> > remember this is for py2k,
> > so the potential breakage such a change might cause
> > is a consideration, but
> > not a showstopper.)
>
> Why not generalize fileinput to do buffering instead?
>
> More generally, Java has the notion of 'stackable
> streams' - e.g. construct a 'BufferedFile' around a
> 'File', maybe construct a 'Line-oriented file' around
> that etc. Each one takes a file-like object as an
> argument to the constructor. Things you might want to
> do:
> - buffering
> - international encoding conversions
> - line delimiters other than CR/LF/CRLF
> - read/write Python objects (i.e. use pickle/marshal)
> - easy interfaces to parsers

If all goes well we'll have something like this in Python 1.6,
at least for the encoding/decoding part of file reading and writing.
You basically take a file object and then wrap some StreamCodecs
around it to get the functionality you need. Very simple and
very intuitive.

> This took me a couple of hours to get used to (and at
> the time I thought 'Yuk!' when I first saw four
> nested constructors), but gives you very precise
> control and a lot of versatility when handling files.
> It's an idiom Python does not use much but maybe it
> should.
>
> I'd argue that maybe some enhancements to fileinput.py
> - adding some streams to provide building blocks for
> these operations - would get us the power you want and
> a lot more versatility besides.

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: Get ready to party !
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From bckfnn at pipmail.dknet.dk  Wed Dec 29 19:51:52 1999
From: bckfnn at pipmail.dknet.dk (Finn Bock)
Date: Wed, 29 Dec 1999 18:51:52 GMT
Subject: [Python-Dev] zipfile.py
In-Reply-To: <3857B97E.3684224F@interet.com>
References: <3857B97E.3684224F@interet.com>
Message-ID: <386a582d.6762574@pipmail.dknet.dk>

James C. Ahlstrom wrote:
> ftp://ftp.interet.com/pub/pylib.html

I feel that it smells a bit too much like a tool and too little like a
general programming API.

- It can only add disk files.
  The ability to write data to a zip entry through
  a file-like object or from a string would make it more like an API, IMHO.

- Some kind of access to the TOC entry fields (date, size, compressed size
  etc) also seems like a nice feature.

- The data for an entry must be available in memory. Could be a problem for
  huge files, but most likely not in practical use.

I admit that I am fond of the API from java.util.zip.ZipFile and
java.util.zip.ZipOutputStream.

Regards,
Finn Bock

From tim_one at email.msn.com  Thu Dec 30 07:08:58 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 30 Dec 1999 01:08:58 -0500
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <199912282141.PAA31426@dolphin.mojam.com>
Message-ID: <000001bf528c$5cbdb9a0$a02d153f@tim>

[Skip Montanaro, wants nicer text facilities]
> ...
> I rather suspect that more people use Python for some sort of
> text processing than any other single application domain.

Hmm. You're probably right, but I'm an exception.

> Python should be good at it.

And I guess I'm an exception mostly *because* Perl is better at easy text
crunching and Icon is better at hard text-crunching -- that is, I use the
right tool for the job .

> While I don't want to turn Python into Perl, I would like to see
> it do a better job of what most people probably use the language
> for. Here is a very short list of things I think need attention:
>
> 1. [*A* clear way to do memory- and time-efficient textfile
>    input]

I agree, but am unsure how to fix it. The best way to write this now is

    # f is some open file object.
    while 1:
        lines = f.readlines(BUFSIZE)
        if not lines:
            break
        for line in lines:
            process(line)

and it's not something anyone figures out on their own -- or enjoys typing
or explaining afterwards.

Perl gets its line-at-a-time speed by peeking and poking C FILE structs
directly in compiler- and platform-specific ways -- ways that vendors
*should* have done in their own fgets implementations, but almost never do.
I have no idea whether it works well with Perl's nascent notions of threading, but in the absence of that "the system" doesn't know Perl is cheating (i.e., as far as libc+friends are concerned, Perl *is* reading one line at a time -- even mixing in C-level ungetc calls works (well, sometimes <0.1 wink -- they don't always peek and poke enough fields>)). The Python QIO extension module is much easier to port but less compatible (it doesn't use stdio, so QIO-opened files don't play well with others) and slower (although that's likely repairable -- he's got two passes over the buffer where one hairier pass should suffice). > 2. The re module needs to be sped up, if not to catch up with > Perl, then to catch up with the deprecated regex module. The irony here is that the re engine is very often unboundedly faster than the regex engine -- provided you're chewing over large strings. Some tests /F ran showed that the length-independent *overhead* of invoking re is about 10x higher than for regex. Presumably the bulk of that is due to re.py, i.e. that you get to the re engine via going thru Python layers on your way in and out, while regex was pure C. In any case, /F is working on a new engine (for Unicode), and I believe he has this all well in hand. > Depending how far people want to go with things, adding some > language syntax to support regular expressions might be in order. > ... > 3. I've not yet used it, but I am told the pattern matching in > Marc-Andre Lemburg's mxTextTools > (http://starship.python.net/crew/lemburg/) > is both powerful and efficient (though it certainly appears > complex). Perhaps it deserves consideration for > incorporation into the core Python distribution. It's not complex, it's complicated -- and *that's* what makes it un-Pythonic . Tony Ibbs has written a friendly wrapper around mxTextTools that suppresses much of the non-essential complication. 
OTOH, if you go into this with a regexp mindset, it will run much slower
than a real regexp package, because the bulk of the latter is devoted to
doing optimization; mxTextTools is WYSIWYG (it screams if you code to its
strengths, but crawls if you e.g. try to implement naive backtracking).

You should go to the REBOL site and look at the description of REBOL's PARSE
verb in the FAQ ... mumble, mumble ... at

    http://www.rebol.com/faq.html#11550948

Here's an example pulled from that page (this is a REBOL code fragment):

    digit: charset "0123456789"
    expr: [term ["+" | "-"] expr | term]
    term: [factor ["*" | "/"] term | factor]
    factor: [primary "**" factor | primary]
    primary: [value | "(" expr ")"]
    value: [digit value | digit]
    parse "1 + 2 ** 9" expr

There hasn't been a pattern scheme this clean, convenient or powerful since
SNOBOL4. It exploits REBOL's Forth-like (lack of!) syntax, and
Smalltalk-like penchant for passing around thunks (anonymous closures --
"[...]" in REBOL builds a lexically-scoped entity called "a block", which
can be treated as code (executed) or data (manipulated like a Python list)
at will).

Now the example doesn't show this, but you can freely mix computations into
the middle of the patterns; only *some* of the words in the blocks have
special meaning to PARSE.

The fragment above is already way beyond what can be accomplished with
regexps, but that's just the start of it. Perl too is slamming in more &
more ways to get user code to interact with its regexp engine.

So REBOL has a *very* nice approach to this; I believe it's unreasonably
clumsy to mimic in Python primarily because of forward references (note e.g.
that the block attached to "expr" above refers to "term" before the latter
has been bound -- but the stuff inside [...] is just a closure so that
doesn't matter -- it only matters that term gets bound before expr is
*executed*). I hit a similar snag years ago when trying to mimic SNOBOL4's
approach in Python.
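
To make the forward-reference snag concrete, here is a minimal sketch in present-day Python of one possible workaround: rules live in a table and refer to each other by name, so a name is only looked up when the rule runs. Everything here (`RULES`, `ref`, `lit`, `seq`, `alt`, `matches`) is hypothetical and invented for illustration. It is a bare recognizer with ordered alternatives, not an equivalent of REBOL's PARSE, and it strips spaces before matching as a simplifying assumption.

```python
# Each rule is a function (text, pos) -> new position, or None on failure.
RULES = {}


def ref(name):
    # Deferred lookup: RULES[name] is read at match time, not at build time,
    # so a rule may mention another rule that has not been defined yet.
    return lambda text, i: RULES[name](text, i)


def lit(s):
    # Match the literal string s at position i.
    return lambda text, i: i + len(s) if text.startswith(s, i) else None


def seq(*parts):
    # Match every part in order, threading the position through.
    def rule(text, i):
        for part in parts:
            i = part(text, i)
            if i is None:
                return None
        return i
    return rule


def alt(*choices):
    # Ordered choice: the first alternative that succeeds wins.
    def rule(text, i):
        for choice in choices:
            j = choice(text, i)
            if j is not None:
                return j
        return None
    return rule


def digit(text, i):
    return i + 1 if i < len(text) and text[i] in "0123456789" else None


# The grammar from the REBOL fragment, one table entry per rule.
RULES["expr"] = alt(seq(ref("term"), alt(lit("+"), lit("-")), ref("expr")),
                    ref("term"))
RULES["term"] = alt(seq(ref("factor"), alt(lit("*"), lit("/")), ref("term")),
                    ref("factor"))
RULES["factor"] = alt(seq(ref("primary"), lit("**"), ref("factor")),
                      ref("primary"))
RULES["primary"] = alt(ref("value"), seq(lit("("), ref("expr"), lit(")")))
RULES["value"] = alt(seq(digit, ref("value")), digit)


def matches(text):
    # Assumption: whitespace is insignificant, so drop it up front.
    text = text.replace(" ", "")
    return RULES["expr"](text, 0) == len(text)


if __name__ == "__main__":
    print(matches("1 + 2 ** 9"))
```

The indirection through `ref("term")` is the price paid for referring to a rule before it exists, which is arguably the clumsiness being described.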
Perl's endless abuse of regexps is making that language more absurd
by the month.

The other major approach to mixing patterns with computation is due to Icon,
another language where a regexp mindset is fatal. On a whim, I whipped up
the attached, which illustrates a bit of the Icon approach in Pythonic terms
(but without language support for generators, the *heart* of it can't really
be captured).

Here's an example of how this could be used to implement (the simplest form
of) string.split:

    def mysplit(str):
        s = Searcher(str)
        white = CharSet(" \t\n")
        result = []
        s.many(white)  # consume initial whitespace
        while s.notmany(white):  # consume non-whitespace
            result.append(s.get_match())
            s.many(white)
        return result

    >>> mysplit(" \t Hey, that's\tpretty\n\n neat! ")
    ['Hey,', "that's", 'pretty', 'neat!']
    >>>

The primary thing to note is that there's no seam between analyzing the
string and doing computation on the partial results -- "the program is the
pattern". This is what Icon does to perfection, Perl is moving toward, and
REBOL is arriving at from a different direction. It's The Future <0.9 wink>.

Without generators it's difficult to work backtracking into the Searcher
class, but, as above, in my experience the backtracking feature of regexps
is rarely *needed*! For example, at various points "split" wants to suck up
all the whitespace characters, and that's *it* -- the backtracking
possibility in the regexp \s+ is often a bug just waiting for unexpected
*context* to trigger it. A hairy regexp is pure hell; but what simpler
regexps can do don't require all that funky regexp machinery.

BTW, the mxTextTools engine could be used to get blazing implementations of
the primary Searcher methods (it excels at simple analysis). OTOH, making
lots of calls to analyze short strings is slow.
The only clean solutions to that are Perl's and Icon's (build everything
into one language so the compiler can optimize stuff away), and REBOL's
(make no distinction between code and data, so that code can be analyzed &
optimized at runtime -- and build the entire implementation around making
closures and calls supernaturally fast).

the-less-you-use-regexps-the-less-you-miss-'em-ly y'rs - tim

class CharSet:
    def __init__(self, seq):
        self.seq = seq
        d = {}
        for ch in seq:
            d[ch] = 1
        self.haskey = d.has_key

    def __call__(self, ch):
        return self.haskey(ch)

    def __add__(self, other):
        if isinstance(other, CharSet):
            other = other.seq
        return CharSet(self.seq + other)

def _normalize_index(i, n):
    assert n >= 0
    if i >= 0:
        return min(i, n)
    elif n == 0:
        return 0
    # want smallest q s.t. i + q*n >= 0
    # <-> q*n >= -i
    # <-> q >= -i/n
    # so q = ceiling(-i/n) = -floor(i/n)
    return i - (i/n)*n

class Searcher:
    def __init__(self, str, lo=0, hi=None):
        """Create object to search in str[lo:hi].

        lo defaults to 0.
        hi defaults to len(str).
        len(str) is repeatedly added to negative lo or hi until
        reaching a number >= 0.
        If lo > hi, a uselessly empty slice will be searched.

        The search cursor is initialized to lo.
        """

        self.s = str
        self.lo = _normalize_index(lo, len(str))
        if hi is None:
            self.hi = len(str)
        else:
            self.hi = _normalize_index(hi, len(str))
        if self.lo > self.hi:
            self.hi = self.lo
        self.i = self.lo
        self.lastmatch = None, None

    def any(self, charset, consume=1):
        """Try to match single character in charset.

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i = self.i
        if i < self.hi and charset(self.s[i]):
            if consume:
                self.__consume(i+1)
            return 1
        return 0

    def notany(self, charset, consume=1):
        """Try to match single character not in charset.

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i = self.i
        if i < self.hi and not charset(self.s[i]):
            if consume:
                self.__consume(i+1)
            return 1
        return 0

    def many(self, charset, consume=1):
        """Try to match one or more characters in charset.

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i, n, s = self.i, self.hi, self.s
        j = i
        while j < n and charset(s[j]):
            j = j+1
        if i < j:
            if consume:
                self.__consume(j)
            return 1
        return 0

    def notmany(self, charset, consume=1):
        """Try to match one or more characters not in charset.

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i, n, s = self.i, self.hi, self.s
        j = i
        while j < n and not charset(s[j]):
            j = j+1
        if i < j:
            if consume:
                self.__consume(j)
            return 1
        return 0

    def match(self, str, consume=1):
        """Try to match string "str".

        Return true iff match succeeded.
        Advance cursor iff success and optional arg "consume" is true.
        """

        i = self.i
        j = i + len(str)
        if self.s[i:j] == str:
            if consume:
                self.__consume(j)
            return 1
        return 0

    def get_str(self):
        """Return subject string."""
        return self.s

    def get_lo(self):
        """Return low slice bound."""
        return self.lo

    def get_hi(self):
        """Return high slice bound."""
        return self.hi

    def get_pos(self):
        """Return current value of search cursor."""
        return self.i

    def get_match_indices(self):
        """Return slice indices of last "consumed" match."""
        return self.lastmatch

    def get_match(self):
        """Return last "consumed" matching substring."""
        i, j = self.lastmatch
        if i is None:
            # Raise (the posted code returned the exception object instead).
            raise ValueError("no match to return!")
        return self.s[i:j]

    def set_pos(self, pos, consume=1):
        """Set search cursor to new value.

        No return value.
        If optional arg "consume" is true, the last match is set to the
        slice between pos and the current cursor position.
        """

        p = _normalize_index(pos, len(self.s))
        if not self.lo <= p <= self.hi:
            raise ValueError("pos out of bounds: " + `pos`)
        if consume:
            self.__consume(p)
        else:
            self.i = p

    def move_pos(self, incr, consume=1):
        """Move the cursor by incr characters.

        No return value.
        If the new value is outside the slice bounds, it's clipped.
        If optional arg "consume" is true, the last match is set to the
        slice between the old and new cursor positions.
        """

        newi = self.i + incr
        if newi < self.lo:
            newi = self.lo
        elif newi > self.hi:
            newi = self.hi
        if consume:
            self.__consume(newi)
        else:
            self.i = newi

    def __consume(self, newi):
        i, j = self.i, newi
        if i > j:
            i, j = j, i
        self.lastmatch = i, j
        self.i = newi

From tim_one at email.msn.com  Thu Dec 30 07:09:14 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 30 Dec 1999 01:09:14 -0500
Subject: [Python-Dev] Fixed-decimal types
In-Reply-To: <199912231944.OAA23337@eric.cnri.reston.va.us>
Message-ID: <000201bf528c$657c3080$a02d153f@tim>

[Guido]
> ...
> Not arguing for this interpretation, just indicating that doing
> fixed precision arithmetic right is hard.

It's not so much hard as it is arbitrary. The floating-point world is
standardized now, but the fixed-point world remains a mish-mash of
incompatible legacy schemes carried across generations of products for no
reason other than product-specific compatibility. So despite that
fixed-point has a specialty audience, whatever rules Python chooses will
leave it incompatible with much of that audience's (mixed!) expectations.

If fixed-point is needed, and my FixedPoint.py isn't good enough (all other
fixed point pkgs I've seen for Python were braindead), then it should be
implemented such that developers can control both rounding and precision
propagation.
I'll attach suitable kernels; they haven't been tested but any bugs
discovered will be trivial to fix (there are no difficulties here, but
typos are likely); the kernels supply the bulk of what's required,
whether implemented in Python or C; various packages can wrap them to
supply whatever policies they like; see FixedPoint.py for exact
string<->FixedPoint and exact float->FixedPoint conversions; and that's
the end of my involvement in fixed-point.

Python should certainly *not* add a "scale factor" to its current long
implementation; fixed-point should be a distinct type, as scale-factor
fiddling is clumsy and pervasive (long arithmetic is challenging enough
to get correct and quick without this obfuscating distraction; and by
leaving scale factors out of it, it's much easier to plug in alternative
bigint implementations (like GMP)).

One other point: some people are going to want BCD (binary-coded
decimal), which suffers the same mish-mash of legacy policies, but with
a different data representation.  The point is that many commercial
applications spend much more time doing I/O conversions than arithmetic,
and BCD accepts slow arithmetic (in the absence of special HW support)
in return for fast scaling & I/O conversion.

Forgetting the database-heads for a moment, decimal *floating*-point is
what calculators do, so that's what "real people" are most comfortable
with.  The IEEE-854 std (IEEE-754's younger and friendlier brother)
specifies that completely.  Add a means to boost "global" precision (a
la REXX), and it's a powerful tool even for experts (benefits
approximating those of unbounded rational arithmetic but with bounded &
user-controllable expense).

can-never-have-too-many-numeric-types-but-always-have-
    too-few-literal-notations-ly y'rs  - tim

# Kernels for fixed-point decimal arithmetic.
# _add, _sub, _mul, _div all have arglist
#     n1, p1, n2, p2, p, round=DEFAULT_ROUND
# n1 and n2 are longs; p1, p2 and p ints >= 0.
# The inputs are exactly n1/10**p1 and n2/10**p2.
#
# The return value is the integer n such that n/10**p is the best
# approximation to the infinite-precision result.  In other words, p1
# and p2 are the input precisions and p is the desired output
# precision, where precision is the # of digits *after* the decimal
# point.
#
# What "best approximation" means is determined by the round function.
# In many cases rounding isn't required, but when it is
#     round(top, bot)
# is returned.  top and bot are longs, with bot > 0 guaranteed.  The
# infinite-precision result is top/bot.  round must return an integer
# (long) approximation to top/bot, using whichever rounding discipline
# you want.  By default, IEEE round-to-nearest/even is used; see the
# _roundXXX functions for examples of suitable rounding functions.
#
# Note:  The only code here that knows we're working in decimal is
# function _tento; simply change the "10L" in that to do fixed-point
# arithmetic in some other base.
#
# Example:
#
# >>> r7 = _div(1L, 0, 7L, 0, 20)  # 1/7
# >>> r7
# 14285714285714285714L
# >>> r5 = _div(1L, 0, 5L, 0, 20)  # 1/5
# >>> r5
# 20000000000000000000L
# >>> sum = _add(r7, 20, r5, 20, 20)  # 1/7 + 1/5 = 12/35
# >>> sum
# 34285714285714285714L
# >>> _mul(sum, 20, 35L, 0, 20)
# 1199999999999999999990L
# >>> _mul(sum, 20, 35L, 0, 18)
# 12000000000000000000L
# >>> _mul(sum, 20, 35L, 0, 0)
# 12L
# >>>

###################################################################
# Sample rounding functions.
###################################################################

# Round to minus infinity.
def _roundminf(top, bot):
    assert bot > 0
    return top / bot

# Round to plus infinity.
def _roundpinf(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    if r:
        q = q + 1
    return q

# IEEE nearest/even rounding (closest integer; in case of tie closest
# even integer).
def _roundne(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    c = cmp(r << 1, bot)  # c < 0 <-> r < bot/2, etc
    if c > 0 or (c == 0 and (q & 1) == 1):
        q = q + 1
    return q

# "Add a half and chop" rounding (remainder < 1/2 toward 0; remainder
# >= half away from 0).
def _roundhalf(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    c = cmp(r << 1, bot)  # c < 0 <-> r < bot/2, etc
    if c > 0 or (c == 0 and q >= 0):
        q = q + 1
    return q

# Round toward 0 (throw away remainder).
def _roundchop(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    if r and q < 0:
        q = q + 1
    return q

###################################################################
# Kernels for + - * /.
###################################################################

DEFAULT_ROUND = _roundne

def _add(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    # (n1/10**p1 + n2/10**p2) * 10**p ==
    # (n1*10**(max-p1) + n2*10**(max-p2))/10**max * 10**p
    max = p1  # until proven otherwise
    if p1 < p2:
        n1 = n1 * _tento(p2 - p1)
        max = p2
    elif p2 < p1:
        n2 = n2 * _tento(p1 - p2)
    n3 = n1 + n2
    p3 = p - max
    if p3 > 0:
        n3 = n3 * _tento(p3)
    elif p3 < 0:
        n3 = round(n3, _tento(-p3))
    return n3

def _sub(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    return _add(n1, p1, -n2, p2, p, round)

def _mul(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    # (n1/10**p1 * n2/10**p2) * 10**p ==
    # (n1*n2)/10**(p1+p2) * 10**p
    n3 = n1 * n2
    p3 = p - p1 - p2
    if p3 > 0:
        n3 = n3 * _tento(p3)
    elif p3 < 0:
        n3 = round(n3, _tento(-p3))
    return n3

def _div(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    if n2 == 0:
        raise ZeroDivisionError("scaled integer")
    # (n1/10**p1 / n2/10**p2) * 10**p ==
    # (n1/n2) * 10**(p2-p1+p)
    p3 = p2 - p1 + p
    if p3 > 0:
        n1 = n1 * _tento(p3)
    elif p3 < 0:
        n2 = n2 * _tento(-p3)
    if n2 < 0:
        n1 = -n1
        n2 = -n2
    return round(n1, n2)

def _tento(i, _cache={}):
    assert i >= 0
    try:
        return _cache[i]
    except KeyError:
        answer = _cache[i] = 10L ** i
        return answer

From fredrik at pythonware.com  Thu Dec 30 12:05:45 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 30 Dec 1999 12:05:45 +0100
Subject: [Python-Dev] Better text processing support in py2k?
References: <000001bf528c$5cbdb9a0$a02d153f@tim>
Message-ID: <012d01bf52b5$d6a3cb00$f29b12c2@secret.pythonware.com>

Tim Peters is back from his vacation:
> > While I don't want to turn Python into Perl, I would like to see
> > it do a better job of what most people probably use the language
> > for.  Here is a very short list of things I think need attention:
> >
> > 1. [*A* clear way to do memory- and time-efficient textfile
> >    input]
>
> I agree, but unsure how to fix it.  The best way to write this now is
>
>     # f is some open file object.
>     while 1:
>         lines = f.readlines(BUFSIZE)
>         if not lines:
>             break
>         for line in lines:
>             process(line)
>
> and it's not something anyone figures out on their own -- or enjoys
> typing or explaining afterwards.
>
> Perl gets its line-at-a-time speed by peeking and poking C FILE structs
> directly in compiler- and platform-specific ways -- ways that vendors
> *should* have done in their own fgets implementations, but almost never
> do.  I have no idea whether it works well with Perl's nascent notions of
> threading, but in the absence of that "the system" doesn't know Perl is
> cheating (i.e., as far as libc+friends are concerned, Perl *is* reading
> one line at a time -- even mixing in C-level ungetc calls works (well,
> sometimes <0.1 wink -- they don't always peek and poke enough fields>)).
>
> The Python QIO extension module is much easier to port but less
> compatible (it doesn't use stdio, so QIO-opened files don't play well
> with others) and slower (although that's likely repairable -- he's got
> two passes over the buffer where one hairier pass should suffice).

we have something called SIO which uses memory mapping
where possible, and just a more aggressive read-ahead for
other cases.  on a windows box, a traditional while/readline
loop runs 3-5 times faster than before.  with SRE instead of
re, a while/readline/match loop runs up to 10 times faster
than before.

note that this is without *any* changes to the Python
source code...

> > 2. The re module needs to be sped up, if not to catch up with
> >    Perl, then to catch up with the deprecated regex module.
>
> The irony here is that the re engine is very often unboundedly faster
> than the regex engine -- provided you're chewing over large strings.
> Some tests /F ran showed that the length-independent *overhead* of
> invoking re is about 10x higher than for regex.  Presumably the bulk of
> that is due to re.py, i.e. that you get to the re engine via going thru
> Python layers on your way in and out, while regex was pure C.

I've attached some old benchmarks.  I think the current
code base is a bit faster, but you get the idea.

> In any case, /F is working on a new engine (for Unicode), and I believe
> he has this all well in hand.

with a little luck, the new module will replace both pcre
and regex...

not to mention that it's fairly easy to write your own front-
end to the matching engine -- the expression parser and the
compiler are both written in good old python.
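
A rough way to observe the fixed per-call overhead discussed above on a
present-day Python 3, whose re module is built on the SRE engine.  This
is only a sketch: time_search and its padding scheme are assumptions
made for illustration and are not taken from sre_bench.py.

```python
import re
import timeit

def time_search(pattern, pad, repeat=200):
    """Time `repeat` searches for pattern in a target preceded by pad filler chars.

    Returns (seconds via re.search, seconds via a precompiled pattern).
    """
    text = "-" * pad + "Perl"
    compiled = re.compile(pattern)          # compile once, outside the timed loop
    via_module = timeit.timeit(lambda: re.search(pattern, text), number=repeat)
    via_object = timeit.timeit(lambda: compiled.search(text), number=repeat)
    return via_module, via_object

if __name__ == "__main__":
    for pad in (0, 50, 5000):
        module_time, object_time = time_search("Python|Perl", pad)
        print("%6d  re.search: %.6f  compiled.search: %.6f"
              % (pad, module_time, object_time))
```

For short targets the two columns mostly measure call overhead; as the
padding grows, the time spent inside the engine dominates.
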

$ python sre_bench.py

           0      5     50    250   1000   5000  25000
-----  -----  -----  -----  -----  -----  -----  -----

search for Python|Perl in Perl ->
sre8   0.007  0.008  0.010  0.010  0.020  0.073  0.349
sre16  0.007  0.007  0.008  0.010  0.020  0.075  0.353
re     0.097  0.097  0.101  0.103  0.118  0.175  0.480
regex  0.007  0.007  0.009  0.020  0.059  0.271  1.320

search for (Python|Perl) in Perl ->
sre8   0.007  0.007  0.007  0.010  0.020  0.074  0.344
sre16  0.007  0.007  0.008  0.010  0.020  0.074  0.347
re     0.110  0.104  0.111  0.115  0.125  0.184  0.559
regex  0.006  0.006  0.009  0.019  0.057  0.285  1.432

search for Python in Python ->
sre8   0.007  0.007  0.007  0.011  0.021  0.072  0.387
sre16  0.007  0.007  0.008  0.010  0.022  0.082  0.365
re     0.107  0.097  0.105  0.102  0.118  0.175  0.511
regex  0.009  0.008  0.010  0.018  0.036  0.139  0.708

search for .*Python in Python ->
sre8   0.008  0.007  0.008  0.011  0.021  0.079  0.379
sre16  0.008  0.008  0.008  0.011  0.022  0.075  0.402
re     0.102  0.108  0.119  0.183  0.400  1.545  7.284
regex  0.013  0.019  0.072  0.318  1.231  8.035 45.366

search for .*Python.* in Python ->
sre8   0.008  0.008  0.008  0.011  0.021  0.080  0.383
sre16  0.008  0.008  0.008  0.011  0.021  0.079  0.395
re     0.103  0.108  0.119  0.184  0.418  1.685  8.378
regex  0.013  0.020  0.073  0.326  1.264  9.961 46.511

search for .*(Python) in Python ->
sre8   0.007  0.008  0.008  0.011  0.021  0.077  0.378
sre16  0.007  0.008  0.008  0.011  0.021  0.077  0.444
re     0.108  0.107  0.134  0.240  0.637  2.765 13.395
regex  0.026  0.112  3.820 87.322 (skipped)

search for .*P.*y.*t.*h.*o.*n.* in Python ->
sre8   0.010  0.010  0.014  0.031  0.093  0.419  2.212
sre16  0.010  0.011  0.014  0.030  0.093  0.419  2.292
re     0.112  0.121  0.195  0.521  1.747  8.298 40.877
regex  0.026  0.048  0.248  1.148  4.550 24.720    ...

(searching for patterns in padded strings; sre8 is the sre engine
compiled for 8-bit characters, sre16 is the same engine compiled
for 16-bit characters)

From mal at lemburg.com  Thu Dec 30 12:52:50 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 30 Dec 1999 12:52:50 +0100
Subject: [Python-Dev] Better text processing support in py2k?
References: <000001bf528c$5cbdb9a0$a02d153f@tim>
Message-ID: <386B4792.A551022A@lemburg.com>

Tim Peters wrote:
>
> [Skip Montanaro, wants nicer text facilities]
> > While I don't want to turn Python into Perl, I would like to see
> > it do a better job of what most people probably use the language
> > for.  Here is a very short list of things I think need attention:
> >
> > 1. [*A* clear way to do memory- and time-efficient textfile
> >    input]
>
> ...
>
> The Python QIO extension module is much easier to port but less
> compatible (it doesn't use stdio, so QIO-opened files don't play well
> with others) and slower (although that's likely repairable -- he's got
> two passes over the buffer where one hairier pass should suffice).

What is QIO ?

> > Depending how far people want to go with things, adding some
> > language syntax to support regular expressions might be in order.
>
> ...
> > 3. I've not yet used it, but I am told the pattern matching in
> >    Marc-Andre Lemburg's mxTextTools
> >    (http://starship.python.net/crew/lemburg/)
> >    is both powerful and efficient (though it certainly appears
> >    complex).  Perhaps it deserves consideration for
> >    incorporation into the core Python distribution.
>
> It's not complex, it's complicated -- and *that's* what makes it
> un-Pythonic.  Tony Ibbs has written a friendly wrapper around
> mxTextTools that suppresses much of the non-essential complication.
> OTOH, if you go into this with a regexp mindset, it will run much slower
> than a real regexp package, because the bulk of the latter is devoted to
> doing optimization; mxTextTools is WYSIWYG (it screams if you code to
> its strengths, but crawls if you e.g. try to implement naive
> backtracking).

All true. mxTextTools provides the tools, not the magic.
But this is also its strength: you can optimize the hell out of
your particular parsing requirement without having to think
about how the RE optimizer works.

> You should go to the REBOL site and look at the description of REBOL's
> PARSE verb in the FAQ ... mumble, mumble ... at
>
>     http://www.rebol.com/faq.html#11550948
>
> Here's an example pulled from that page (this is a REBOL code fragment):
>
>     digit: charset "0123456789"
>     expr: [term ["+" | "-"] expr | term]
>     term: [factor ["*" | "/"] term | factor]
>     factor: [primary "**" factor | primary]
>     primary: [value | "(" expr ")"]
>     value: [digit value | digit]
>
>     parse "1 + 2 ** 9" expr
>
> There hasn't been a pattern scheme this clean, convenient or powerful
> since SNOBOL4.  It exploits REBOL's Forth-like (lack of!) syntax, and
> Smalltalk-like penchant for passing around thunks (anonymous closures --
> "[...]" in REBOL builds a lexically-scoped entity called "a block",
> which can be treated as code (executed) or data (manipulated like a
> Python list) at will).

Looks nice indeed, but how does executable code fit into
that definition ?

(mxTextTools allows you to write your own parsing elements
in Python, BTW; it should be possible to use those mechanisms
to achieve a similar integration.)

> ...
>
> BTW, the mxTextTools engine could be used to get blazing implementations
> of the primary Searcher methods (it excels at simple analysis).  OTOH,
> making lots of calls to analyze short strings is slow.

That's why mxTextTools converts these search idioms into byte
codes which it executes at C level. Some future version will
even "precompile" the tuple input and then omit the type checks
during the search... that should give another noticeable speedup.
Note that recursion etc. can be done at C level too -- Python
function calls are not needed.
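
To make the table-as-byte-code idea concrete, here is a toy interpreter
in present-day Python 3.  It is a simplified model only: the command
names ALL_IN and AT_EOF, the NotIn helper and the jump rules are
assumptions chosen for illustration, not the mxTextTools API, and the
real engine runs this kind of loop in C.

```python
# Toy model only; names and semantics are illustrative assumptions.
ALL_IN = "all_in"    # match one or more characters contained in the argument
AT_EOF = "at_eof"    # match only when the cursor is at the end of the text

class NotIn:
    """Complement of a character set, usable with the `in` operator."""
    def __init__(self, chars):
        self.chars = chars
    def __contains__(self, ch):
        return ch not in self.chars

def run_table(text, table):
    """Interpret (tag, command, argument, jump_on_failure) entries."""
    found = []
    pos = 0      # cursor into text
    index = 0    # current table entry; running off the table ends the scan
    while 0 <= index < len(table):
        tag, command, argument, on_failure = table[index]
        end = pos
        if command == ALL_IN:
            while end < len(text) and text[end] in argument:
                end += 1
            matched = end > pos
        elif command == AT_EOF:
            matched = pos == len(text)
        else:
            raise ValueError("unknown command: %r" % (command,))
        if matched:
            if tag is not None:
                found.append(text[pos:end])   # record tagged matches only
            pos = end
            index += 1             # success: continue with the next entry
        else:
            index += on_failure    # failure: take the relative jump
    return found

WHITESPACE = " \t\n"
split_table = [
    (None, ALL_IN, WHITESPACE, +1),            # skip whitespace, if any
    ("text", ALL_IN, NotIn(WHITESPACE), +1),   # collect one word, if any
    (None, AT_EOF, None, -2),                  # not at the end yet: loop
]

words = run_table("  spam eggs  ham ", split_table)
```

Each entry is data, so the whole scan is a loop over small tuples with
relative jumps, which is why it reads like assembler but needs no
per-token function call in the host language.
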

> The only clean solutions to that are Perl's and Icon's (build
> everything into one language so the compiler can optimize stuff away),
> and REBOL's (make no distinction between code and data, so that code can
> be analyzed & optimized at runtime -- and build the entire
> implementation around making closures and calls supernaturally fast).

Just for kicks, here is the mysplit() function using mxTextTools:

    from mx.TextTools import *

    table = (
        # Match all whitespace
        (None,AllInSet,whitespace_set,+1),
        # Match and tag all non-whitespace
        ('text',AllInSet + AppendMatch,nonwhitespace_set,+1),
        # Loop until EOF
        (None,EOF,Here,-2),
        )

    def mysplit(text):

        return tag(text,table)[1]

The timings:
 mysplit:       5.84 sec.
 string.split:  3.62 sec.

Note that you can customize the above to split text at any
character set you like, not just whitespace... without
compiling or writing C code. The function mx.TextTools.setsplit()
provides this functionality as pure C function.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000: Get ready to party !
Business:       http://www.lemburg.com/
Python Pages:   http://www.lemburg.com/python/

From jim at interet.com  Thu Dec 30 15:21:36 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 30 Dec 1999 09:21:36 -0500
Subject: [Python-Dev] zipfile.py
References: <3857B97E.3684224F@interet.com> <386a582d.6762574@pipmail.dknet.dk>
Message-ID: <386B6A70.3C9A0042@interet.com>

Finn Bock wrote:
>
> James C. Ahlstrom wrote:
> > ftp://ftp.interet.com/pub/pylib.html
>
> I feel that it smells a bit too much like a tool and too little like a
> general programming api.

It was meant to be an API except for writepy(), which is
clearly a tool.

> - It can only add disk files. The ability to write data to a zip entry
>   through a file-like object or from a string would make it more like
>   an API, IMHO

I could add a method
    writestr(self, string, year, month, day, hour, minute, second, ...)
There are a lot of fields required which usually come from the file.

> - Some kind of access to the TOC entry fields (date, size, compressed
>   size etc) also seems like a nice feature.

This access is provided directly by self.TOC, and the fields are
documented.

> - The data for an entry must be available in memory. Could be a problem
>   for huge files, but most likely not in practical use.

I agree, but adding loops will make it slower.  What do others think?

> I admit that I am fond of the api from java.util.zip.ZipFile and
> java.util.zip.ZipOutputStream.

I don't know this API.  If writestr() is not sufficient, what
API would you like?

JimA

From bckfnn at pipmail.dknet.dk  Thu Dec 30 20:14:14 1999
From: bckfnn at pipmail.dknet.dk (Finn Bock)
Date: Thu, 30 Dec 1999 19:14:14 GMT
Subject: [Python-Dev] zipfile.py
In-Reply-To: <386B6A70.3C9A0042@interet.com>
References: <3857B97E.3684224F@interet.com> <386a582d.6762574@pipmail.dknet.dk> <386B6A70.3C9A0042@interet.com>
Message-ID: <386baec9.2867733@pipmail.dknet.dk>

[I wrote]
> - It can only add disk files. The ability to write data to a zip entry
>   through a file-like object or from a string would make it more like
>   an API, IMHO

[JimA wrote]
>I could add a method
>    writestr(self, string, year, month, day, hour, minute, second, ...)
>There are a lot of fields required which usually come from the file.

Something like that seems fine to me.

[I wrote]
> - Some kind of access to the TOC entry fields (date, size, compressed
>   size etc) also seems like a nice feature.

[JimA answers]
>This access is provided directly by self.TOC, and the fields are
>documented.

Good enough. My bad, I was looking for getter methods. (me being a java
dude)

[I wrote]
> I admit that I am fond of the api from java.util.zip.ZipFile and
> java.util.zip.ZipOutputStream.

[JimA asks]
>I don't know this API.  If writestr() is not sufficient, what
>API would you like?

This is only meant as a source for inspiration, certainly not as a
request for change.
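
For comparison, a small sketch using the zipfile module as found in
today's Python 3 standard library, which provides writestr() for
in-memory data and ZipInfo objects for the per-entry fields.  The entry
names and contents below are made up for illustration.

```python
import io
import zipfile

buffer = io.BytesIO()                      # keep the whole archive in memory
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    # Simple form: entry name plus data; the timestamp defaults to "now".
    archive.writestr("hello.txt", "hello, world\n" * 100)
    # Explicit form: a ZipInfo object carries the date fields.
    info = zipfile.ZipInfo("dated.txt", date_time=(1999, 12, 30, 9, 21, 36))
    archive.writestr(info, b"written from a string")

with zipfile.ZipFile(buffer) as archive:
    entries = {item.filename: item for item in archive.infolist()}
    first = entries["hello.txt"]
    sizes = (first.file_size, first.compress_size)
    with archive.open("dated.txt") as member:   # file-like read access
        payload = member.read()
```

Here the date fields travel in one ZipInfo argument instead of six
separate parameters, and infolist() exposes the table-of-contents
fields as attributes.
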
writestr would answer my complaint nicely.

Below, only one ZipEntry can be actively read or written to at a time.
All the small details of performance and implementation complexity are
ignored.

class ZipFile:
    def getEntry(self, name):
        ...
        self.activeentry = ZipEntry(name)
        return self.activeentry

class ZipEntry:
    # enough methods and fields to fake file-ness to casual users like me.
    def write(self, str): ...
    def writelines(self, list): ...
    def read(self, size=None): ...
    def readlines(self, sizehint=-1): ...
    def seek(self, offset): ...
    def flush(self): ...
    def close(self): ...

    def getSize(self): ...
    def getCompressedSize(self): ...
    def getFlags(self): ...

regards,
finn

From tim_one at email.msn.com  Fri Dec 31 04:35:18 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 30 Dec 1999 22:35:18 -0500
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <386B4792.A551022A@lemburg.com>
Message-ID: <000001bf5340$0fb20300$e12d153f@tim>

[M.-A. Lemburg]
> What is QIO ?

See DejaNews (I don't save URLs).  "Quick" line-oriented text input
adapted from INN.  Someone rewrote that as a Python extension module.

>>     http://www.rebol.com/faq.html#11550948

> Looks nice indeed, but how does executable code fit into
> that definition ?

See the URL above I didn't save.  PARSE's "pattern" argument is a block.
Blocks can be (& often are) nested.  Whether any given block is code or
data is all the same to REBOL, so passing nested code blocks in PARSE's
pattern argument is easy.  Because blocks are lexically scoped,
assignments (etc) inside a block are (well, can be) visible to its
context; etc.  It's a very Lispish approach.  REBOL is essentially
Scheme under the covers, but with syntax much more like Forth's
(whitespace-separated strings of arbitrary non-whitespace characters,
with few pre-assigned meanings or restrictions -- in fact, it's
impossible for a compiler to determine where a REBOL function call
begins or ends!  can't be known until runtime).

> (mxTextTools allows you to write your own parsing elements
> in Python, BTW; it should be possible to use those mechanisms
> to achieve a similar integration.)

It can't capture the flavor -- although I don't know that it needs to.
There's no distinction between "the pattern language" and "the
computational language" in REBOL or Icon, and it's hard to explain what
a maddening distinction that can be once you've lived without it.
mxTextTools embedding would feel more like Icon, where the matching
engine is fully exposed to the programmer (REBOL hides it, allowing only
"approved" interactions).

>> OTOH, making lots of calls to analyze short strings is slow.

> That's why mxTextTools converts these search idioms into byte
> codes which it executes at C level. Some future version will
> even "precompile" the tuple input and then omit the type checks
> during the search...that should give another noticeable speedup.
> Note that recursion etc. can be done at C level too -- Python
> function calls are not needed.

That's also the curse of having distinct languages; e.g., Python already
had recursion, but you needed to reimplement it in a different way with
different syntax and different rules in your pattern language.  In Icon
etc, there's no difference between a recursive pattern and a recursive
function, except in *what* it computes.  The machinery is all the same,
and both more powerful and easier to learn because of that.

> ...
> Just for kicks, here is the mysplit() function using mxTextTools:
>
>     from mx.TextTools import *
>
>     table = (
>         # Match all whitespace
>         (None,AllInSet,whitespace_set,+1),
>         # Match and tag all non-whitespace
>         ('text',AllInSet + AppendMatch,nonwhitespace_set,+1),
>         # Loop until EOF
>         (None,EOF,Here,-2),
>         )
>
>     def mysplit(text):
>
>         return tag(text,table)[1]
>
> The timings:
>  mysplit:       5.84 sec.
>  string.split:  3.62 sec.
>
> Note that you can customize the above to split text at any
> character set you like, not just whitespace... without
> compiling or writing C code.

That's equally true of the example I posted.  Now what if I wanted to
stop splitting right after I find a keyword, recognized as such because
it's a key in some passed-in dictionary?  In my example, I make an
obvious local code change, from

    while s.notmany(white):   # consume non-whitespace
        result.append(s.get_match())
        s.many(white)

to

    while s.notmany(white):   # consume non-whitespace
        word = s.get_match()
        result.append(word)
        if dictionary.has_key(word):
            break
        s.many(white)

What does it do to your example?  Or what if the target string isn't "a
string" (the code I posted only assumes the "str" object responds to
indexing and slicing -- any buffer object is fine -- so my example
doesn't change at all)?  Or what if you need to pass the tokens on as
they're found, pipeline style?  Etc.  This is why I do complex string
processing in Icon <0.9 wink>.

OTOH, at what it does well, mxTextTools runs quicker than Icon.  Its
biggest problem has always been that e.g. nobody knows what the hell

    (None,EOF,Here,-2),

*means* at first glance -- or third.

an-extreme-on-the-transparency-vs-speed-curve-ly y'rs  - tim

From mal at lemburg.com  Fri Dec 31 12:18:57 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 31 Dec 1999 12:18:57 +0100
Subject: [Python-Dev] Better text processing support in py2k?
References: <000001bf5340$0fb20300$e12d153f@tim>
Message-ID: <386C9121.E9D9DC01@lemburg.com>

Tim Peters wrote:
>
> [M.-A. Lemburg]
> > What is QIO ?
>
> See DejaNews (I don't save URLs).  "Quick" line-oriented text input
> adapted from INN.  Someone rewrote that as a Python extension module.

Ok, thanks.

> >>     http://www.rebol.com/faq.html#11550948
>
> > Looks nice indeed, but how does executable code fit into
> > that definition ?
>
> See the URL above I didn't save.  PARSE's "pattern" argument is a
> block.  Blocks can be (& often are) nested.  Whether any given block is
> code or data is all the same to REBOL, so passing nested code blocks in
> PARSE's pattern argument is easy.  Because blocks are lexically scoped,
> assignments (etc) inside a block are (well, can be) visible to its
> context; etc.  It's a very Lispish approach.  REBOL is essentially
> Scheme under the covers, but with syntax much more like Forth's
> (whitespace-separated strings of arbitrary non-whitespace characters,
> with few pre-assigned meanings or restrictions -- in fact, it's
> impossible for a compiler to determine where a REBOL function call
> begins or ends!  can't be known until runtime).

If I understand the concept correctly, I think Python could
do pretty much the same thing. The bummer is of course the
need for new keywords and byte codes (although these could be
split out into a separate text scanning engine). Using Python
function calls would slow down things to an extent that would
render the added functionality useless, well IMHO anyways ;-)

> > (mxTextTools allows you to write your own parsing elements
> > in Python, BTW; it should be possible to use those mechanisms
> > to achieve a similar integration.)
>
> It can't capture the flavor -- although I don't know that it needs to.
> There's no distinction between "the pattern language" and "the
> computational language" in REBOL or Icon, and it's hard to explain what
> a maddening distinction that can be once you've lived without it.
> mxTextTools embedding would feel more like Icon, where the matching
> engine is fully exposed to the programmer (REBOL hides it, allowing only
> "approved" interactions).

Of course it's hard for a Turing Machine to capture the flavor
of any high level language :-) When you're programming the
mxTextTools Tagging Engine directly you feel like writing
assembler... but things are moving in the right direction:
Tony Ibbs has a nice meta-language and M.C. Fletcher his
SimpleParse to cover up these insufficiencies.

> >> OTOH, making lots of calls to analyze short strings is slow.
>
> > That's why mxTextTools converts these search idioms into byte
> > codes which it executes at C level. Some future version will
> > even "precompile" the tuple input and then omit the type checks
> > during the search...that should give another noticeable speedup.
> > Note that recursion etc. can be done at C level too -- Python
> > function calls are not needed.
>
> That's also the curse of having distinct languages; e.g., Python already
> had recursion, but you needed to reimplement it in a different way with
> different syntax and different rules in your pattern language.  In Icon
> etc, there's no difference between a recursive pattern and a recursive
> function, except in *what* it computes.  The machinery is all the same,
> and both more powerful and easier to learn because of that.

Agreed.

> > ...
> > Just for kicks, here is the mysplit() function using mxTextTools:
> >
> >     from mx.TextTools import *
> >
> >     table = (
> >         # Match all whitespace
> >         (None,AllInSet,whitespace_set,+1),
> >         # Match and tag all non-whitespace
> >         ('text',AllInSet + AppendMatch,nonwhitespace_set,+1),
> >         # Loop until EOF
> >         (None,EOF,Here,-2),
> >         )
> >
> >     def mysplit(text):
> >
> >         return tag(text,table)[1]
> >
> > The timings:
> >  mysplit:       5.84 sec.
> >  string.split:  3.62 sec.
> >
> > Note that you can customize the above to split text at any
> > character set you like, not just whitespace... without
> > compiling or writing C code.
>
> That's equally true of the example I posted.  Now what if I wanted to
> stop splitting right after I find a keyword, recognized as such because
> it's a key in some passed-in dictionary?  In my example, I make an
> obvious local code change, from
>
>     while s.notmany(white):   # consume non-whitespace
>         result.append(s.get_match())
>         s.many(white)
>
> to
>
>     while s.notmany(white):   # consume non-whitespace
>         word = s.get_match()
>         result.append(word)
>         if dictionary.has_key(word):
>             break
>         s.many(white)
>
> What does it do to your example?

You'd replace the 'text' tagobj with a callable object and write
AllInSet + CallTag as command. The Tagging Engine will then call
the object with arguments (taglist,text,l,r,subtags) and let it
decide what to do. In your example it would check the dictionary
and raise an exception in case a keyword is found to stop any
further scanning. If it's not a keyword, it would simply append the
found string to the taglist and return None.

Here's the code:

    from mx.TextTools import *
    import exceptions

    stoplist = {'abc':1, 'def':1}

    class KeywordFound(exceptions.StandardError):
        def __init__(self, taglist):
            self.taglist = taglist

    def callable(taglist,text,l,r,subtags):
        taglist.append(text[l:r])
        if stoplist.has_key(text[l:r]):
            raise KeywordFound(taglist)

    table = (
        # Match all whitespace
        (None,AllInSet,whitespace_set,+1),
        # Match and tag all non-whitespace
        (callable,AllInSet + CallTag,nonwhitespace_set,+1),
        # Loop until EOF
        (None,EOF,Here,-2),
        )

    def mysplitex(text):
        try:
            return tag(text,table)[1]
        except KeywordFound,data:
            return data.taglist

> Or what if the target string isn't "a string" (the code I posted only
> assumes the "str" object responds to indexing and slicing -- any buffer
> object is fine -- so my example doesn't change at all)?

The current version only handles string objects, but I am
already beginning to convert all the APIs in mxTextTools to
"s#" or "t#" style (can't decide which to use... "s#" is great
for processing raw data, while "t#" more closely refers to
text processing).

> Or what if you need to pass the tokens on as they're found,
> pipeline style?  Etc.  This is why I do complex string processing in
> Icon <0.9 wink>.

You can have all that extra magic via callable tag objects
or callable matching functions. It's not exactly nice to
write, but I'm sure that a meta-language could do the
conversions for you.

> OTOH, at what it does well, mxTextTools runs quicker than Icon.  Its
> biggest problem has always been that e.g. nobody knows what the hell
>
>     (None,EOF,Here,-2),
>
> *means* at first glance -- or third.

The structure of those tag tables is very simple:

    (tagobject, command, argument[, jump offset in case of failure
                                  [, jump offset in case of success]])

Please remember that this is byte code, not some higher level
abstraction. The design is very much inverted from what you'd
usually do: design a nice language and then try to find a
suitable set of byte codes to make it work as intended.

Anyway, I'll keep focussing on the speed aspect of mxTextTools;
others can focus on abstractions, so that eventually everybody
will be happy :-)

Happy New Year,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000: Get ready to party !
Business:       http://www.lemburg.com/
Python Pages:   http://www.lemburg.com/python/

From tim_one at email.msn.com  Fri Dec 31 23:53:49 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Fri, 31 Dec 1999 17:53:49 -0500
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <012d01bf52b5$d6a3cb00$f29b12c2@secret.pythonware.com>
Message-ID: <000701bf53e1$e7119760$472d153f@tim>

[Fredrik Lundh, whose very nice eMatter book is on sale until the end
 of the 20th century (as real people think of it), although the eMatter
 distribution scheme has lots of problems
     [just an editorial note from a bot who has to -- for unknown
      reasons Fatbrain "is working on" -- delete the Fatbrain registry
      tree and reregister the book almost every time he tries to open
      it]
]
> we have something called SIO which uses memory mapping
> where possible, and just a more aggressive read-ahead for
> other cases.  on a windows box, a traditional while/readline
> loop runs 3-5 times faster than before.  with SRE instead of
> re, a while/readline/match loop runs up to 10 times faster
> than before.
>
> note that this is without *any* changes to the Python
> source code...

If so, there's potential for significantly more speed.  Python does its
line-at-a-time input with a character-at-a-time macro-in-a-loop, the
same way naive vendors (read "almost all vendors") implement fgets.
It's replacing that inner loop with direct peeking into the FILE buffer
that gets Perl its dramatic speed -- despite that Perl has fancier input
functionality (the oft-requested automagical "input record separator").

So it sounds like the Perl trick is orthogonal to SIO's tricks; Perl
isn't doing mmaps or read-aheads or anything else fancy under the
covers -- it only optimizes the inner loop!

> ...
> with a little luck, the new module will replace both pcre
> and regex...

If something more tangible than luck would help to make this come true,
feel free to mention it.

> not to mention that it's fairly easy to write your own front-
> end to the matching engine -- the expression parser and the
> compiler are both written in good old python.

Ah, good news / bad news.
Perl refugees aren't accustomed to "precompiling" regexp objects, so
write code that will cause regexps to get recompiled over & over. Even
if you cache the results under the covers, the overhead of the Python
call to the regexp compiler will likely take as long as the engine takes
to search. Personally, in such cases, I think they should learn how to
use the language <0.5 wink>.

From tim_one at email.msn.com Fri Dec 31 23:53:56 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Fri, 31 Dec 1999 17:53:56 -0500
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <386C9121.E9D9DC01@lemburg.com>
Message-ID: <000901bf53e1$eb4248c0$472d153f@tim>

>> This is why I do complex string processing in Icon <0.9 wink>.

[MAL]
> You can have all that extra magic via callable tag objects
> or callable matching functions. It's not exactly nice to
> write, but I'm sure that a meta-language could do the
> conversions for you.

That wasn't my point: I do it in Icon because it *is* "exactly nice to
write", and doesn't require any yet-another meta-language. It's all
straightforward, in a way that separate schemes pasted together can never
be (simply because they *are* "separate schemes pasted together").

The point of my Python examples wasn't that they could do something
mxTextTools can't do, but that they were *Python* examples: every
variation I mentioned (or that you're likely to think of) was easy to
handle for any Python programmer because the "control flow" and "data
type" etc aspects could be handled exactly the way they always are in
*non* pattern-matching Python code too, rather than recoded in
pattern-scheme-specific different ways (e.g., where I had a vanilla
"if/break", you set up a special exception to tickle the matching
engine).

I'm not attacking mxTextTools, so don't feel compelled to defend it --
people using regexps in those examples are dead in the water.
mxTextTools is very good at what it does; if we have a real disagreement, it's probably that I'm less optimistic about the prospects for higher-level wrappers (e.g., MikeF's SimpleParse is much slower than "a real" BNF parsing system (ARBNFPS), in part because he isn't doing all the optimizations ARBNFPS does, but also in part because ARBNFPS uses an underlying engine more optimized to its specific task than mxTextTool's more-general engine *can* be). So I don't see mxTextTools as being the answer to everything -- and if you hadn't written it, you would agree with that on first glance . > Anyway, I'll keep focussing on the speed aspect of mxTextTools; > others can focus on abstractions, so that eventually everybody > will be happy :-) You and I will be, anyway . From guido at CNRI.Reston.VA.US Wed Dec 1 18:32:08 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 01 Dec 1999 12:32:08 -0500 Subject: [Python-Dev] Another 1.6 wish In-Reply-To: Your message of "Fri, 19 Nov 1999 14:59:11 CST." <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com> References: <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com> Message-ID: <199912011732.MAA10419@eric.cnri.reston.va.us> > My first Python-Dev post. :-) Welcome! > >We had some discussion a while back about enabling thread support by > >default, if the underlying OS supports it obviously. I agree with this. MacOS seems to be the only OS without threads these days. > What's the consensus about Python microthreads -- a likely candidate > for incorporation in 1.6 (or later)? What are microthreads? If you think about threads implemented in the Python VM instead of in the OS, forget it. > Also, we have a couple minor convenience functions for Python in an > MSDEV environment, an exposure of OutputDebugString for writing to > the DevStudio log window and a means of tripping DevStudio C/C++ layer > breakpoints from Python code (currently experimental). 
The msvcrt > module seems like a likely candidate for these, would these be > welcome additions? Sure -- send patches. --Guido van Rossum (home page: http://www.python.org/~guido/) From petrilli at amber.org Wed Dec 1 18:39:00 1999 From: petrilli at amber.org (Christopher Petrilli) Date: Wed, 1 Dec 1999 12:39:00 -0500 Subject: [Python-Dev] Another 1.6 wish In-Reply-To: <199912011732.MAA10419@eric.cnri.reston.va.us>; from guido@CNRI.Reston.VA.US on Wed, Dec 01, 1999 at 12:32:08PM -0500 References: <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com> <199912011732.MAA10419@eric.cnri.reston.va.us> Message-ID: <19991201123900.A7419@trump.amber.org> Guido van Rossum [guido at CNRI.Reston.VA.US] wrote: > > >We had some discussion a while back about enabling thread support by > > >default, if the underlying OS supports it obviously. > > I agree with this. MacOS seems to be the only OS without threads > these days. I believe the new GUISI package has pthread-API compatible threads implemented, which talk to the underlying ThreadManager. With MacOSX being impending before 1.6 (i.e. early 2000), I'd say this is a good way to go. Threads are VERY useful for a lot of problem domains. Chris -- | Christopher Petrilli | petrilli at amber.org From guido at CNRI.Reston.VA.US Wed Dec 1 18:54:53 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 01 Dec 1999 12:54:53 -0500 Subject: [Python-Dev] Another 1.6 wish In-Reply-To: Your message of "Wed, 01 Dec 1999 12:39:00 EST." <19991201123900.A7419@trump.amber.org> References: <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com> <199912011732.MAA10419@eric.cnri.reston.va.us> <19991201123900.A7419@trump.amber.org> Message-ID: <199912011754.MAA10465@eric.cnri.reston.va.us> > > I agree with this. MacOS seems to be the only OS without threads > > these days. > > I believe the new GUISI package has pthread-API compatible threads > implemented, which talk to the underlying ThreadManager. 
With MacOSX > being impending before 1.6 (i.e. early 2000), I'd say this is a good > way to go. Threads are VERY useful for a lot of problem domains. What's GUISI? The son of GUSI? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at CNRI.Reston.VA.US Wed Dec 1 18:55:19 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 01 Dec 1999 12:55:19 -0500 Subject: [Python-Dev] Another 1.6 wish In-Reply-To: Your message of "Wed, 01 Dec 1999 12:32:08 EST." <199912011732.MAA10419@eric.cnri.reston.va.us> References: <11A17AA2B9EAD111BCEA00A0C9B4179303385C08@molach.origin.ea.com> <199912011732.MAA10419@eric.cnri.reston.va.us> Message-ID: <199912011755.MAA10476@eric.cnri.reston.va.us> > > Also, we have a couple minor convenience functions for Python in an > > MSDEV environment, an exposure of OutputDebugString for writing to > > the DevStudio log window and a means of tripping DevStudio C/C++ layer > > breakpoints from Python code (currently experimental). The msvcrt > > module seems like a likely candidate for these, would these be > > welcome additions? > > Sure -- send patches. I hadn't seen Mark Hammond's response -- I take it back. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at CNRI.Reston.VA.US Wed Dec 1 19:15:26 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 01 Dec 1999 13:15:26 -0500 Subject: [Python-Dev] Another 1.6 wish In-Reply-To: Your message of "Sat, 20 Nov 1999 11:04:28 +1100." <005f01bf32ea$d0b82b90$0501a8c0@bobcat> References: <005f01bf32ea$d0b82b90$0501a8c0@bobcat> Message-ID: <199912011815.NAA10506@eric.cnri.reston.va.us> > This is really a pointer to the fact that some or all of the win32api > should be moved into the core - registry access is the thing people > most want, but there are plenty of other useful things that people > reguarly use... > > Guido objects to the coding style, but hopefully that wont be a big > issue. 
IMO, the coding style isnt "bad" - it is just more an "MS" > flavour than a "Python" flavour - presumably people reading the code > will have some experience with Windows, so it wont look completely > foreign to them. The good thing about taking it "as-is" is that it > has been fairly well bashed on over a few years, so is really quite > stable. The final "coding style" issue is that there are no "doc > strings" - all documentation is embedded in C comments, and extracted > using a tool called "autoduck" (similar to "autodoc"). However, Im > sure we can arrange something there, too. That's a good summary of the status quo. I would appreciate it if win32all could become part of the core. However the coding style issues need to be addressed (I also believe that it needs to be compiled in C++ mode). One concern that Mark doesn't mention is that there are some safety issues -- you can abuse some of the calls to cause segfaults, whether intentional or by mistake, and that's not a good thing. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at CNRI.Reston.VA.US Wed Dec 1 19:55:40 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 01 Dec 1999 13:55:40 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: Your message of "Wed, 24 Nov 1999 09:43:57 EST." <383BF9AD.E183FB98@interet.com> References: <383BF9AD.E183FB98@interet.com> Message-ID: <199912011855.NAA10662@eric.cnri.reston.va.us> > I would like to argue that on Windows, import of dynamic libraries is > broken. If a file something.pyd is imported, then sys.path is searched > to find the module. If a file something.dll is imported, the same thing > happens. But Windows defines its own search order for *.dll files which > Python ignores. I would suggest that this is wrong for files named > *.dll, > but OK for files named *.pyd. I think you misunderstand some of the issues. Python cannot import every .dll file. 
Only .dll files that conform to the convention for Python extension
modules can be imported. (The convention is that it must export an init
function.)

On most other platforms, shared libraries must have a specific extension
(e.g. .so on most Unix). Python allows you to drop such a file into any
directory where it looks for modules, and it will then direct the dynamic
load support to load that specific file.

This seems logical -- Python extensions must live in directories that
Python searches (Python must do its own search because the search order
is significant).

On Windows, Python uses the same strategy. The only modification is that
it is allowed to give the file a different extension, namely .pyd, to
indicate that this really is a Python extension and not a regular DLL.
This was mostly introduced because it is apparently common to have an
existing DLL "foo.dll" and write a Python wrapper for it that is also
called "foo". Clearly, two files foo.dll are too confusing, so we let you
name the wrapper foo.pyd. But because the file format is essentially that
of a DLL, we don't *require* this renaming; some ways of creating DLLs in
the first place may make it difficult to do.

> A SysAdmin should be able to install and maintain *.dll as she has
> been trained to do. This makes maintaining Python installations
> simpler and more un-surprising.

I don't see that a SysAdmin needs to do much DLL management. This is up
to installer scripts. Anyway, how hard can it be for a SysAdmin to leave
DLLs in specific directories alone?

> I have no solution to the backward compatibility problem. But the
> code is only a couple lines. A LoadLibrary() call does its own
> path searching.

But at what point should this LoadLibrary() call be called? The import
statement contains no clue that a DLL is requested -- the sys.path search
reveals that.

I claim that there is nothing wrong with the current strategy.
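
A minimal sketch of the lookup described above, written in present-day
Python rather than the 1.5-era code under discussion. It is not the
interpreter's actual import code: find_module_file and the SUFFIXES tuple
are hypothetical names chosen for illustration. It only shows why the
order of the directories decides which file wins.

```python
import os
import tempfile

# Hypothetical suffix list for illustration; real interpreters use a
# platform-specific set.
SUFFIXES = (".pyd", ".dll", ".py")


def find_module_file(name, search_path, suffixes=SUFFIXES):
    """Return the first existing file for `name`, honouring search_path order."""
    for directory in search_path:          # earlier directories win
        for suffix in suffixes:            # within one directory, suffix order decides
            candidate = os.path.join(directory, name + suffix)
            if os.path.isfile(candidate):
                return candidate
    return None


if __name__ == "__main__":
    # Two scratch directories standing in for sys.path entries A and B.
    dir_a = tempfile.mkdtemp()
    dir_b = tempfile.mkdtemp()
    open(os.path.join(dir_a, "foo.py"), "w").close()
    open(os.path.join(dir_b, "foo.dll"), "w").close()
    print(find_module_file("foo", [dir_a, dir_b]))  # the .py in dir_a
    print(find_module_file("foo", [dir_b, dir_a]))  # the .dll in dir_b
```

Swapping the two directories flips the result, which a fixed
"DLLs first" or "DLLs last" rule could not do.
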
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bwarsaw at cnri.reston.va.us Wed Dec 1 20:01:12 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Wed, 1 Dec 1999 14:01:12 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS log messages with diffs
References: <199911161700.MAA02716@eric.cnri.reston.va.us> <14389.31511.706588.20840@anthem.cnri.reston.va.us>
Message-ID: <14405.28792.184298.298597@anthem.cnri.reston.va.us>

>>>>> "BAW" == Barry A Warsaw writes:

    BAW> There was a suggestion to start augmenting the checkin emails
    BAW> to include the diffs of the checkin. This would let you keep
    BAW> a current snapshot of the tree without having to do a direct
    BAW> `cvs update'.

The voting has stopped, with the "yeah" vote slightly ahead of the "nay"
vote. We'll go with context diffs, and we'll be implementing Greg Stein's
approach with the xml-checkins list: truncating diffs to H number of
lines at the top and T number of lines at the bottom, so as not to
overwhelm incoming email.

I'll try to get this going sometime today (no promises). You'll likely
see a number of tests coming through python-checkins in the meantime.
I'll send a message out when it's done.

-Barry

From da at ski.org Wed Dec 1 20:34:56 1999
From: da at ski.org (David Ascher)
Date: Wed, 1 Dec 1999 11:34:56 -0800 (Pacific Standard Time)
Subject: [Python-Dev] Re: [C++-SIG] Python calling C++ issues
In-Reply-To: <14405.25141.297349.76968@gargle.gargle.HOWL>
Message-ID:

On Wed, 1 Dec 1999, Geoffrey Furnish wrote:

[...]
> Well, like I said above, I haven't analyzed your posts for technical
> details, so I can't say whether you made avoidable mistakes. But I
> definitely do agree with you that it is roughly 100 times harder than
> it needs to be, to use Python from C++. The charter of this sig is to
> fix that, by developing the additional software that would allow
> Python's compiled interface to be exploited from C++ "with ease".
> > The first and most basic issue, is compiling Python so it initializes > C++ global objects correctly. There is a patch on the sig's www site > to help with that. Any opinions from this esteemed body re: integrating said patch in the main tree? --david From jim at interet.com Wed Dec 1 20:47:14 1999 From: jim at interet.com (James C. Ahlstrom) Date: Wed, 01 Dec 1999 14:47:14 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <383BF9AD.E183FB98@interet.com> <199912011855.NAA10662@eric.cnri.reston.va.us> Message-ID: <38457B42.85552AC@interet.com> Guido van Rossum wrote: > > > I would like to argue that on Windows, import of dynamic libraries is > > broken. If a file something.pyd is imported, then sys.path is searched > > to find the module. If a file something.dll is imported, the same thing > > happens. But Windows defines its own search order for *.dll files which > > Python ignores. I would suggest that this is wrong for files named > > *.dll, > > but OK for files named *.pyd. > > I think you misunderstand some of the issues. > > Python cannot import every .dll file. Only .dll files that conform to > the convention for Python extension modules can be imported. (The > convention is that it must export an init function.) Of course I meant that the test is LoadLibrary(module) followed by GetProcAddress(h, "init" + module). Both must succeed. > This seems logical -- Python extensions must live in directories that > Python searches (Python must do its own search because the search > order is significant). The PYTHONPATH search path is what I am trying to get away from. If I eliminate PYTHONPATH I still can not use the Windows DLL search path (which is superior) because DLLs are searched on PYTHONPATH too; thus my post. I don't believe it is important for Python module.dll to be located on PYTHONPATH. > > A SysAdmin should be able to install and maintain *.dll as she has > > been trained to do. 
This makes maintaining Python installations > > simpler and more un-surprising. > > I don't see that a SysAdmin needs to do much DLL management. This is > up to installer scripts. Anyway how hard can it be for a SysAdmin to > leave DLLs in specific directories alone? The problem is maintaining PYTHONPATH plus having DLL's on a non-standard search path. Yes, PythonDev[:] and professional SysAdmins can do it. But it is not as simple as it could be. Someone has to write the install scripts. And what if something doesn't work? Think of Python being used as a teaching language for the 8th grade. Think of the 8th grade teacher trying to get all this right. The only thing that works is simplicity. > But at what point should this LoadLibrary() call be called? The > import statement contains no clue that a DLL is requested -- the > sys.path search reveals that. Just after built-in and frozen modules. > I claim that there is nothing with the current strategy. Thank you for thoughtfully considering and commenting at length on this issue. Lets ignore it for the moment. The other problems with PYTHONPATH are more pressing. But if those issues are solved, this one will stick out. JimA From da at ski.org Wed Dec 1 20:59:44 1999 From: da at ski.org (David Ascher) Date: Wed, 1 Dec 1999 11:59:44 -0800 (Pacific Standard Time) Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: <38457B42.85552AC@interet.com> Message-ID: On Wed, 1 Dec 1999, James C. Ahlstrom wrote: > > This seems logical -- Python extensions must live in directories that > > Python searches (Python must do its own search because the search > > order is significant). > > The PYTHONPATH search path is what I am trying to get away > from. If I eliminate PYTHONPATH I still can not use the > Windows DLL search path (which is superior) because DLLs > are searched on PYTHONPATH too; thus my post. I don't believe > it is important for Python module.dll to be located on PYTHONPATH. 
Why is the DLL search path superior?

In my experience, the DLL search path (PATH for short) is problematic
because it requires either using the System control panel or modifying
autoexec.bat, both of which can have massive systemic effects completely
unrelated to Python if a mistake is made during the modification.

On UNIX, the equivalent to Windows' PATH is typically LD_LIBRARY_PATH,
although I think there are significant variations in how that works
across platforms. Most beginning unix users have no idea how to modify
their LD_LIBRARY_PATH, as they typically don't understand the
configuration mechanisms on Unix (system vs. user-specific, login vs.
shell-specific, different shell configuration languages, etc.).

I know it's not what you had in mind, but have you tried doing something
like:

    import sys, os, string
    sys.path.extend(string.split(os.environ['PATH'], ';'))

--david

From gmcm at hypernet.com Wed Dec 1 21:19:13 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Wed, 1 Dec 1999 15:19:13 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To:
References: <38457B42.85552AC@interet.com>
Message-ID: <1268042932-41354568@hypernet.com>

David Ascher wrote:

> On Wed, 1 Dec 1999, James C. Ahlstrom wrote:
>
> > > This seems logical -- Python extensions must live in
> > > directories that Python searches (Python must do its own
> > > search because the search order is significant).
> >
> > The PYTHONPATH search path is what I am trying to get away
> > from. If I eliminate PYTHONPATH I still can not use the
> > Windows DLL search path (which is superior) because DLLs are
> > searched on PYTHONPATH too; thus my post. I don't believe it
> > is important for Python module.dll to be located on PYTHONPATH.
>
> Why is the DLL search path superior?
>
> In my experience, the DLL search path (PATH for short)

Make that:

    [ os.path.dirname(sys.executable),
      os.getcwd(),
      win32api.GetSystemDirectory(),
      os.path.join(win32api.GetSystemDirectory(), '../SYSTEM'),
      win32api.GetWindowsDirectory()
    ] + string.split(os.environ['PATH'], ';')

> is
> problematic because it requires either using the System control
> panel or modifying autoexec.bat, both of which can have massive
> systemic effects completely unrelated to Python if a mistake is
> made during the modification.

Hear, hear!

[snip]

- Gordon

From jim at interet.com Wed Dec 1 21:36:04 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Wed, 01 Dec 1999 15:36:04 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References:
Message-ID: <384586B4.48905B32@interet.com>

David Ascher wrote:

> Why is the DLL search path superior?
>
> In my experience, the DLL search path (PATH for short) is problematic
> because it requires either using the System control panel or modifying
> autoexec.bat, both of which can have massive systemic effects completely
> unrelated to Python if a mistake is made during the modification.

I agree that altering PATH is problematic. So is altering PYTHONPATH and
for exactly the same reason. That is why I think PYTHONPATH is a bad
idea.

The reason the DLL search path is superior is that it is not just PATH.
It defines a path which includes the install directory of the application
plus the system directories, and this path is discovered at runtime. So
it is not necessary to set a global PYTHONPATH, nor make registry
entries, nor do anything at all. It Just Works.

The Windows DLL search path is:

1) The directory of the executable program. That means you can just
   throw all your DLL's in with the *.exe's, and it all Just Works.

2) The current directory. Also useful.

3) The Windows system directory (call GetSystemDirectory() to get this).

4) The Windows directory (call GetWindowsDirectory() to get this).
   These two directories are used for system files. Think of /sbin,
   /bin. Windows apps usually throw some of their DLL's here, especially
   if they are of general interest.

5) The directories in PATH. This is relatively useless, and AFAIK it is
   seldom used in a real installation. It is a left-over from DOS. That
   is also why it appears last.

> On UNIX, the equivalent to Windows' PATH is typically LD_LIBRARY_PATH,
> although I think there are significant variations in how that works across
> platforms. Most beginning unix users have no idea how to modify their
> LD_LIBRARY_PATH, as they typically don't understand the configuration
> mechanisms on Unix (system vs. user-specific, login vs. shell-specific,
> different shell configuration languages, etc.).

I agree.

>
> I know it's not what you had in mind, but have you tried doing something
> like:
>
> import sys, os, string
> sys.path.extend(string.split(os.environ['PATH'], ';'))

Adding PATH (or anything else) to PYTHONPATH is making it worse. Have you
tried "import sys; print sys.path" on Windows? It is junk.

JimA

From jim at interet.com Wed Dec 1 21:44:00 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Wed, 01 Dec 1999 15:44:00 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <38457B42.85552AC@interet.com> <1268042932-41354568@hypernet.com>
Message-ID: <38458890.BCB36FE2@interet.com>

Gordon McMillan wrote:

> Make that:
> [ os.path.dirname(sys.executable),
> os.getcwd(),
> win32api.GetSystemDirectory(),
> os.path.join(win32api.GetSystemDirectory(), '../SYSTEM'),
> win32api.GetWindowsDirectory()
> ] + string.split(os.environ['PATH'], ';')

Very nice! "../SYSTEM" needed on NT I guess.

JimA

From fredrik at pythonware.com Wed Dec 1 21:56:16 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 1 Dec 1999 21:56:16 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <384586B4.48905B32@interet.com>
Message-ID: <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com>

James C.
Ahlstrom wrote: > Adding PATH (or anything else) to PYTHONPATH is making it worse. Have > you tried "import sys; print sys.path" on Windows? It is junk. not on my machine. it would help if you stopped assuming that every- one have the same problems as you have. we've distributed several python apps on windows, and frankly, I don't understand what you're talking about. From jim at interet.com Wed Dec 1 22:26:37 1999 From: jim at interet.com (James C. Ahlstrom) Date: Wed, 01 Dec 1999 16:26:37 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> Message-ID: <3845928D.C0462322@interet.com> Fredrik Lundh wrote: > > you tried "import sys; print sys.path" on Windows? It is junk. > > not on my machine. On my Windows machine I get: ['', '.', 'N:/prd/winlease/vest', '.\\DLLs', '.\\lib', '.\\lib\\plat-win', '.\\lib\\lib-tk', 'f:\\bin'] PYTHONPATH is N:/prd/winlease/vest. os.path.dirname(sys.executable) is F:/bin. The others are junk. What do you get? Did you change sys.path from the default? > it would help if you stopped assuming that every- > one have the same problems as you have. we've > distributed several python apps on windows, and > frankly, I don't understand what you're talking > about. We distribute our app by freezing all *.py files into a DLL, and we don't set PYTHONPATH on the target machine. The files are located with the executable file and are found there. This works fine and we don't have a problem with it. It would help me a lot if you could describe how you distribute your app. Do you set PYTHONPATH on the target machine? JimA From da at ski.org Wed Dec 1 22:41:31 1999 From: da at ski.org (David Ascher) Date: Wed, 1 Dec 1999 13:41:31 -0800 (Pacific Standard Time) Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: <384586B4.48905B32@interet.com> Message-ID: On Wed, 1 Dec 1999, James C. 
Ahlstrom wrote:

> > In my experience, the DLL search path (PATH for short) is problematic
> > because it requires either using the System control panel or modifying
> > autoexec.bat, both of which can have massive systemic effects completely
> > unrelated to Python if a mistake is made during the modification.
>
> I agree that altering PATH is problematic. So is altering PYTHONPATH
> and for exactly the same reason. That is why I think PYTHONPATH is
> a bad idea.

I see. Thanks for the explanation. I didn't know the complete story of
the "Windows DLL search path". BTW, I think a huge difference b/w
PYTHONPATH and PATH is the system-wide nature of PATH, vs. the
Python-restricted nature of PYTHONPATH.

--david

From mhammond at skippinet.com.au Wed Dec 1 23:29:38 1999
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Thu, 2 Dec 1999 09:29:38 +1100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To:
Message-ID: <009c01bf3c4b$8f119090$0501a8c0@bobcat>

> I see. Thanks for the explanation. I didn't know the
> complete story of
> the "Windows DLL search path". BTW, I think a huge difference b/w
> PYTHONPATH and PATH is the system-wide nature of PATH, vs. the
> Python-restricted nature of PYTHONPATH.

And more to the point - and the critical distinction - is that PYTHONPATH
is actually specific to the Python _app_, not just Python on the machine.
Sure - the standard Python installation puts a "default" PYTHONPATH
suitable for general purpose development - but any distributed
application _can_ define their own PYTHONPATH that is independent of any
other Python systems or applications. People have been doing this for
years, including MS :-)

Sorry Jim, but count this as another vote against it - which isn't to
argue that the current system is perfect, simply (IMO) better than the
Windows path and DLL search order.

Mark.
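
For reference, the Windows DLL search order listed earlier in this thread
can be written down as a small function. This is only a sketch in
present-day Python: dll_search_dirs and its parameters are hypothetical
names, the directories are passed in rather than queried from the OS, and
the order is the one described in the thread, not checked against any
particular Windows release.

```python
import ntpath  # Windows path rules, usable on any platform


def dll_search_dirs(exe_path, cwd, system_dir, windows_dir, path_env, sep=";"):
    """Return the directories in the order given in the thread.

    1) directory of the executable, 2) current directory,
    3) system directory, 4) Windows directory, 5) entries of PATH.
    """
    dirs = [
        ntpath.dirname(exe_path),
        cwd,
        system_dir,
        windows_dir,
    ]
    # PATH comes last; empty entries (from ";;") are skipped.
    dirs.extend(entry for entry in path_env.split(sep) if entry)
    return dirs


if __name__ == "__main__":
    # Illustrative values only.
    for d in dll_search_dirs(r"F:\bin\python.exe", r"C:\work",
                             r"C:\WINNT\system32", r"C:\WINNT",
                             r"C:\tools;D:\misc"):
        print(d)
```

Nothing here is specific to Python modules, which is the distinction
being drawn above: this list is the same for every program on the
machine, whereas sys.path can differ per application.
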
From guido at CNRI.Reston.VA.US Thu Dec 2 00:00:21 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 01 Dec 1999 18:00:21 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: Your message of "Wed, 01 Dec 1999 16:26:37 EST." <3845928D.C0462322@interet.com> References: <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> Message-ID: <199912012300.SAA10861@eric.cnri.reston.va.us> > Fredrik Lundh wrote: > > > > you tried "import sys; print sys.path" on Windows? It is junk. > > > > not on my machine. > > On my Windows machine I get: > > ['', '.', 'N:/prd/winlease/vest', '.\\DLLs', '.\\lib', > '.\\lib\\plat-win', '.\\lib\\lib-tk', 'f:\\bin'] > > PYTHONPATH is N:/prd/winlease/vest. > os.path.dirname(sys.executable) is F:/bin. > The others are junk. What do you get? Did > you change sys.path from the default? You must not have used the standard Python installer; if you had used it you wouldn't have had this problem (and perhaps we wouldn't have had this discussion). The problem is that you apparently have installed python.exe in f:\bin. "Modern" Python versions execute some code at startup that comes up with a suitable value for sys.path; the Windows version of this code is in PC/getpathp.c -- I recommend that you study it. This code tries to find the Python install directory by looking for a "landmark" file relative to the executable path, and then adds a bunch of directory entries to the path relative to the install directory. If it fails, it defaults to "." for the install directory. The entries '.\\DLLs', '.\\lib', '.\\lib\\plat-win', '.\\lib\\lib-tk' are all a result of this failing. As long as this works, there is no need for the user (or anyone) to ever set the PYTHONPATH variable -- that variable is only needed to add directories in front of sys.path for stuff that getpathp.c doesn't know about (e.g. PIL, Numeric, etc.). 
With packagized versions of those modules, even that won't be necessary, because the packages will be dropped in the Python install directory (typically C:\Program Files\Python). I believe that most of your desire to get rid of PYTHONPATH comes from your insistence to bypass the default installer. There's probably a way to install your app in such a way that the getpathp.c algorithm actually succeeds? There's also a separate env variable, PYTHONHOME, which overrides the Python install directory; if getpathp.c sees that it is set, it will bypass the search relative to the executable's path. I take blame for not documenting all this well enough. However I wish you stopped criticizing the design -- I think the design is quite solid. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at CNRI.Reston.VA.US Thu Dec 2 00:09:43 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 01 Dec 1999 18:09:43 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: Your message of "Wed, 01 Dec 1999 14:47:14 EST." <38457B42.85552AC@interet.com> References: <383BF9AD.E183FB98@interet.com> <199912011855.NAA10662@eric.cnri.reston.va.us> <38457B42.85552AC@interet.com> Message-ID: <199912012309.SAA10873@eric.cnri.reston.va.us> > > This seems logical -- Python extensions must live in directories that > > Python searches (Python must do its own search because the search > > order is significant). > > The PYTHONPATH search path is what I am trying to get away > from. If I eliminate PYTHONPATH I still can not use the > Windows DLL search path (which is superior) because DLLs > are searched on PYTHONPATH too; thus my post. I don't believe > it is important for Python module.dll to be located on PYTHONPATH. But I do. First of all, I'm not sure whether you're talking here about sys.path or PYTHONPATH. As I explained in a previous post, you should normally not have to set PYTHONPATH at all. Let's assume you really meant sys.path. 
Let's assume sys.path is [A, B]. Let's assume there's a foo.py and a foo.dll. If foo.py lives in A and foo.dll lives in B, then import foo should load foo.py. If it's the other way around, it should load foo.dll. If we were to use the default DLL search path, there's no way that we can get this behavior: either you have to look for a DLL first, which means there's no way for foo.py to override foo.dll, or you have to look for a DLL last, and then there's no way for a foo.dll to override foo.py. It is desirable that both overrides are possible: we want to be able to have foo.dll override foo.py, because perhaps foo.py should only be used when for some reason foo.dll can't be loaded (say foo.py does the same thing only slower); but we also want to be able to have foo.py override foo.dll (by simply placing it in a directory that's earlier on the path) e.g. in a situation where the dll version does something undesirable and we want to create a safe substitute. (Deleting files is not always an option.) > The problem is maintaining PYTHONPATH plus having DLL's on a > non-standard search path. I've commented already that PYTHONPATH maintenance is probably a red herring due to your non-standard install. I'm not sure what the problem is with having a DLL on a non-std path? > Yes, PythonDev[:] and professional > SysAdmins can do it. But it is not as simple as it could be. > Someone has to write the install scripts. The distutil-sig (a.k.a. Greg Ward :-) is taking care of this as we speak. > And what if something > doesn't work? Think of Python being used as a teaching language > for the 8th grade. Think of the 8th grade teacher trying to get > all this right. The only thing that works is simplicity. We will provide an installer that Just Works [tm]. > > But at what point should this LoadLibrary() call be called? The > > import statement contains no clue that a DLL is requested -- the > > sys.path search reveals that. > > Just after built-in and frozen modules. 
See my long comment above. > > I claim that there is nothing with the current strategy. > > Thank you for thoughtfully considering and commenting at length > on this issue. Lets ignore it for the moment. The other > problems with PYTHONPATH are more pressing. But if those > issues are solved, this one will stick out. And those other issues should be resolved in a different way than what you have been proposing. See other post. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at CNRI.Reston.VA.US Thu Dec 2 00:11:28 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 01 Dec 1999 18:11:28 -0500 Subject: [Python-Dev] Re: [C++-SIG] Python calling C++ issues In-Reply-To: Your message of "Wed, 01 Dec 1999 11:34:56 PST." References: Message-ID: <199912012311.SAA10888@eric.cnri.reston.va.us> > > The first and most basic issue, is compiling Python so it initializes > > C++ global objects correctly. There is a patch on the sig's www site > > to help with that. > > Any opinions from this esteemed body re: integrating said patch in the > main tree? I presume you meant me :-) I'll give it a try tonight. --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy at cnri.reston.va.us Thu Dec 2 00:24:06 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Wed, 1 Dec 1999 18:24:06 -0500 (EST) Subject: [Python-Dev] copies of python-dev messages between 11/24 and 12/01 Message-ID: <14405.44566.832799.96438@goon.cnri.reston.va.us> It looks like there has been some mail glitch that result in no digests being sent between 11/26 and 12/01 and no messages being archived between 11/24 and 12/01. Does anyone keep a personal archive that has those messages? I'd like to read them. 
Jeremy

From guido at CNRI.Reston.VA.US Thu Dec 2 00:28:14 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 01 Dec 1999 18:28:14 -0500
Subject: [Python-Dev] copies of python-dev messages between 11/24 and 12/01
In-Reply-To: Your message of "Wed, 01 Dec 1999 18:24:06 EST." <14405.44566.832799.96438@goon.cnri.reston.va.us>
References: <14405.44566.832799.96438@goon.cnri.reston.va.us>
Message-ID: <199912012328.SAA12879@eric.cnri.reston.va.us>

> It looks like there has been some mail glitch that result in no
> digests being sent between 11/26 and 12/01 and no messages being
> archived between 11/24 and 12/01. Does anyone keep a personal archive
> that has those messages? I'd like to read them.

I do :-) I'll provide Jeremy with an archive.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From bwarsaw at cnri.reston.va.us Thu Dec 2 05:24:03 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Wed, 1 Dec 1999 23:24:03 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS log messages with diffs
References: <199911161700.MAA02716@eric.cnri.reston.va.us> <14389.31511.706588.20840@anthem.cnri.reston.va.us>
Message-ID: <14405.62563.345566.500106@anthem.cnri.reston.va.us>

Okay folks, I think I've got the diff thing working now. The trick
(for you CVS heads) was that you can't do a `cvs diff' while you're
executing a loginfo script. Lock contention (repeat after me: "I Love
CVS!"). Anyway, let's see how you all like it.

Note that based on a suggestion by Greg Stein, seconded by GvR, I do
not send out the entire diff of every file (which could potentially be
huge). I send out 20 lines from the head of the diff and 20 lines
from the tail, and suppress everything in between. Those numbers can
be easily tweaked, and I'm not sure what the ideal is. Let's see what
the emails look like when stuff starts getting checked in.
Enjoy, -Barry From jack at oratrix.nl Thu Dec 2 12:00:45 1999 From: jack at oratrix.nl (Jack Jansen) Date: Thu, 02 Dec 1999 12:00:45 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: Message by Guido van Rossum , Wed, 01 Dec 1999 18:09:43 -0500 , <199912012309.SAA10873@eric.cnri.reston.va.us> Message-ID: <19991202110045.96F33370CF2@snelboot.oratrix.nl> On the Mac I've introduced "magic cookies" into sys.path, which allow you to do interesting searches (like searching for a DLL or PYC-resource in the application itself) at known places in the import process. There isn't a cookie for "search along the standard MacOS dll search path" (which is somewhat similar to the Windows dll search path) because I haven't seen a reason for it, but there's nothing to stop it. And if you'd insert that cookie it would be perfectly clear (at least, it should be) that only dll modules will be found in that step, not .py modules. Actually I'm so happy with the magic cookie scheme that I've advocated at various times in the past that something similar also be used for determining where builtin modules and frozen modules appear in sys.path... -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From guido at CNRI.Reston.VA.US Thu Dec 2 12:59:34 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 02 Dec 1999 06:59:34 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: Your message of "Thu, 02 Dec 1999 12:00:45 +0100." 
<19991202110045.96F33370CF2@snelboot.oratrix.nl> References: <19991202110045.96F33370CF2@snelboot.oratrix.nl> Message-ID: <199912021159.GAA13732@eric.cnri.reston.va.us> > On the Mac I've introduced "magic cookies" into sys.path, which > allow you to do interesting searches (like searching for a DLL or > PYC-resource in the application itself) at known places in the > import process. > There isn't a cookie for "search along the standard MacOS dll search > path" (which is somewhat similar to the Windows dll search path) > because I haven't seen a reason for it, but there's nothing to stop > it. And if you'd insert that cookie it would be perfectly clear (at > least, it should be) that only dll modules will be found in that > step, not .py modules. > Actually I'm so happy with the magic cookie scheme that I've > advocated at various times in the past that something similar also > be used for determining where builtin modules and frozen modules > appear in sys.path... I see the magic cookies as a poor man's (but more compatible!) version of a chain of importers as advocated by Greg Stein and other imputil fans. I like the idea, except that I think that the chain should be manipulatable more easily than the current imputil implementation. (I'll have more comments on Greg's comments later, when I've actually read them through.) --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein at lyra.org Thu Dec 2 13:09:40 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 2 Dec 1999 04:09:40 -0800 (PST) Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: <199912021159.GAA13732@eric.cnri.reston.va.us> Message-ID: On Thu, 2 Dec 1999, Guido van Rossum wrote: >... > I see the magic cookies as a poor man's (but more compatible!) version > of a chain of importers as advocated by Greg Stein and other imputil > fans. I like the idea, except that I think that the chain should be > manipulatable more easily than the current imputil implementation. 
> (I'll have more comments on Greg's comments later, when I've actually
> read them through.)

Anything in sys.path that is not a string pointing to a directory is
not very compatible. My current proposal keeps the existing semantics
for sys.path (the proposal adds functionality thru other mechanisms,
rather than changing/interfering with existing ones).

I look forward to your comments! I'll definitely provide new solutions
where you find problems :-)

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From fredrik at pythonware.com Thu Dec 2 13:53:03 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 2 Dec 1999 13:53:03 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <19991202110045.96F33370CF2@snelboot.oratrix.nl> <199912021159.GAA13732@eric.cnri.reston.va.us>
Message-ID: <005e01bf3cc4$2e338c50$f29b12c2@secret.pythonware.com>

Guido van Rossum wrote:
> > Actually I'm so happy with the magic cookie scheme that I've
> > advocated at various times in the past that something similar also
> > be used for determining where builtin modules and frozen modules
> > appear in sys.path...
>
> I see the magic cookies as a poor man's (but more compatible!) version
> of a chain of importers as advocated by Greg Stein and other imputil
> fans. I like the idea, except that I think that the chain should be
> manipulatable more easily than the current imputil implementation.

I know this has been asked before, but cannot recall
any of the arguments against it: how about replacing
Jack's magic cookies with importer objects?

(in other words, if a path item is a string, import as
usual. otherwise, ask the importer for a code object
or maybe better, a module object).
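A rough sketch of the suggestion above, for illustration only: a search
path whose items are either directory strings or importer objects. The
class DictImporter, its load() method, and the helper functions are all
hypothetical names invented for this sketch; they are not taken from
imputil or from any Python release, and loading from a directory is
reduced to "look for name.py".

```python
import os
import tempfile
import types


class DictImporter:
    """Hypothetical importer object: serves module source from a dict."""

    def __init__(self, sources):
        self.sources = sources

    def load(self, name):
        # Return a module object, or None if this importer does not know it.
        source = self.sources.get(name)
        if source is None:
            return None
        module = types.ModuleType(name)
        exec(source, module.__dict__)
        return module


def load_from_directory(name, directory):
    """Stand-in for 'import as usual': look for name.py in one directory."""
    filename = os.path.join(directory, name + ".py")
    if not os.path.isfile(filename):
        return None
    module = types.ModuleType(name)
    module.__file__ = filename
    with open(filename) as f:
        exec(compile(f.read(), filename, "exec"), module.__dict__)
    return module


def find_module(name, path):
    """Walk path in order: strings are directories, other items are importers."""
    for item in path:
        if isinstance(item, str):
            module = load_from_directory(name, item)
        else:
            module = item.load(name)
        if module is not None:
            return module
    raise ImportError("no module named " + name)


def demo():
    """Show that the order of items on the path decides which 'foo' wins."""
    directory = tempfile.mkdtemp()
    with open(os.path.join(directory, "foo.py"), "w") as f:
        f.write("origin = 'directory'\n")
    importer = DictImporter({"foo": "origin = 'importer'\n"})
    first = find_module("foo", [importer, directory]).origin
    second = find_module("foo", [directory, importer]).origin
    return first, second
```

Because the importer sits in the same ordered list as the directories,
either one can override the other simply by coming first, which is the
property of sys.path that the earlier foo.py/foo.dll argument in this
thread relies on.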
From jack at oratrix.nl Thu Dec 2 14:23:31 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Thu, 02 Dec 1999 14:23:31 +0100
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: Message by "Fredrik Lundh" , Thu, 2 Dec 1999 13:53:03 +0100 , <005e01bf3cc4$2e338c50$f29b12c2@secret.pythonware.com>
Message-ID: <19991202132331.E3F8D370CF2@snelboot.oratrix.nl>

> > I see the magic cookies as a poor man's (but more compatible!) version
> > of a chain of importers as advocated by Greg Stein and other imputil
> > fans. [...]
>
> I know this has been asked before, but cannot recall
> any of the arguments against it: how about replacing
> Jack's magic cookies with importer objects?

For the record: I definitely agree with both comments here. The only
thing that would need solving (but maybe it already is? Greg?) is the
external representation of an importer, as I'd definitely want to be
able to name them in PYTHONPATH (or the mac equivalent).

--
Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm

From jim at interet.com Thu Dec 2 15:19:31 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 02 Dec 1999 09:19:31 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <009c01bf3c4b$8f119090$0501a8c0@bobcat>
Message-ID: <38467FF3.D938EE4@interet.com>

Mark Hammond wrote:
> Sure - the standard Python installation puts a "default" PYTHONPATH
> suitable for general purpose development - but any distributed
> application _can_ define their own PYTHONPATH that is independent of
> any other Python systems or applications. People have been doing this
> for years, including MS :-)

How is this done?

> Sorry Jim, but count this as another vote against it - which isn't to
> argue that the current system is perfect, simply (IMO) better than the
> Windows path and DLL search order.

Sigh.....
JimA From jim at interet.com Thu Dec 2 16:49:10 1999 From: jim at interet.com (James C. Ahlstrom) Date: Thu, 02 Dec 1999 10:49:10 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us> Message-ID: <384694F6.E5D74221@interet.com> Guido van Rossum wrote: > You must not have used the standard Python installer; if you had used > it you wouldn't have had this problem (and perhaps we wouldn't have > had this discussion). Correct, I did not use the standard Python installer. I compiled Python from the source distribution. There are good reasons for this in my case. First, my real issue is how to DISTRIBUTE Python programs, not to get Python working on my own machine. We have 12 machines on a network. It is not acceptable to run a Python installation script on every one of them just to run a simple Python program. OK, I guess I could do 12, but what about a larger company? And we ship to hundreds of customers. I can distribute simple C or C++ programs without a hassle, why not Python? It is not acceptable to ask our customers to run a separate Python installer. We have our own Wise installer to install our software. Every commercial vendor has Wise, Install Shield or other installer in place. No commercial vendor is going to abandon Wise et al. and move to The Official Python Installer because it will not have the features of Wise (such as binary patches across the network), and because what it does won't be documented, and because it is Just Different. Second, I can not run ANY installer on my development machine, Python or otherwise. This is a general Windows problem not specific to Python. Right now our help system is broken on every office machine except the one where the help system installer was run (where we develop help). If I run a Python installer, it may Just Work here. 
So testing is fine, but when I distribute the program to customers where the install program has not been run it fails. The installer made registry entries, installed files, etc. And what did it do?? No one knows. And how do I install at a customer site if I don't have documentation on what the Help installer or Python installer did?? No one knows. Who fixes it if something goes wrong?? Hours on the phone to Help System customer support. Does it work on Windows 2000?? No one knows. > f:\bin. "Modern" Python versions execute some code at startup that > comes up with a suitable value for sys.path; the Windows version of > this code is in PC/getpathp.c -- I recommend that you study it. This > [ Highly useful discussion of startup...] Thank you, I will study this. > know about (e.g. PIL, Numeric, etc.). With packagized versions of > those modules, even that won't be necessary, because the packages will > be dropped in the Python install directory (typically C:\Program > Files\Python). Yes, this is essential. Packages must be easily installed. I was hoping for single file package archive files. > I believe that most of your desire to get rid of PYTHONPATH comes from > your insistence to bypass the default installer. Correct, I refuse to execute the default installer. And I am a patient person who loves Python, so I will read getpathp.c to see what is happening. But other commercial developers, students, teachers, SysAdmins etc. are not so patient. In the interest of promoting Python, there should be documentation on the official way to easily install Python programs. > There's probably a > way to install your app in such a way that the getpathp.c algorithm > actually succeeds? There's also a separate env variable, PYTHONHOME, Perhaps, and if there is it should be prominently documented in the How to Distribute Your App section of the manual. I am worried about supporting versioning, but I will think about it. > I take blame for not documenting all this well enough. 
However I wish > you stopped criticizing the design -- I think the design is quite > solid. Thank you for the explanation. I will study the design again. I always wondered what PYTHONHOME did. JimA From guido at CNRI.Reston.VA.US Thu Dec 2 17:03:09 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 02 Dec 1999 11:03:09 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: Your message of "Thu, 02 Dec 1999 10:49:10 EST." <384694F6.E5D74221@interet.com> References: <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us> <384694F6.E5D74221@interet.com> Message-ID: <199912021603.LAA14455@eric.cnri.reston.va.us> > Perhaps, and if there is it should be prominently documented in the > How to Distribute Your App section of the manual. I > am worried about supporting versioning, but I will think about it. Join the distutil-SIG, they are discussing just this. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at lemburg.com Thu Dec 2 16:48:40 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 02 Dec 1999 16:48:40 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <19991202110045.96F33370CF2@snelboot.oratrix.nl> <199912021159.GAA13732@eric.cnri.reston.va.us> <005e01bf3cc4$2e338c50$f29b12c2@secret.pythonware.com> Message-ID: <384694D8.DCA3D75E@lemburg.com> Fredrik Lundh wrote: > > Guido van Rossum wrote: > > > Actually I'm so happy with the magic cookie scheme that I've > > > advocated at various times in the past that something similar also > > > be used for determining where builtin modules and frozen modules > > > appear in sys.path... > > > > I see the magic cookies as a poor man's (but more compatible!) version > > of a chain of importers as advocated by Greg Stein and other imputil > > fans. 
I like the idea, except that I think that the chain should be > > manipulatable more easily than the current imputil implementation. > > I know this has been asked before, but cannot recall > any of the arguments against it: how about replacing > Jack's magic cookies with importer objects? > > (in other words, if a path item is a string, import as > usual. otherwise, ask the importer for a code object > or maybe better, a module object). Plus, for backward compatibility, make sure that str(importerobj) returns something which resembles a non-existing directory. Note that the builtin importer skips non-string entries in sys.path, so the above will only be needed for existing import hooks. Still, I would like to rephrase my 0.02EUR which I already posted twice... why not start to think about what these importers would do first ? If there are only a handful of wishes we could just add them to the builtin machinery and be done with it... -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 29 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido at CNRI.Reston.VA.US Thu Dec 2 17:28:28 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 02 Dec 1999 11:28:28 -0500 Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: Your message of "Fri, 19 Nov 1999 22:43:32 EST." <1269053086-27079185@hypernet.com> References: <1269053086-27079185@hypernet.com> Message-ID: <199912021628.LAA14506@eric.cnri.reston.va.us> > No success whatsoever in either direction across Samba. In > fact the mtime of my Linux home directory as seen from NT is > Jan 1, 1980. That's only the case for an NT mount point (something of the form \\host\name; I notice that os.stat() only believes it exists if you append a backslash: \\host\name\). For interior directories, at least with the Samba version that I'm using, os.stat() seems to give correct results. 
I think that this whole issue (that doing a stat on a directory to find out whether files in it were modified doesn't give usable results) is widely blown out of proportion. The only useful bit of info is that mtimes may have an up to 2 second granularity, and that anything as recent as 2 seconds should be considered as newer than the cache even if the cache is also less than 2 seconds. --Guido van Rossum (home page: http://www.python.org/~guido/) From jim at interet.com Thu Dec 2 17:28:50 1999 From: jim at interet.com (James C. Ahlstrom) Date: Thu, 02 Dec 1999 11:28:50 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <383BF9AD.E183FB98@interet.com> <199912011855.NAA10662@eric.cnri.reston.va.us> <38457B42.85552AC@interet.com> <199912012309.SAA10873@eric.cnri.reston.va.us> Message-ID: <38469E42.AF0A0D55@interet.com> Guido van Rossum wrote: > Let's assume sys.path is [A, B]. Let's assume there's a foo.py and a > foo.dll. If foo.py lives in A and foo.dll lives in B, then import foo > ... Thank you for the detailed discussion showing that sys.path is needed so a choice can be made whether to load foo.dll or foo.py. As you correctly point out, a separate search path defeats this behavior. But I don't think the usefulness of the feature compensates for its resultant complexity. Specifically, it will be hard to create this behavior in archive files. As I envision archive files (which of course is subject to change) they contain *.pyc files and not DLL's. The DLL's must be in a ./DLL directory since the OS can not load them from strings. So if every *.pyc is in an archive file, your only choice is whether to load all DLL's first or last. That is, archive.pyl is either before or after ./DLL. If a package (probably with lots of subdirectories) author depends on having a search path within a package which discriminates between pyc and DLL files with equal names, then that search path plus the existence of the DLL's must be recorded in the archive. 
This is much more complicated than just an archive with all *.pyc files entered in a dotted name space: foo foo.sub1 foo.sub2 foo.sub2.pkx I would question whether equally named foo.dll and foo.py is worth it. The alternative (which is IMHO more common) is to code the choice in Python in the module that cares about it. > > And what if something > > doesn't work? Think of Python being used as a teaching language > > for the 8th grade. Think of the 8th grade teacher trying to get > > all this right. The only thing that works is simplicity. > > We will provide an installer that Just Works [tm]. OK for this case. Not enough for Python program distribution. JimA From jim at interet.com Thu Dec 2 17:30:49 1999 From: jim at interet.com (James C. Ahlstrom) Date: Thu, 02 Dec 1999 11:30:49 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us> <384694F6.E5D74221@interet.com> <199912021603.LAA14455@eric.cnri.reston.va.us> Message-ID: <38469EB9.5EDB9617@interet.com> Guido van Rossum wrote: > > > Perhaps, and if there is it should be prominently documented in the > > How to Distribute Your App section of the manual. I > > am worried about supporting versioning, but I will think about it. > > Join the distutil-SIG, they are discussing just this. I already belong to the distutil-SIG and have seen no such discussion. Jim From guido at CNRI.Reston.VA.US Thu Dec 2 18:17:52 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 02 Dec 1999 12:17:52 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: Your message of "Thu, 02 Dec 1999 11:30:49 EST." 
<38469EB9.5EDB9617@interet.com> References: <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us> <384694F6.E5D74221@interet.com> <199912021603.LAA14455@eric.cnri.reston.va.us> <38469EB9.5EDB9617@interet.com> Message-ID: <199912021717.MAA14682@eric.cnri.reston.va.us> [Jim] > > > Perhaps, and if there is it should be prominently documented in the > > > How to Distribute Your App section of the manual. I > > > am worried about supporting versioning, but I will think about it. [me] > > Join the distutil-SIG, they are discussing just this. [Jim again] > I already belong to the distutil-SIG and have seen no such > discussion. Sorry, you're right (except for a brief exchange between you and Paul Dubois :-). But I think they should, it falls under their charter. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake at acm.org Thu Dec 2 18:30:02 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu, 2 Dec 1999 12:30:02 -0500 (EST) Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: <199912021717.MAA14682@eric.cnri.reston.va.us> References: <384586B4.48905B32@interet.com> <011101bf3c3e$82feaa70$f29b12c2@secret.pythonware.com> <3845928D.C0462322@interet.com> <199912012300.SAA10861@eric.cnri.reston.va.us> <384694F6.E5D74221@interet.com> <199912021603.LAA14455@eric.cnri.reston.va.us> <38469EB9.5EDB9617@interet.com> <199912021717.MAA14682@eric.cnri.reston.va.us> Message-ID: <14406.44186.574647.651111@weyr.cnri.reston.va.us> Guido van Rossum writes: > Sorry, you're right (except for a brief exchange between you and Paul > Dubois :-). But I think they should, it falls under their charter. This was deliberatly postponed until after extension packages are supported and in place. I know Greg is interested in application installation as well as package installation. -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives

From gmcm at hypernet.com Thu Dec 2 18:53:03 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Thu, 2 Dec 1999 12:53:03 -0500
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912021628.LAA14506@eric.cnri.reston.va.us>
References: Your message of "Fri, 19 Nov 1999 22:43:32 EST." <1269053086-27079185@hypernet.com>
Message-ID: <1267965342-1446902@hypernet.com>

[Gordon]
> > No success whatsoever in either direction across Samba. In fact
> > the mtime of my Linux home directory as seen from NT is Jan 1,
> > 1980.

[Guido]
> That's only the case for an NT mount point (something of the form
> \\host\name; I notice that os.stat() only believes it exists if
> you append a backslash: \\host\name\). For interior directories,
> at least with the Samba version that I'm using, os.stat() seems
> to give correct results.

Correct (as I discovered not long after I posted). (I find that
from NT I have to stat some file _in_ the directory to get an
updated mtime from the stat _of_ the directory).

> I think that this whole issue (that doing a stat on a directory
> to find out whether files in it were modified doesn't give usable
> results) is widely blown out of proportion.

This has come up twice: re caching importers and dircache.py
(used only by dircmp). We've arrived at the fact that it _can_
be made to work on Windows boxes. NFS? Andrew (anyone still use
that)?

IOW, do we want to trust it? Do we want to document that it
might not be trustworthy in some situations? Make it
optional-for-wizards? Kill it? IOOW, what's the proper proportion ;-)?

> The only useful bit of info is that mtimes may have an up to 2
> second granularity, and that anything as recent as 2 seconds
> should be considered as newer than the cache even if the cache is
> also less than 2 seconds.
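One possible reading of the 2-second rule quoted above, as a minimal
sketch. The function name, the MTIME_GRANULARITY constant, and the
explicit "now" parameter are assumptions made for illustration; nothing
here is code from imputil or dircache.py.

```python
import time

# Coarsest mtime resolution assumed in the thread, in seconds.
MTIME_GRANULARITY = 2.0


def cache_is_usable(source_mtime, cache_mtime, now=None):
    """Decide whether a cached listing may be trusted for a source.

    source_mtime: modification time of the directory or file being cached
    cache_mtime:  time at which the cache entry was written
    now:          current time; defaults to time.time()
    """
    if now is None:
        now = time.time()
    # A source modified within the granularity window could change again
    # without its recorded mtime moving, so treat it as newer than any
    # cache, even a cache that is itself only moments old.
    if now - source_mtime < MTIME_GRANULARITY:
        return False
    # Otherwise the usual comparison applies.
    return cache_mtime >= source_mtime
```

Passing "now" explicitly keeps the check deterministic in tests; a
caller would normally omit it. The cost of the rule is only that very
recently modified directories are rescanned more often than strictly
necessary.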
From guido at CNRI.Reston.VA.US Thu Dec 2 21:43:46 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 02 Dec 1999 15:43:46 -0500 Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: Your message of "Fri, 19 Nov 1999 05:29:50 PST." References: Message-ID: <199912022043.PAA15108@eric.cnri.reston.va.us> Here's the promised response to Greg's response to my wishlist. > On Thu, 18 Nov 1999, Guido van Rossum wrote: > > Gordon McMillan wrote: > >... > > > I think imputil's emulation of the builtin importer is more of a > > > demonstration than a serious implementation. As for speed, it > > > depends on the test. > > > > Agreed. I like some of imputil's features, but I think the API > > need to be redesigned. > > It what ways? It sounds like you've applied some thought. Do you have any > concrete ideas yet, or "just a feeling" :-) I'm working through some > changes from JimA right now, and would welcome other suggestions. I think > there may be some outstanding stuff from MAL, but I'm not sure (Marc?) I actually think that the way the PVM (Python VM) calls the importer ought to be changed. Assigning to __builtin__.__import__ is a crock. The API for __import__ is a crock. > >... > > So here's a challenge: redesign the import API from scratch. > > I would suggest starting with imputil and altering as necessary. I'll use > that viewpoint below. > > > Let me start with some requirements. > > > > Compatibility issues: > > --------------------- > > > > - the core API may be incompatible, as long as compatibility layers > > can be provided in pure Python > > Which APIs are you referring to? The "imp" module? The C functions? The > __import__ and reload builtins? > I'm guessing some of imp, the two builtins, and only one or two C > functions. All of those. > > - support for rexec functionality > > No problem. I can think of a number of ways to do this. Agreed, I think that imputil can do this. > > - support for freeze functionality > > No problem. 
A function in "imp" must be exposed to Python to support this > within the imputil framework. Agreed. It currently exports init_frozen() which is about the right functionality. > > - load .py/.pyc/.pyo files and shared libraries from files > > No problem. Again, a function is needed for platform-specific loading of > shared libraries. Is it useful to expose the platform differences? The current imp.load_dynamic() should suffice. > > - support for packages > > No problem. Demo's in current imputil. > > > - sys.path and sys.modules should still exist; sys.path might > > have a slightly different meaning > > I would suggest that both retain their *exact* meaning. We introduce > sys.importers -- a list of importers to check, in sequence. The first > importer on that list uses sys.path to look for and load modules. The > second importer loads builtins and frozen code (i.e. modules not on > sys.path). This is looking like the redesign I was looking for. (Note that imputil's current chaining is not good since it's impossible to remove or reorder importers, which I think is a required feature; an explicit list would solve this.) Actually, the order is the other way around, but by now you should know that. It makes sense to have separate ones for builtin and frozen modules -- these have nothing in common. There's another issue, which isn't directly addressed by imputil, although with clever use of inheritance it might be doable. I'd like more support for this however. Quite orthogonally to the issue of having separate importers, I might want to recognize new extensions. Take the example of the ILU folks. They want to be able to drop a file "foo.isl" in any directory on sys.path and have the ILU stubber automatically run if you try to import foo (the client stubs) or foo__skel (the server skeleton). This doesn't fit in the sys.importers strategy, because they want to be able to drop their .isl files in any directory along sys.path. 
(Or, more likely, they want to have control over where in sys.path the
directory/directories with .isl files are placed.) This requires an
ugly modification to the _fs_import() function. (Which should have
been a method, by the way, to make overriding it in a subclass of
PathImporter easier!)

I've been thinking here along the lines of a strategy where the
standard importer (the one that walks sys.path) has a set of hooks
that define various things it could look for, e.g. .py files, .pyc
files, .so or .dll files. This list of hooks could be changed to
support looking for .isl files.

There's an old, subtle issue that could be solved through this as
well: whether a .pyc file without a .py file should be accepted.
Long ago (in Python 0.9.8) a .pyc file alone would never be loaded.
This was changed at the request of a small but vocal minority of
Python developers who wanted to distribute .pyc files without .py
files. It has occasionally caused frustration because sometimes
developers move .py files around but forget to remove the .pyc files,
and then the .pyc file is silently picked up if it occurs on sys.path
earlier than where the .py was moved to. Having a set of hooks for
various extensions would make it possible to have a default where lone
.pyc files are ignored, but where one can insert a .pyc importer in
the list of hooks that does the right thing here. (Of course, it may
be possible that this whole feature of lone .pyc files should be
replaced since the same need is easily taken care of by zip
importers.)

I also want to support (Jim A notwithstanding :-) a feature whereby
different things besides directories can live on sys.path, as long as
they are strings -- these could be added from the PYTHONPATH env
variable.
Every piece of code that I've ever seen that uses sys.path doesn't care if a directory named in sys.path doesn't exist -- it may try to stat various files in it, which also don't exist, and as far as it is concerned that is just an indication that the requested module doesn't live there. Again, we would have to dissect imputil to support various hooks that deal with different kind of entities in sys.path. The default hook list would consist of a single item that interprets the name as a directory name; other hooks could support zip files or URLs. Jack's "magic cookies" could also be supported nicely through such a mechanism. > Users can insert/append new importers or alter sys.path as before. > > sys.modules continues to record name:module mappings. Yes. Note that the interpretation of __file__ could be problematic. To what value do you set __file__ for a module loaded from a zip archive? > > - $PYTHONPATH and $PYTHONHOME should still be supported > > No problem. > > > (I wouldn't mind a splitting up of importdl.c into several > > platform-specific files, one of which is chosen by the configure > > script; but that's a bit of a separate issue.) > > Easy enough. The standard importer can select the appropriate > platform-specific module/function to perform the load. i.e. these can move > to Modules/ and be split into a module-per-platform. Again: what's the advantage of exposing the platform specificity? > > New features: > > ------------- > > > > - Integrated support for Greg Ward's distribution utilities (i.e. a > > module prepared by the distutil tools should install painlessly) > > I don't know the specific requirements/functionality that would be > required here (does Greg? :-), but I can't imagine any problem with this. Probably more support is required from the other end: once it's common for modules to be imported from zip files, the distutil code needs to support the creation and installation of such zip files. 
Also, there is a need for the install phase of distutil to communicate the location of the zip file to the Python installation. > > - Good support for prospective authors of "all-in-one" packaging tool > > authors like Gordon McMillan's win32 installer or /F's squish. (But > > I *don't* require backwards compatibility for existing tools.) > > Um. *No* problem. :-) :-) > > - Standard import from zip or jar files, in two ways: > > > > (1) an entry on sys.path can be a zip/jar file instead of a directory; > > its contents will be searched for modules or packages Note that this is what I mention above for distutil support. > While this could easily be done, I might argue against it. Old > apps/modules that process sys.path might get confused. Above I argued that this shouldn't be a problem. > If compatibility is not an issue, then "No problem." > > An alternative would be an Importer instance added to sys.importers that > is configured for a specific archive (in other words, don't add the zip > file to sys.path, add ZipImporter(file) to sys.importers). This would be harder for distutil: where does Python get the initial list of importers? > Another alternative is an Importer that looks at a "sys.py_archives" list. > Or an Importer that has a py_archives instance attribute. OK, but again distutil needs to be able to add to this list when it installs a package. (Note that package deinstallation should also be supported!) (Of course I don't require this to affect Python processes that are already running; but it should be possible to easily change the default search path for all newly started instances of a given Python installation.) > > (2) a file in a directory that's on sys.path can be a zip/jar file; > > its contents will be considered as a package (note that this is > > different from (1)!) > > No problem. This will slow things down, as a stat() for *.zip and/or *.jar > must be done, in addition to *.py, *.pyc, and *.pyo. 
Fine, this is where the caching comes in handy.

> > I don't particularly care about supporting all zip compression
> > schemes; if Java gets away with only supporting gzip compression
> > in jar files, so can we.
>
> I presume we would support whatever zlib gives us, and no more.

That's it. :-)

> > - Easy ways to subclass or augment the import mechanism along
> > different dimensions. For example, while none of the following
> > features should be part of the core implementation, it should be
> > easy to add any or all:
> >
> > - support for a new compression scheme to the zip importer
>
> Presuming ZipImporter is a class (derived from Importer), then this
> ability is wholly dependent upon the author of ZipImporter providing the
> hook.

Agreed. But since we're likely going to provide this as a standard feature, we must ensure that it provides this hook.

> The Importer class is already designed for subclassing (and its interface
> is very narrow, which means delegation is also *very* easy; see
> imputil.FuncImporter).

But maybe it's *too* narrow; some of the hooks I suggest above seem to require extra interfaces -- at least in some of the subclasses of the Importer base class.

Note: I looked at the doc string for get_code() and I don't understand what the difference is between the modname and fqname arguments. If I write "import foo.bar", what are modname and fqname? Why are both present? Also, while you claim that the API is narrow, the multiple return values (also the different types for the second item) make it complicated.

> > - support for a new archive format, e.g. tar
>
> A cakewalk. Gordon, JimA, and myself each have archive formats. :-)
>
> > - a hook to import from URLs or other data sources (e.g. a
> > "module server" imported in CORBA) (this needn't be supported
> > through $PYTHONPATH though)
>
> No problem at all.
>
> > - a hook that imports from compressed .py or .pyc/.pyo files
>
> No problem at all.
>
> > - a hook to auto-generate .py files from other filename
> > extensions (as currently implemented by ILU)
>
> No problem at all.

See above -- I think this should be more integrated with sys.path than you are thinking of. The more I think about it, the more I see that the problem is that for you, the importer that uses sys.path is a final subclass of Importer (i.e. it is itself not further subclassed). Several of the hooks I want seem to require additional hooks in the PathImporter rather than new importers.

> > - a cache for file locations in directories/archives, to improve
> > startup time
>
> No problem at all.
>
> > - a completely different source of imported modules, e.g. for an
> > embedded system or PalmOS (which has no traditional filesystem)
>
> No problem at all.
>
> In each of the above cases, the Importer.get_code() method just needs to
> grab the byte codes from the XYZ data source. That data source can be
> compressed, across a network, on-the-fly generated, or whatever. Each
> importer can certainly create a cache based on its concept of "location".
> In some cases, that would be a mapping from module name to filesystem
> path, or to a URL, or to a compiled-in, frozen module.

See above for sys.path integration remark.

> > - Note that different kinds of hooks should (ideally, and within
> > reason) properly combine, as follows: if I write a hook to recognize
> > .spam files and automatically translate them into .py files, and you
> > write a hook to support a new archive format, then if both hooks are
> > installed together, it should be possible to find a .spam file in an
> > archive and do the right thing, without any extra action. Right?
>
> Ack. Very, very difficult.

Actually, I take most of this back. Importers that deal with new extension types often have to go through a file system to transform their data to .py files, and this is just too complicated.
However it would be still nice if there was code sharing between the code that looks for .py and .pyc files in a zip archive and the code that does the same in a filesystem. Hm, maybe even that shouldn't be necessary, the zip file probably should contain only .pyc files... (Unrelated remark: I should really try to release the set of modules we've written here at CNRI to deal with zip files. Unfortunately zip files are hairy and so is our code.) > The imputil scheme combines the concept of locating/loading into one step. > There is only one "hook" in the imputil system. Its semantic is "map this > name to a code/module object and return it; if you don't have it, then > return None." That's fine. I actually don't recall where the find-then-load API came from, I think it may be an artefact of the original implementation strategy. It is currently used as follows: we try to see if there's a .pyc and then we try to see if there's a .py; if both exist we compare the timestamps etc. to choose which one. But that's still a red herring. > Your compositing example is based on the capabilities of the > find-then-load paradigm of the existing "ihooks.py". One module finds > something (foo.spam) and the other module loads it (by generating a .py). I still don't understand why ihooks.py had to be so complicated. I guess I just had much less of an understanding of the issues. (It was also partly a compromise with an alternative design by Ken Manheimer, who basically forced me to support packages, originally through ni.py.) > All is not lost, however. I can easily envision the get_code() hook as > allowing any kind of return type. If it isn't a code or module object, > then another hook is called to transform it. > [ actually, I'd design it similarly: a *series* of hooks would be called > until somebody transforms the foo.spam into a code/module object. ] OK. This could be a feature of a subclass of Importer. 
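
The "series of hooks" idea quoted just above can be illustrated with a minimal sketch. This is not imputil code: the names transform(), spam_to_source() and source_to_code(), and the dict-based intermediate format, are hypothetical and chosen only for illustration, and the sketch targets a current Python 3 interpreter rather than Python 1.5. It assumes each hook either returns a transformed object or None when it does not recognize its input.

```python
import types


def spam_to_source(name, obj):
    """Hypothetical hook: translate '.spam' text into Python source."""
    if isinstance(obj, dict) and obj.get("kind") == "spam":
        return {"kind": "py", "text": obj["text"].replace("SPAM", "42")}
    return None  # not ours; let another hook try


def source_to_code(name, obj):
    """Hypothetical hook: compile Python source into a code object."""
    if isinstance(obj, dict) and obj.get("kind") == "py":
        return compile(obj["text"], "<%s>" % name, "exec")
    return None


def transform(name, obj, hooks, max_steps=10):
    """Call hooks repeatedly until obj is a code or module object.

    Raises ImportError if no hook accepts the current object, or if
    the chain does not finish within max_steps transformations.
    """
    final_types = (types.CodeType, types.ModuleType)
    for _ in range(max_steps):
        if isinstance(obj, final_types):
            return obj
        for hook in hooks:
            result = hook(name, obj)
            if result is not None:
                obj = result
                break
        else:
            raise ImportError("no hook could transform %r" % name)
    if isinstance(obj, final_types):
        return obj
    raise ImportError("too many transformation steps for %r" % name)


# Usage: a '.spam' payload passes through both hooks in turn.
code = transform(
    "foo",
    {"kind": "spam", "text": "answer = SPAM\n"},
    [spam_to_source, source_to_code],
)
namespace = {}
exec(code, namespace)
print(namespace["answer"])  # prints 42
```

Because each hook only sees the object handed to it, a hook for a new extension and a hook for a new archive format would not need to know about each other, which is the limited form of compositing discussed here.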
> The compositing would be limited only by the (Python-based) Importer
> classes. For example, my ZipImporter might expect to zip up .pyc files
> *only*. Obviously, you would want to alter this to support zipping any
> file, then use the suffix to determine what to do at unzip time.
>
> > - It should be possible to write hooks in C/C++ as well as Python
>
> Use FuncImporter to delegate to an extension module.

Maybe not so great, since it sounds like the C code can't benefit from any of the infrastructure that imputil offers. I'm not sure about this one though.

> This is one of the benefits of imputil's single/narrow interface.

Plus its vague specs? :-)

> > - Applications embedding Python may supply their own implementations,
> > default search path, etc., but don't have to if they want to piggyback
> > on an existing Python installation (even though the latter is
> > fraught with risk, it's cheaper and easier to understand).
>
> An application would have full control over the contents of sys.importers.
>
> For a restricted execution app, it might install an Importer that loads
> files from *one* directory only which is configured from a specific
> Win32 Registry entry. That importer could also refuse to load shared
> modules. The BuiltinImporter would still be present (although the app
> would certainly omit all but the necessary builtins from the build).
> Frozen modules could be excluded.

Actually there's little reason to exclude frozen modules or any .py/.pyc modules -- by definition, bytecode can't be dangerous. It's the builtins and extensions that need to be censored. We currently do this by subclassing ihooks, where we mask the test for builtins with a comparison to a predefined list of names.

> > Implementation:
> > ---------------
> >
> > - There must clearly be some code in C that can import certain
> > essential modules (to solve the chicken-or-egg problem), but I don't
> > mind if the majority of the implementation is written in Python.
> > Using Python makes it easy to subclass.
>
> I posited once before that the cost of import is mostly I/O rather than
> CPU, so using Python should not be an issue. MAL demonstrated that a good
> design for the Importer classes is also required. Based on this, I'm a
> *strong* advocate of moving as much as possible into Python (to get
> Python's ease-of-coding with little relative cost).

Agreed. However, how do you explain the slowdown (from 9 to 13 seconds I recall)? Are you a lousy coder? :-)

> The (core) C code should be able to search a path for a module and import
> it. It does not require dynamic loading or packages. This will be used to
> import exceptions.py, then imputil.py, then site.py.

It does, however, need to import builtin modules. imputil currently imports imp, sys, strop and __builtin__, struct and marshal; note that struct can easily be a dynamic loadable module, and so could strop in theory. (Note that strop will be unnecessary in 1.6 if you use string methods.) I don't think that this chicken-or-egg problem is particularly problematic though.

> The platform-specific module that performs dynamic-loading must be a
> statically linked module (in Modules/ ... it doesn't have to be in the
> Python/ directory).

See earlier comments.

> site.py can complete the bootstrap by setting up sys.importers with the
> appropriate Importer instances (this is where an application can define
> its own policy). sys.path was initially set by the import.c bootstrap code
> (from the compiled-in path and environment variables).

I think that algorithm (currently in getpath.c / getpathp.c) might also be moved to Python code -- imported frozen. Sadly, rebuilding with a new version of a frozen module might be more complicated than rebuilding with a new version of a C module, but writing and maintaining this code in Python would be *sooooooo* much easier that I think it's worth it.

> Note that imputil.py would not install any hooks when it is loaded.
That > is up to site.py. This implies the core C code will import a total of > three modules using its builtin system. After that, the imputil mechanism > would be importing everything (site.py would .install() an Importer which > then takes over the __import__ hook). (Three not counting the builtin modules.) > Further note that the "import" Python statement could be simplified to use > only the hook. However, this would require the core importer to inject > some module names into the imputil module's namespace (since it couldn't > use an import statement until a hook was installed). While this > simplification is "neat", it complicates the run-time system (the import > statement is broken until a hook is installed). Same chicken-or-egg. We can be pragmatic. For a developer, I'd like a bit of robustness (all this makes it rather hard to debug a broken imputil, and that's a fair amount of code!). > Therefore, the core C code must also support importing builtins. "sys" and > "imp" are needed by imputil to bootstrap. > > The core importer should not need to deal with dynamic-load modules. Same question. Since that all has to be coded in C anyway, why not? > To support frozen apps, the core importer would need to support loading > the three modules as frozen modules. I'd like to see a description of how someone like Jim A would build a single-file application using the new mechanism. This could completely replace freeze. (Freeze currently requires a C compiler; that's bad.) > The builtin/frozen importing would be exposed thru "imp" for use by > imputil for future imports. imputil would load and use the (builtin) > platform-specific module to do dynamic-load imports. Sure. > > - In order to support importing from zip/jar files using compression, > > we'd at least need the zlib extension module and hence libz itself, > > which may not be available everywhere. > > Yes. I don't see this as a requirement, though. We wouldn't start to use > these by default, would we? 
Or insist on zlib being present? I see this as > more along the lines of "we have provided a standardized Importer to do > this, *provided* you have zlib support." Agreed. Zlib support is easy to get, but there are probably platforms where it's not. (E.g. maybe the Mac? I suppose that on the Mac, there would be some importer classes to import from a resource fork.) > > - I suppose that the bootstrap is solved using a mechanism very > > similar to what freeze currently used (other solutions seem to be > > platform dependent). > > The bootstrap that I outlined above could be done in C code. The import > code would be stripped down dramatically because you'll drop package > support and dynamic loading. Not the dynamic loading. But yes the package support. > Alternatively, you could probably do the path-scanning in Python and > freeze that into the interpreter. Personally, I don't like this idea as it > would not buy you much at all (it would still need to return to C for > accessing a number of scanning functions and module importing funcs). > > > - I also want to still support importing *everything* from the > > filesystem, if only for development. (It's hard enough to deal with > > the fact that exceptions.py is needed during Py_Initialize(); > > I want to be able to hack on the import code written in Python > > without having to rebuild the executable all the time. > > My outline above does not freeze anything. Everything resides in the > filesystem. The C code merely needs a path-scanning loop and functions to > import .py*, builtin, and frozen types of modules. Good. Though I think there's also a need for freezing everything. And when we go the route of the zip archive, the zip archive handling code needs to be somewhere -- frozen seems to be a reasonable choice. > If somebody nukes their imputil.py or site.py, then they return to Python > 1.4 behavior where the core interpreter uses a path for importing (i.e. no > packages). 
They lose dynamically-loaded module support. But if the path guessing is also done by site.py (as I propose) the path will probably be wrong. A warning should be printed. > > Let's first complete the requirements gathering. Are these > > requirements reasonable? Will they make an implementation too > > complex? Am I missing anything? > > I'm not a fan of the compositing due to it requiring a change to semantics > that I believe are very useful and very clean. However, I outlined a > possible, clean solution to do that (a secondary set of hooks for > transforming get_code() return values). As you may see from my responses, I'm a big fan of having several different sets of hooks. I do withdraw the composition requirement though. > The requirements are otherwise reasonable to me, as I see that they can > all be readily solved (i.e. they aren't burdensome). > > While this email may be long, I do not believe the resulting system would > be complex. From the user-visible side of things, nothing would be > changed. sys.path is still present and operates as before. They *do* have > new functionality they can grow into, though (sys.importers). The > underlying C code is simplified, and the platform-specific dynamic-load > stuff can be distributed to distinct modules, as needed > (e.g. BeOS/dynloadmodule.c and PC/dynloadmodule.c). > > > Finally, to what extent does this impact the desire for dealing > > differently with the Python bytecode compiler (e.g. supporting > > optimizers written in Python)? And does it affect the desire to > > implement the read-eval-print loop (the >>> prompt) in Python? > > If the three startup files require byte-compilation, then you could have > some issues (i.e. the byte-compiler must be present). Another chicken-or-egg. No biggie. > Once you hit site.py, you have a "full" environment and can easily detect > and import a read-eval-print loop module (i.e. why return to Python? just > start things up right there). You mean "why return to C?" 
I agree. It would be cool if somehow IDLE and Pythonwin would also be bootstrapped using the same mechanisms. (This would also solve the question "which interactive environment am I using?" that some modules and apps want to see answered because they need to do things differently when run under IDLE, for example.)

> site.py can also install new optimizers as desired, a new Python-based
> parser or compiler, or whatever... If Python is built without a parser or
> compiler (I hope that's an option!), then the three startup modules would
> simply be frozen into the executable.

More power to hooks!

--Guido van Rossum (home page: http://www.python.org/~guido/)

From fdrake at acm.org Thu Dec 2 22:22:33 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 2 Dec 1999 16:22:33 -0500 (EST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912022043.PAA15108@eric.cnri.reston.va.us>
References: <199912022043.PAA15108@eric.cnri.reston.va.us>
Message-ID: <14406.58137.359127.921135@weyr.cnri.reston.va.us>

Guido van Rossum writes:
> variable. Every piece of code that I've ever seen that uses sys.path
> doesn't care if a directory named in sys.path doesn't exist -- it may
> try to stat various files in it, which also don't exist, and as far as

Not the case -- I know you've looked at some of my code in the KOE that ensures only real directories are on the path, and each is only there once (pathhack.py). Given that sys.path is often too long and includes duplicate entries in a large system (often one entry with and one without a trailing / for a given directory), it is useful to be able to distinguish between things that should be interpretable as paths and things that aren't. It should not be hard to declare that "cookies" or whatever have some special form, like "".

> (Unrelated remark: I should really try to release the set of modules
> we've written here at CNRI to deal with zip files. Unfortunately zip
> files are hairy and so is our code.)
It doesn't help that that code just plain stinks. I maintain that no one here understands the whole of it. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From jcw at equi4.com Thu Dec 2 22:41:46 1999 From: jcw at equi4.com (Jean-Claude Wippler) Date: Thu, 02 Dec 1999 22:41:46 +0100 Subject: [Python-Dev] Import redesign [LONG] References: <199912022043.PAA15108@eric.cnri.reston.va.us> Message-ID: <3846E79A.446EAFD5@equi4.com> Guido van Rossum wrote: [...] > Note that the interpretation of __file__ could be problematic. To > what value do you set __file__ for a module loaded from a zip archive? Makefiles use "archive(entry)" (this also supports nesting if needed). [...] > I'd like to see a description of how someone like Jim A would build a > single-file application using the new mechanism. This could > completely replace freeze. (Freeze currently requires a C compiler; > that's bad.) [...] This may be off-topic, but has anyone considered what it would take to load shared libs out of an archive? One way is to extract on-the-fly to a temporary area. A refinement is to leave extracted files there as cache, and perhaps even to extract to a file with a name derived from its MD5 digest (this way multiple users and even Python installations can share the cache). Would it be useful to define a "standard" area? -- Jean-Claude From gmcm at hypernet.com Fri Dec 3 00:15:50 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Thu, 2 Dec 1999 18:15:50 -0500 Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: <199912022043.PAA15108@eric.cnri.reston.va.us> References: Your message of "Fri, 19 Nov 1999 05:29:50 PST." Message-ID: <1267945992-2611810@hypernet.com> [Guido] big snip > Note that the interpretation of __file__ could be problematic. > To what value do you set __file__ for a module loaded from a zip > archive? I just left it alone (ie, as it was when I picked up the .pyc). 
Turns out OK, because then when the end user files a bug report, the developer can track it down.

> Note: I looked at the doc string for get_code() and I don't
> understand what the difference is between the modname and fqname
> arguments. If I write "import foo.bar", what are modname and
> fqname?

As I recall:

import foo.bar
-> get_code(None, 'foo', 'foo') # returns foo
-> get_code(, 'bar', 'foo.bar')

> Why are both present?

I think so the importer can choose between being tree structured or flat.

> I'd like to see a description of how someone like Jim A would
> build a single-file application using the new mechanism. This
> could completely replace freeze. (Freeze currently requires a C
> compiler; that's bad.)

I have something working for Linux now. I froze exceptions.py. I hacked getpath.c so prefix = exec_prefix = executable's directory and the starting path is [prefix]. Although I did it differently, you could regard imputil.py and archive.py as frozen, too. (On Windows it's somewhat different, because the result uses the stock python15.dll.)

This somewhat oversimplifies; and I haven't really thought out all the ways people might try to use sym links. I'm inclined to think the starting path should contain both the executable's real directory and the sym link's directory.

> .... I do withdraw the composition
> requirement though.

Hooray!

- Gordon

From gstein at lyra.org Fri Dec 3 01:19:14 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 2 Dec 1999 16:19:14 -0800 (PST)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <384694D8.DCA3D75E@lemburg.com>
Message-ID:

On Thu, 2 Dec 1999, M.-A. Lemburg wrote:
>...
> Still, I would like to rephrase my 0.02EUR which I already
> posted twice... why not start to think about what these
> importers would do first ? If there are only a handful of
> wishes we could just add them to the builtin machinery and
> be done with it...
I'd rather see the builtin machinery move to Python, regardless of what system is used and/or what features are added. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein at lyra.org Fri Dec 3 04:19:40 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 2 Dec 1999 19:19:40 -0800 (PST) Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: <199912022043.PAA15108@eric.cnri.reston.va.us> Message-ID: On Thu, 2 Dec 1999, Guido van Rossum wrote: >... > Sometime, Greg Stein wrote: >... > > On Thu, 18 Nov 1999, Guido van Rossum wrote: >... > > > Agreed. I like some of imputil's features, but I think the API > > > need to be redesigned. > > > > It what ways? It sounds like you've applied some thought. Do you have any > > concrete ideas yet, or "just a feeling" :-) I'm working through some > > changes from JimA right now, and would welcome other suggestions. I think > > there may be some outstanding stuff from MAL, but I'm not sure (Marc?) > > I actually think that the way the PVM (Python VM) calls the importer > ought to be changed. Assigning to __builtin__.__import__ is a crock. > The API for __import__ is a crock. Something like sys.set_import_hook() ? The other alternative that I see would be to have the C code scan sys.importers, assuming each are callable objects, and call them with the appropriate params (e.g. module name). Of course, to move this scanning into Python would require something like sys.set_import_hook() unless Python looks for a hard-coded module and entrypoint. >... > > Which APIs are you referring to? The "imp" module? The C functions? The > > __import__ and reload builtins? > > > I'm guessing some of imp, the two builtins, and only one or two C > > functions. > > All of those. We can provide Python code to provide compatibility for "imp" and the two hooks. Nothing we can do to the C code, though. I'm not sure what the import API looks like from C, and whether they could all stay. A brief glance looks like most could stay. 
[ removing any would change Python's API version, which might be "okay" ] >... > > > - load .py/.pyc/.pyo files and shared libraries from files > > > > No problem. Again, a function is needed for platform-specific loading of > > shared libraries. > > Is it useful to expose the platform differences? The current > imp.load_dynamic() should suffice. This comes up several times throughout this message, and in some off-list mail Guido and I have exchanged. Namely, "should dynamic loading be part of the core, or performed via a module?" I would rather see it become a module, rather than inside the core (despite the fact that the module would have to be compiled into the interpreter). I believe this provides more flexibility for people looking to replace/augment/update/fix dynamic loading on various architectures. Rather than changing the core, a person can just drop in another module. The isolation between the core and modules is nicer, aesthetically, to me. The modules would also be exposing Just Another Importer Function, rather than a specialized API in the builtin imp module. Also note that it is easier to keep a module *out* of a Python-based application, than it is to yank functions out of the core of Python. Frozen apps, embedded apps, etc could easily leave out dynamic loading. Are there strict advantages? Not any that I can think of right now (beyond a bit of ease-of-use mentioned above). It just feels better to me. >... > > > - sys.path and sys.modules should still exist; sys.path might > > > have a slightly different meaning > > > > I would suggest that both retain their *exact* meaning. We introduce > > sys.importers -- a list of importers to check, in sequence. The first > > importer on that list uses sys.path to look for and load modules. The > > second importer loads builtins and frozen code (i.e. modules not on > > sys.path). > > This is looking like the redesign I was looking for. 
(Note that > imputil's current chaining is not good since it's impossible to remove > or reorder importers, which I think is a required feature; an explicit > list would solve this.) The chaining is an aspect of the current, singular import hook that Python uses. In the past, I've suggested the installation of a "manager" that maintains a list. sys.importers is similar in practice. Note that this Manager would be present with the sys.set_import_hook() scheme, while the Manager is implied if the core scans sys.importers. > Actually, the order is the other way around, but by now you should > know that. It makes sense to have separate ones for builtin and > frozen modules -- these have nothing in common. Yes, JimA pointed this out. The latest imputil has corrected this. I combined the builtin and frozen Importers because they were just so similar. I didn't want to iterate over two Importers when a single one sufficed quite well. *shrug* Could go either way, really. > There's another issue, which isn't directly addressed by imputil, > although with clever use of inheritance it might be doable. I'd like > more support for this however. Quite orthogonally to the issue of > having separate importers, I might want to recognize new extensions. Correct: while imputil doesn't address this, the standard/default Importer classes *definitely* can. >... > the directory/directories with .isl files are placed.) This requires > an ugly modification to the _fs_import() function. (Which should have > been a method, by the way, to make overriding it in a subclass of > PathImporter easier!) I yanked that code out of the DirectoryImporter so that the PathImporter could use it. I could see a reorg that creates a FileSystemImporter that defines the method, and the other two just subclass from that. > I've been thinking here along the lines of a strategy where the > standard importer (the one that walks sys.path) has a set of hooks > that define various things it could look for, e.g. 
.py files, .pyc
> files, .so or .dll files. This list of hooks could be changed to
> support looking for .isl files.

Agreed. It should be easy to have a mapping of extension to handler. One issue: should there be an ordering to the extensions? Exercise for the reader to alter the data structures...

> There's an old, subtle issue that could be solved through this as
> well: whether or not a .pyc file without a .py file should be accepted
> or not. Long ago (in Python 0.9.8) a .pyc file alone would never be
> loaded. This was changed at the request of a small but vocal minority
> of Python developers who wanted to distribute .pyc files without .py
> files. It has occasionally caused frustration because sometimes
> developers move .py files around but forget to remove the .pyc files,
> and then the .pyc file is silently picked up if it occurs on sys.path
> earlier than where the .py was moved to.

I think, "too bad for them." :-)

Having just a .pyc is a very nice feature. But how can you tell whether it was meant to be a plain .pyc or a mis-ordered one? To truly resolve that, you would need to scan the whole path, looking for a .py. However, maybe somebody put the .pyc there on purpose, to override the .py!

--- begin slightly-off-topic ---

Here is a neat little Bash script that allows you to use a .pyc as a CGI (to avoid parse overhead). Normally, you can't just drop a .pyc into the cgi-bin directory because the OS doesn't know how to execute it. Not a problem, I say... just append your .pyc to the following Bash script and execute! :-)

#!/bin/bash
exec - 3< $0 ; exec python -c 'import os,marshal ; f = os.fdopen(3, "rb") ; f.readline() ; f.readline() ; f.seek(8, 1) ; _c = marshal.load(f) ; del os, marshal, f ; exec _c' $@

(the script should be two lines; and no... you can't use readlines(2))

The above script will preserve stdin, stdout, and stderr. If the caller also uses 3< ...
well, that got overridden :-) The script doesn't work on Windows for two reasons, though: 1) Bash, 2) the "rb" mode followed by readline() Detailed info at the bottom of http://www.lyra.org/greg/python/ --- end of off-topic --- > Having a set of hooks for various extensions would make it possible to > have a default where lone .pyc files are ignored, but where one can > insert a .pyc importer in the list of hooks that does the right thing > here. (Of course, it may be possible that this whole feature of lone > .pyc files should be replaced since the same need is easily taken care > of by zip importers. Maybe. I'd still like to see plain .pyc files, but I know I can work around any change you might make here :-) (i.e. whatever you'd like to do... go for it) > I also want to support (Jim A notwithstanding :-) a feature whereby > different things besides directories can live on sys.path, as long as > they are strings -- these could be added from the PYTHONPATH env > variable. Every piece of code that I've ever seen that uses sys.path > doesn't care if a directory named in sys.path doesn't exist -- it may > try to stat various files in it, which also don't exist, and as far as > it is concerned that is just an indication that the requested module > doesn't live there. I'm not in favor of this, but it is more-than-doable. Again: your discretion... > Again, we would have to dissect imputil to support various hooks that > deal with different kind of entities in sys.path. The default hook > list would consist of a single item that interprets the name as a > directory name; other hooks could support zip files or URLs. Jack's > "magic cookies" could also be supported nicely through such a > mechanism. Specifically, the PathImporter would get "dissected" :-). No problem. > > Users can insert/append new importers or alter sys.path as before. > > > > sys.modules continues to record name:module mappings. > > Yes. > > Note that the interpretation of __file__ could be problematic. 
To > what value do you set __file__ for a module loaded from a zip archive? You don't (certainly in a way that is nice/compatible for modules that refer to it). This is why I don't like __file__ and __path__. They just don't make sense in archives or frozen code. Python code that relies on them will create problems when that code is placed into different packaging mechanisms. >... > > > (I wouldn't mind a splitting up of importdl.c into several > > > platform-specific files, one of which is chosen by the configure > > > script; but that's a bit of a separate issue.) > > > > Easy enough. The standard importer can select the appropriate > > platform-specific module/function to perform the load. i.e. these can move > > to Modules/ and be split into a module-per-platform. > > Again: what's the advantage of exposing the platform specificity? See above. >... > Probably more support is required from the other end: once it's common > for modules to be imported from zip files, the distutil code needs to > support the creation and installation of such zip files. Also, there > is a need for the install phase of distutil to communicate the > location of the zip file to the Python installation. I'm quite confident that something can be designed that would satisfy the needs here. Something akin to .pth files that a zip importer could read. >... > > > - Standard import from zip or jar files, in two ways: > > > > > > (1) an entry on sys.path can be a zip/jar file instead of a directory; > > > its contents will be searched for modules or packages > > Note that this is what I mention above for distutil support. > > > While this could easily be done, I might argue against it. Old > > apps/modules that process sys.path might get confused. > > Above I argued that this shouldn't be a problem. For most code, no, but as Fred mentioned (and I surmise), there are things out there assuming that sys.path contains strings which specify directories. 
Sure, we can do this (your discretion), but my feeling is to avoid it. > > If compatibility is not an issue, then "No problem." > > > > An alternative would be an Importer instance added to sys.importers that > > is configured for a specific archive (in other words, don't add the zip > > file to sys.path, add ZipImporter(file) to sys.importers). > > This would be harder for distutil: where does Python get the initial > list of importers? Default is just the two: BuiltinImporter and PathImporter. Adding ZipImporters (or anything else) at startup is TBD, but shouldn't pose a problem. >... > > > (2) a file in a directory that's on sys.path can be a zip/jar file; > > > its contents will be considered as a package (note that this is > > > different from (1)!) > > > > No problem. This will slow things down, as a stat() for *.zip and/or *.jar > > must be done, in addition to *.py, *.pyc, and *.pyo. > > Fine, this is where the caching comes in handy. IFF caching is enabled for the particular platform and installation. >... > > The Importer class is already designed for subclassing (and its interface > > is very narrow, which means delegation is also *very* easy; see > > imputil.FuncImporter). > > But maybe it's *too* narrow; some of the hooks I suggest above seem to > require extra interfaces -- at least in some of the subclasses of the > Importer base class. Correct -- the *subclasses*. I still maintain the imputil design of a single hook (get_code) is Right. I'll make a swipe at PathImporter in the next few weeks to add the capability for new extensions. > Note: I looked at the doc string for get_code() and I don't understand > what the difference is between the modname and fqname arguments. If I > write "import foo.bar", what are modname and fqname? Why are both > present? Also, while you claim that the API is narrow, the multiple > return values (also the different types for the second item) make it > complicated. Gordon detailed this in another note... 
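A minimal sketch of the modname/fqname distinction asked about above, assuming the reading of the get_code() doc string in which modname is the single undotted name looked up inside the parent package and fqname is the full dotted name from the root (so the two are equal when there is no parent). The helper name import_steps is hypothetical and is not part of imputil; it is written in present-day Python for readability.

```python
def import_steps(dotted_name):
    """List the (modname, fqname) pair for each step of a dotted import.

    Hypothetical helper, not an imputil API. It only illustrates which
    two names an importer would be handed at each level of the import.
    """
    parts = dotted_name.split(".")
    steps = []
    for index, modname in enumerate(parts):
        # fqname is everything from the root down to this component.
        fqname = ".".join(parts[:index + 1])
        steps.append((modname, fqname))
    return steps


if __name__ == "__main__":
    for modname, fqname in import_steps("foo.bar"):
        print("modname=%r fqname=%r" % (modname, fqname))
```

Under that assumption, "import foo.bar" would first ask for modname "foo" with fqname "foo", then for modname "bar" with fqname "foo.bar" in the context of the foo package.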
Yes, the multiple return values make it a bit more complicated, but I can't think of any reasonable alternatives. A bit more doc should do the trick, I'd guess. >... > > > - a hook to auto-generate .py files from other filename > > > extensions (as currently implemented by ILU) > > > > No problem at all. > > See above -- I think this should be more integrated with sys.path than > you are thinking of. The more I think about it, the more I see that > the problem is that for you, the importer that uses sys.path is a > final subclass of Importer (i.e. it is itself not further subclassed). > Several of the hooks I want seem to require additional hooks in the > PathImporter rather than new importers. Correct -- I've currently designed/implemented PathImporter as "final". I don't foresee a problem turning it into something that can be hooked at run-time, or subclassed at code-time. A detailing of the features needed would be handy: * allow alternative file suffixes, with functions or subclasses to map the file into a code/module object. >... > > > - Note that different kinds of hooks should (ideally, and within > > > reason) properly combine, as follows: if I write a hook to recognize > > > .spam files and automatically translate them into .py files, and you > > > write a hook to support a new archive format, then if both hooks are > > > installed together, it should be possible to find a .spam file in an > > > archive and do the right thing, without any extra action. Right? > > > > Ack. Very, very difficult. > > Actually, I take most of this back. Importers that deal with new > extension types often have to go through a file system to transform > their data to .py files, and this is just too complicated. However, it > would still be nice if there was code sharing between the code that > looks for .py and .pyc files in a zip archive and the code that does > the same in a filesystem.
Hm, maybe even that shouldn't be necessary, > the zip file probably should contain only .pyc files... Gordon replies to this... All of the archives that myself, Gordon, and JimA have been using only store .pyc files. I don't see much code sharing between the filesystem and archive import code. >... > > All is not lost, however. I can easily envision the get_code() hook as > > allowing any kind of return type. If it isn't a code or module object, > > then another hook is called to transform it. > > [ actually, I'd design it similarly: a *series* of hooks would be called > > until somebody transforms the foo.spam into a code/module object. ] > > OK. This could be a feature of a subclass of Importer. That would be my preference, rather than loading more into the Importer base class itself. >... > > > - It should be possible to write hooks in C/C++ as well as Python > > > > Use FuncImporter to delegate to an extension module. > > Maybe not so great, since it sounds like the C code can't benefit from > any of the infrastructure that imputil offers. I'm not sure about > this one though. There isn't any infrastructure that needs to be accessed. get_code() is the call-point, and there is no mechanism provided to the callee to call back into the imputil system. > > This is one of the benefits of imputil's single/narrow interface. > > Plus its vague specs? :-) Ouch. I thought I was actually doing quite a bit better than normal with that long doc-string on get_code :-( >... > > For a restricted execution app, it might install an Importer that loads > > files from *one* directory only which is configured from a specific > > Win32 Registry entry. That importer could also refuse to load shared > > modules. The BuiltinImporter would still be present (although the app > > would certainly omit all but the necessary builtins from the build). > > Frozen modules could be excluded. 
> > Actually there's little reason to exclude frozen modules or any > .py/.pyc modules -- by definition, bytecode can't be dangerous. It's > the builtins and extensions that need to be censored. > > We currently do this by subclassing ihooks, where we mask the test for > builtins with a comparison to a predefined list of names. True. My concern is an invader misusing one "type" of module for another. For example, let's say you've provided a selection of modules each exporting function FOO, and the user can configure which module to use. Can they do damage if some unrelated, frozen module also exports FOO? Minor issue, anyhow. All the functionality is there. >... > > I posited once before that the cost of import is mostly I/O rather than > > CPU, so using Python should not be an issue. MAL demonstrated that a good > > design for the Importer classes is also required. Based on this, I'm a > > *strong* advocate of moving as much as possible into Python (to get > > Python's ease-of-coding with little relative cost). > > Agreed. However, how do you explain the slowdown (from 9 to 13 > seconds I recall) though? Are you a lousy coder? :-) Heh :-) I have not spent *any* time working on optimization. Currently, each Importer in the chain redoes some work of the prior Importer. A bit of restructuring would split the common work out to a Manager, which then calls a method in the Importer (and passes all the computed work). Of course, a bit of profiling wouldn't hurt either. Some of the "imp" interfaces could possibly be refined to better support the BuiltinImporter or the dynamic load features. The question is still valid, though -- at the moment, I can't explain it because I haven't looked into it. > > The (core) C code should be able to search a path for a module and import > > it. It does not require dynamic loading or packages. This will be used to > > import exceptions.py, then imputil.py, then site.py. 
Note: after writing this, I realized there is really no need for the core to do the imputil import. site.py can easily do that. > It does, however, need to import builtin modules. imputil currently Correct. > imports imp, sys, strop and __builtin__, struct and marshal; note that > struct can easily be a dynamic loadable module, and so could strop in > theory. (Note that strop will be unnecessary in 1.6 if you use string > methods.) I knew about strop, but imputil would be harder to use today if it relied on the string methods. So... I've delayed that change. The struct module is used in a couple teeny cases, dealing with constructing a network-order, 4-byte, binary integer value. It would be easy enough to just do that with a bit of Python code instead. > I don't think that this chicken-or-egg problem is particularly > problematic though. Right. In my ideal world, the core couldn't do a dynamic load, so that would need to be considered within the bootstrap process. >... > > site.py can complete the bootstrap by setting up sys.importers with the > > appropriate Importer instances (this is where an application can define > > its own policy). sys.path was initially set by the import.c bootstrap code > > (from the compiled-in path and environment variables). > > I think that algorithm (currently in getpath.c / getpathp.c) might > also be moved to Python code -- imported frozen. Sadly, rebuilding > with a new version of a frozen module might be more complicated than > rebuilding with a new version of a C module, but writing and > maintaining this code in Python would be *sooooooo* much easier that I > think it's worth it. I think we can find a better way to freeze modules and to use them. Especially for the cases where we have specific "core" functions implemented in Python. (e.g. freezing parsers, compilers, and/or the read-eval loop) I don't foresee an issue with the build process becoming more complicated.
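The struct remark earlier in this reply (building a network-order, 4-byte binary integer "with a bit of Python code instead") can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the code is present-day Python rather than the 1.5-era Python under discussion, and it is not taken from imputil.

```python
def pack_uint32_network(value):
    """Encode an unsigned 32-bit integer as 4 bytes, big-endian.

    Hypothetical stand-in for a struct-based pack call; network
    order means the most significant byte comes first.
    """
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value does not fit in an unsigned 32-bit integer")
    return bytes([
        (value >> 24) & 0xFF,
        (value >> 16) & 0xFF,
        (value >> 8) & 0xFF,
        value & 0xFF,
    ])


def unpack_uint32_network(data):
    """Decode 4 big-endian bytes back into an unsigned integer."""
    if len(data) != 4:
        raise ValueError("expected exactly 4 bytes")
    b0, b1, b2, b3 = data  # iterating over bytes yields integers
    return (b0 << 24) | (b1 << 16) | (b2 << 8) | b3


if __name__ == "__main__":
    packed = pack_uint32_network(0x01020304)
    print(packed, unpack_uint32_network(packed))
```

Dropping the struct dependency this way would leave one fewer module to load during bootstrap, which is the point being made about the chicken-or-egg problem.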
If we nuke "makesetup" in favor of a Python script, then we could create a stub Python executable which runs the build script which writes the Setup file and the getpath*.c file(s). > > Note that imputil.py would not install any hooks when it is loaded. That > > is up to site.py. This implies the core C code will import a total of > > three modules using its builtin system. After that, the imputil mechanism > > would be importing everything (site.py would .install() an Importer which > > then takes over the __import__ hook). > > (Three not counting the builtin modules.) Correct, although I'll modify my statement to "two plus the builtins". > > Further note that the "import" Python statement could be simplified to use > > only the hook. However, this would require the core importer to inject > > some module names into the imputil module's namespace (since it couldn't > > use an import statement until a hook was installed). While this > > simplification is "neat", it complicates the run-time system (the import > > statement is broken until a hook is installed). > > Same chicken-or-egg. We can be pragmatic. > > For a developer, I'd like a bit of robustness (all this makes it > rather hard to debug a broken imputil, and that's a fair amount of > code!). True. I threw that out as an alternative, and then presented the counter argument :-) >... > > Therefore, the core C code must also support importing builtins. "sys" and > > "imp" are needed by imputil to bootstrap. > > > > The core importer should not need to deal with dynamic-load modules. > > Same question. Since that all has to be coded in C anyway, why not? It simplifies the core's import code to not deal with that stuff at all. > > To support frozen apps, the core importer would need to support loading > > the three modules as frozen modules. > > I'd like to see a description of how someone like Jim A would build a > single-file application using the new mechanism. This could > completely replace freeze. 
(Freeze currently requires a C compiler; > that's bad.) The portable mechanism for freezing will always need a compiler. Platform specific mechanisms (e.g. append to the .EXE, or use the linker to create a new ELF section) can optimize the freeze process in different ways. I don't have a design in my head for the freeze issues -- I've been considering that the mechanism would remain about the same. However, I can easily see that different platforms may want to use different freeze processes... hmm... >... > > Yes. I don't see this as a requirement, though. We wouldn't start to use > > these by default, would we? Or insist on zlib being present? I see this as > > more along the lines of "we have provided a standardized Importer to do > > this, *provided* you have zlib support." > > Agreed. Zlib support is easy to get, but there are probably platforms > where it's not. (E.g. maybe the Mac? I suppose that on the Mac, > there would be some importer classes to import from a resource fork.) Exactly. And importer classes to load from a Win32 resources (modifying a .EXE's resources post-link is cleaner than the append solution) >... > > My outline above does not freeze anything. Everything resides in the > > filesystem. The C code merely needs a path-scanning loop and functions to > > import .py*, builtin, and frozen types of modules. > > Good. Though I think there's also a need for freezing everything. > And when we go the route of the zip archive, the zip archive handling > code needs to be somewhere -- frozen seems to be a reasonable choice. Sure. > > If somebody nukes their imputil.py or site.py, then they return to Python > > 1.4 behavior where the core interpreter uses a path for importing (i.e. no > > packages). They lose dynamically-loaded module support. > > But if the path guessing is also done by site.py (as I propose) the > path will probably be wrong. A warning should be printed. All right. Doesn't Python already print a warning if it can't find site.py? 
> > > Let's first complete the requirements gathering. Are these > > > requirements reasonable? Will they make an implementation too > > > complex? Am I missing anything? > > > > I'm not a fan of the compositing due to it requiring a change to semantics > > that I believe are very useful and very clean. However, I outlined a > > possible, clean solution to do that (a secondary set of hooks for > > transforming get_code() return values). > > As you may see from my responses, I'm a big fan of having several > different sets of hooks. Yes. However, I've only recognized one so far. Propose more... I'm confident we can update the PathImporter design to accommodate them (and retain the underlying imputil paradigm). > I do withdraw the composition requirement > though. :-) >... > > Once you hit site.py, you have a "full" environment and can easily detect > > and import a read-eval-print loop module (i.e. why return to Python? just > > start things up right there). > > You mean "why return to C?" I agree. It would be cool if somehow Heh. Yah, that's what I meant :-) > IDLE and Pythonwin would also be bootstrapped using the same > mechanisms. (This would also solve the question "which interactive > environment am I using?" that some modules and apps want to see > answered because they need to do things differently when run under > IDLE, for example.) Haven't thought on this. Should be doable, I'd think. > > site.py can also install new optimizers as desired, a new Python-based > > parser or compiler, or whatever... If Python is built without a parser or > > compiler (I hope that's an option!), then the three startup modules would > > simply be frozen into the executable. > > More power to hooks! :-) You betcha! I believe my next order of business: * update PathImporter with the file-extension hook * dynload C code reorg, per the other email * create new-model site.py and trash import.c * review freeze mechanisms and process * design mechanism for frozen core functionality (eg.
getpath*.c) (coding and building design) * shift core functions to Python, using above design I'll just plow ahead, but also recognize that any/all may change. ie. I'll build examples/finals/prototypes and Guido can pick/choose/reimplement/etc as needed. I'm out next week, but should start on the above items by the end of the month (will probably do another mod_dav release in there somewhere). Cheers, -g -- Greg Stein, http://www.lyra.org/ From fredrik at pythonware.com Fri Dec 3 11:10:10 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 3 Dec 1999 11:10:10 +0100 Subject: [Python-Dev] Import redesign [LONG] References: <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com> Message-ID: <023601bf3d78$0ec3dc30$f29b12c2@secret.pythonware.com> Jean-Claude Wippler wrote: > This may be off-topic, but has anyone considered what it would take to > load shared libs out of an archive? well, we do that in a number of applications. (lazy installers are really cool... if you've installed works, you've seen some weird stuff -- for example, when the application starts the first time, it's loading everything from inside the installer. the rest of the installation is done from within the application itself, using archives in the installation executable) I think things like this are better left for the application designers, though... From mal at lemburg.com Fri Dec 3 11:03:31 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 03 Dec 1999 11:03:31 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: Message-ID: <38479573.B2CFDD2B@lemburg.com> Greg Stein wrote: > > On Thu, 2 Dec 1999, M.-A. Lemburg wrote: > >... > > Still, I would like to rephrase my 0.02EUR which I already > > posted twice... why not start to think about what these > > importers would do first ? If there are only a handful of > > wishes we could just add them to the builtin machinery and > > be done with it... 
> > I'd rather see the builtin machinery move to Python, regardless of what > system is used and/or what features are added. In the long run that's probably the right direction, but right now we are only talking about a very small set of additional features, which can easily be added to the existing code without too much fuss. Plus it won't slow things down, which is important since Python startup time is already an issue all by itself. The imputil.py approach of doing (a whole bunch of) recursive Python function calls to all kinds of importers will not speed this up, I'm afraid. An on-disk lookup table would speed this up, but it would also break the current logic in imputil.py, which puts importer independence above all. -- IMHO, we should retreat to a more centralized interface, one which more resembles a manager rather than the agent interface implemented in imputil.py. Add-ons can then register themselves to say "hey, I can handle pyz-archives" or "I know how to import .so modules" or "I provide a search function which you can call to have me scan my module container (directory, web-site, archive)". The manager would take care of what to call and in which order, plus delegate requests to add-ons which implement the needed logic, e.g. add-ons for signature checking, unzipping archives, file system lookup tables, etc. It could also trace its actions and then keep an on-disk knowledge base for what it did in the past to find certain modules under certain conditions. Anyway, all this is extra magic for some future version of Python.
-- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 28 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido at CNRI.Reston.VA.US Fri Dec 3 14:45:07 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 03 Dec 1999 08:45:07 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: Your message of "Fri, 03 Dec 1999 11:03:31 +0100." <38479573.B2CFDD2B@lemburg.com> References: <38479573.B2CFDD2B@lemburg.com> Message-ID: <199912031345.IAA16376@eric.cnri.reston.va.us> [Greg] > > I'd rather see the builtin machinery move to Python, regardless of what > > system is used and/or what features are added. [Marc] > In the long run that's probably the right direction, but right now > we are only talking about a very small set of additional features, > which can easily be added to the existing code without too much > fuss. I disagree. We should do the redesign right rather than tweaking the existing code. > Plus it won't slow things down, which is important since > Python startup time is already an issue all by itself. The > imputil.py approach of doing (a whole bunch of) recursive Python > function calls to all kinds of importers will not speed this up, > I'm afraid. An on-disk lookup table would speed this up, but > it would also break the current logic in imputil.py, which > puts importer independence above all. I don't care about the current logic in imputil. It's only a prototype! > IMHO, we should retreat to a more centralized interface, > one which more resembles a manager rather than the agent > interface implemented in imputil.py. Add-ons can then > register themselves to say "hey, I can handle pyz-archives" > or "I know how to import .so modules" or "I provide a > search function which you can call to have me scan > my module container (directory, web-site, archive)". This makes sense.
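A minimal sketch of the "manager" idea quoted above, in which add-ons register themselves and the manager decides what to call and in which order. Everything here is an assumption made for illustration: the ImportManager class, its register/unregister/find methods, and the priority argument are hypothetical names, not an API from imputil or from Python itself, and the handlers are plain dictionaries standing in for real lookups.

```python
class ImportManager:
    """Hypothetical central registry of import handlers."""

    def __init__(self):
        self._handlers = []   # entries: (priority, sequence, name, finder)
        self._sequence = 0    # keeps registration order stable on ties

    def register(self, name, finder, priority=0):
        """Add a handler; lower priority numbers are consulted first."""
        self._handlers.append((priority, self._sequence, name, finder))
        self._sequence += 1
        # Sort on the numeric fields only, so finders are never compared.
        self._handlers.sort(key=lambda entry: (entry[0], entry[1]))

    def unregister(self, name):
        """Remove a handler, something a fixed chain cannot do easily."""
        self._handlers = [e for e in self._handlers if e[2] != name]

    def find(self, fqname):
        """Ask each handler in order; return (handler name, result)."""
        for _priority, _sequence, name, finder in self._handlers:
            result = finder(fqname)
            if result is not None:
                return name, result
        return None


if __name__ == "__main__":
    # Stand-ins for real storage back ends.
    archive = {"spam": "code-for-spam-from-archive"}
    filesystem = {"spam": "code-for-spam-from-disk", "eggs": "code-for-eggs"}

    manager = ImportManager()
    manager.register("filesystem", filesystem.get, priority=10)
    manager.register("pyz-archive", archive.get, priority=5)

    print(manager.find("spam"))   # archive wins because of its priority
    print(manager.find("eggs"))   # falls through to the filesystem
    print(manager.find("ham"))    # nobody can handle it
```

Keeping the ordering in one place is what makes removal and reordering of handlers straightforward, which was one of the objections raised against the chained design earlier in the thread.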
> The manager would take care of what to call and in which > order, plus delegate requests to add-ons which implement > the needed logic, e.g. add-ons for signature checking, unzipping > archives, file system lookup tables, etc. > > It could also trace its actions and then keep an on-disk > knowledge base for what it did in the past to find certain > modules under certain conditions. > > Anyway, all this is extra magic for some future version of > Python. I would say the manager API design and a basic set of specific handlers should go into 1.6. --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Fri Dec 3 15:14:00 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 3 Dec 1999 15:14:00 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us> Message-ID: <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> MAL wrote: > > IMHO, we should retreat to a more centralized interface, > > one which more resembles a manager rather than the agent > > interface implemented in imputil.py. Add-ons can then > > register themselves to say "hey, I can handle pyz-archives" > > or "I know how to import .so modules" or "I provide a > > search function which you can call to have me scan > > my module container (directory, web-site, archive)". but why? in my small-minded view of how python works, an importer carries out a very simple task: given a name, check if you have a module with that name, and install it. if you cannot, fail (in which case python asks the next importer along the path). why do you have to complicate things beyond that? why not just let Python provide a few base classes and mixins for people who want to create custom importers, and be done with it? rationale, please. From jim at interet.com Fri Dec 3 15:34:40 1999 From: jim at interet.com (James C. 
Ahlstrom) Date: Fri, 03 Dec 1999 09:34:40 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> Message-ID: <3847D500.53833D06@interet.com> "M.-A. Lemburg" wrote: > > Greg Stein wrote: > > I'd rather see the builtin machinery move to Python, regardless of what > > system is used and/or what features are added. > > In the long run that's probably the right direction, but right now > we are only talking a very small set of additional features, > which can easily be added to the existing code without too much > fuzz. I volunteer to write a Python archive in either Python or C. In fact I currently have prototypes for both. But I have to agree with Greg here. I think a Python importer is the way to go. The C code is 300 lines mostly in import.c and parallel to existing code. The Python archive is about 100 lines and is prettier, easy to read, alter and re-use (obviously). > Plus it won't slow things down, which is important since > Python startup time is already an issue all by itself. The I think archive files should be able to be fast, and should help, not hurt, startup time. Provided that the use of sys.path is curtailed, os.readdir() is not needed, and the specifications are not complicated. Although archive files are my special concern, I realize that imputil is not just about archives. JimA From guido at CNRI.Reston.VA.US Fri Dec 3 15:39:25 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 03 Dec 1999 09:39:25 -0500 Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: Your message of "Thu, 02 Dec 1999 19:19:40 PST." References: Message-ID: <199912031439.JAA16524@eric.cnri.reston.va.us> Greg, Great response. I think we know where we each stand. Please go ahead with a new design. (That's trust, not carte blanche.) 
Just one thought: the more I think about it, the less I like sys.importers: functionality which is implemented through sys.importers must necessarily be placed either in front of all of sys.path or after it. While this is helpful for "canned" apps that want *everything* to be imported from a fixed archive, I think that for regular Python installations sys.path should remain the point of attack. In particular, installing a new package (e.g. PIL) should affect sys.path, regardless of the way of delivery of the modules (shared libs, .py files, .pyc files, or a zip archive). I'm not too worried about code that inspects sys.path and expects certain invariants; that code is most likely interfering with the import mechanism so should be revisited anyway. On the lone .pyc issue: I'd like to see this disappear when using the filesystem, I see no use for it there if we support .pyc files in zip archives. --Guido van Rossum (home page: http://www.python.org/~guido/) From jim at interet.com Fri Dec 3 15:44:54 1999 From: jim at interet.com (James C. Ahlstrom) Date: Fri, 03 Dec 1999 09:44:54 -0500 Subject: [Python-Dev] Import redesign [LONG] References: <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com> Message-ID: <3847D766.1E5FFAF3@interet.com> Jean-Claude Wippler wrote: > > Guido van Rossum wrote: > > [...] > > Note that the interpretation of __file__ could be problematic. To > > what value do you set __file__ for a module loaded from a zip archive? > > Makefiles use "archive(entry)" (this also supports nesting if needed). I discovered the hard way this entry is not optional. I just used the archive file name for __file__. > This may be off-topic, but has anyone considered what it would take to > load shared libs out of an archive? One way is to extract on-the-fly to > a temporary area. 
A refinement is to leave extracted files there as > cache, and perhaps even to extract to a file with a name derived from > its MD5 digest (this way multiple users and even Python installations > can share the cache). Would it be useful to define a "standard" area? IMHO putting shared libs in an archive is a bad idea because the OS cannot use them there. They must be extracted as you say. But then storage is wasted by using space in the archive and the external file. Deleting them after use wastes time. Better to leave them out of the archive and provide for them in the installer. IMHO the archive is a basic simple feature, and people make installers on top of that. Archives shouldn't try to do it all. JimA From mal at lemburg.com Fri Dec 3 15:14:09 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 03 Dec 1999 15:14:09 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us> Message-ID: <3847D030.2C936E24@lemburg.com> Guido van Rossum wrote: > > [Greg] > > > I'd rather see the builtin machinery move to Python, regardless of what > > > system is used and/or what features are added. > > [Marc] > > In the long run that's probably the right direction, but right now > > we are only talking about a very small set of additional features, > > which can easily be added to the existing code without too much > > fuss. > > I disagree. We should do the redesign right rather than tweaking the > existing code. Ok, then... > > IMHO, we should retreat to a more centralized interface, > > one which more resembles a manager rather than the agent > > interface implemented in imputil.py. Add-ons can then > > register themselves to say "hey, I can handle pyz-archives" > > or "I know how to import .so modules" or "I provide a > > search function which you can call to have me scan > > my module container (directory, web-site, archive)". > > This makes sense.
> > > The manager would take care of what to call and in which > > order, plus delegate requests to add-ons which implement > > the needed logic, e.g. add-ons for signature checking, unzipping > > archives, file system lookup tables, etc. > > > > It could also trace its actions and then keep an on-disk > > knowledge base for what it did in the past to find certain > > modules under certain conditions. > > > > Anyway, all this is extra magic for some future version of > > Python. > > I would say the manager API design and a basic set of specific > handlers should go into 1.6. BTW, is there a timeline for the 1.6 release ? I mean which things will have to be in 1.6 ? Some recent topics as hints: 1. Unicode 2. Import Manager API + default handlers 3. Python style coercion at C type level 4. Rich comparisons 5. __doc__ string extraction tool -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 28 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal at lemburg.com Fri Dec 3 15:24:04 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 03 Dec 1999 15:24:04 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> Message-ID: <3847D284.8CBF2A9C@lemburg.com> Fredrik Lundh wrote: > > MAL wrote: > > > IMHO, we should retreat to a more centralized interface, > > > one which more resembles a manager rather than the agent > > > interface implemented in imputil.py. Add-ons can then > > > register themselves to say "hey, I can handle pyz-archives" > > > or "I know how to import .so modules" or "I provide a > > > search function which you can call to have me scan > > > my module container (directory, web-site, archive)". > > but why? 
in my small-minded view of how python > works, an importer carries out a very simple task: > > given a name, check if you have a > module with that name, and install > it. if you cannot, fail (in which case > python asks the next importer along > the path). > > why do you have to complicate things beyond that? > why not just let Python provide a few base classes > and mixins for people who want to create custom > importers, and be done with it? Because importing in Python has become *much* more complicated over time. There are requests for new features which touch subjects such as storage mechanisms, lookups, signatures (for trusted code), lazy imports, etc. A chain of simple minded importers won't work together too well, duplicate work and downgrade performance considerably due to the many recursive function calls. Also, centralized caching strategies are hard to implement across import handlers. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 28 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From jeremy at cnri.reston.va.us Fri Dec 3 17:47:54 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Fri, 3 Dec 1999 11:47:54 -0500 (EST) Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: <14406.58137.359127.921135@weyr.cnri.reston.va.us> References: <199912022043.PAA15108@eric.cnri.reston.va.us> <14406.58137.359127.921135@weyr.cnri.reston.va.us> Message-ID: <14407.62522.360386.757519@goon.cnri.reston.va.us> >>>>> "FLD" == Fred L Drake, writes: >> (Unrelated remark: I should really try to release the set of >> modules we've written here at CNRI to deal with zip files. >> Unfortunately zip files are hairy and so is our code.) FLD> It doesn't help that that code just plain stinks. I maintain FLD> that no one here understands the whole of it. I'm all for improving the code and getting it out. 
The real problem is that interfaces have been glommed on for every new use of a Zip file. (You want to read one off a socket and extract files before you've got the whole thing? No problem! Add a new class.) We need to figure out the common patterns for using the archives and write a new set of interfaces to support that. Jeremy From guido at CNRI.Reston.VA.US Fri Dec 3 18:12:07 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 03 Dec 1999 12:12:07 -0500 Subject: [Python-Dev] What to do with our Zip code? In-Reply-To: Your message of "Fri, 03 Dec 1999 11:47:54 EST." <14407.62522.360386.757519@goon.cnri.reston.va.us> References: <199912022043.PAA15108@eric.cnri.reston.va.us> <14406.58137.359127.921135@weyr.cnri.reston.va.us> <14407.62522.360386.757519@goon.cnri.reston.va.us> Message-ID: <199912031712.MAA17061@eric.cnri.reston.va.us> [Jeremy, on our Zip code] > I'm all for improving the code and getting it out. The real problem > is that interfaces have been glommed on for every new use of a Zip > file. (You want to read one off a socket and extract files before > you've got the whole thing? No problem! Add a new class.) We need to > figure out the common patterns for using the archives and write a new > set of interfaces to support that. If we gave you the code we currently have, would someone else in this forum be willing to redesign it? Eventually it would become part of the Python distribution. --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Sat Dec 4 10:54:30 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat, 4 Dec 1999 10:54:30 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> Message-ID: <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> M.-A. 
Lemburg wrote: > > given a name, check if you have a > > module with that name, and install > > it. if you cannot, fail (in which case > > python asks the next importer along > > the path). > > > > why do you have to complicate things beyond that? > > why not just let Python provide a few base classes > > and mixins for people who want to create custom > > importers, and be done with it? > > Because importing in Python has become *much* more > complicated over time. There are requests for new > features which touch subjects such as storage mechanisms, > lookups, signatures (for trusted code), lazy imports, etc. sorry, I still don't understand it. our applications already use different storage mechanisms, databases, signatures, lazy importing, version handling, etc, etc. now, if *we* have managed to build all that on top of an old version of imputil.py, how come it's not sufficient for the rest of you? > A chain of simple minded importers won't work together > too well why? it sure works for us... > duplicate work avoiding duplicate work is what object oriented design is all about. and last time I checked, Python had excellent support for that. > and downgrade performance considerably due to the > many recursive function calls now that's what I call premature optimization. and this scares the hell out of me: if the rest of the python-dev crowd don't seriously believe that Python is (or can be made) fast enough to implement things like this, why the heck are you using Python at all? am I the only one here who doesn't believe in osterhout's talk about "the great system vs. scripting language divide"? 
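A minimal sketch of the importer protocol quoted above: given a name, return a module if you have one, otherwise fail so that the next importer is asked. It is written for current Python 3 rather than the 1.5.2 of this thread, and DictImporter, get_module and import_through_chain are hypothetical names chosen for illustration, not the imputil.py API.

```python
import sys
import types


class DictImporter:
    """Hypothetical importer serving modules from an in-memory dict of source text."""

    def __init__(self, sources):
        self.sources = sources  # maps module name -> source code string

    def get_module(self, name):
        # Return a new module if this importer knows the name, otherwise None.
        source = self.sources.get(name)
        if source is None:
            return None
        module = types.ModuleType(name)
        exec(compile(source, "<%s>" % name, "exec"), module.__dict__)
        return module


def import_through_chain(name, importers):
    """Ask each importer in order; the first one that answers wins."""
    if name in sys.modules:
        return sys.modules[name]
    for importer in importers:
        module = importer.get_module(name)
        if module is not None:
            sys.modules[name] = module  # "install" the module
            return module
    raise ImportError("no importer could provide %r" % name)


# Two importers in a chain; the first one shadows the second for "demo_greeting".
chain = [
    DictImporter({"demo_greeting": "text = 'hello'"}),
    DictImporter({"demo_greeting": "text = 'shadowed'", "demo_answer": "value = 42"}),
]
greeting = import_through_chain("demo_greeting", chain)
answer = import_through_chain("demo_answer", chain)
```

Each importer only answers "do I have this name?", so ordering and shadowing are decided entirely by the position in the chain.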
From fredrik at pythonware.com Sat Dec 4 10:54:42 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat, 4 Dec 1999 10:54:42 +0100 Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT] References: <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com> <3847D766.1E5FFAF3@interet.com> Message-ID: <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com> James C. Ahlstrom wrote: > IMHO putting shared libs in an archive is a bad idea because the OS > can not use them there. They must be extracted as you say. But then > storage is wasted by using space in the archive and the external file. > Deleting them after use wastes time. Better to leave them out of the > archive and provide for them in the installer. IMHO the > archive is a basic simple feature, and people make installers on top > of that. Archives shouldn't try to do it all. have you tried it? if not, why do you think you should be allowed to forbid others from doing it? in "the inmates are running the asylum", alan cooper points out that the *major* reason people all over the world love web applications are that there are no bloody installers. and here you are advocating that we all should be forced to use installers, when python makes it trivial to write self-installing apps. double-argh! (on the other hand, why do I complain? all pythonworks customers is going to be able to do all this anyway...). frankly, this "design by committee" (or is it "design by people who've never even been close to implementing something because they thought it was too hard, and thus think they're qualified to argue against those of us who didn't even realize that it was a hard problem"?) trend I've been seeing in all kinds of python forums makes me sooooo sad. the more of this I see (dist- utils-sig, doc-sig, here, c.l.python), the sadder I get, and the more I sympathise with John Skaller who's defining his own python-like universe... 
if someone needs me, I'll be down in the pub having a beer with the mad scientist, the shiny eff-bot, and mr. nitpicker. if we're not there, you'll find us in the lab, working on new string matching facilities for 1.6, SOAP [1], tkinter replacements for the masses, and whatever else we can come up with... see you! 1) http://www.newsalert.com/bin/story?StoryId=Coenz0bWbu0znmdKXqq From gstein at lyra.org Sat Dec 4 11:42:27 1999 From: gstein at lyra.org (Greg Stein) Date: Sat, 4 Dec 1999 02:42:27 -0800 (PST) Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order) In-Reply-To: <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> Message-ID: On Sat, 4 Dec 1999, Fredrik Lundh wrote: > M.-A. Lemburg wrote: >... > > Because importing in Python has become *much* more > > complicated over time. There are requests for new > > features which touch subjects such as storage mechanisms, > > lookups, signatures (for trusted code), lazy imports, etc. > > sorry, I still don't understand it. our applications already > use different storage mechanisms, databases, signatures, > lazy importing, version handling, etc, etc. now, if *we* > have managed to build all that on top of an old version > of imputil.py, how come it's not sufficient for the rest > of you? I agree. The imputil mechanism has been proven in combat to work for many scenarios. I have not (yet) heard of a case where the model has proven insufficient. > > A chain of simple minded importers won't work together > > too well > > why? it sure works for us... Exactly. "Why?" Please provide an example. >... > > and downgrade performance considerably due to the > > many recursive function calls > > now that's what I call premature optimization. and this > scares the hell out of me: if the rest of the python-dev > crowd don't seriously believe that Python is (or can be > made) fast enough to implement things like this, why > the heck are you using Python at all? 
am I the only > one here who doesn't believe in osterhout's talk about > "the great system vs. scripting language divide"? Don't worry Fredrik... I'm with you on this one. I do not believe there is a problem with the speed. Nobody has yet profiled imputil to find out where/how the time is being spent. Nobody has tried to speed it up. Therefore, any claims about its performance are simply FUD. I claim that its interface is correct, and you (Fredrik) stated it well: "given a name, please give me a module if you can (otherwise None)." Underneath that semantic, there are a lot of things that can be done to alter the performance and organization. Claims about speed are entirely premature. Yes, I'm biased. But, in truth, I haven't seen a better mechanism yet. I've tossed out a few ideas on how imputil could be improved (which are solely based on guess, rather than empirical evidence of profiling output). When those changes are completed and there is still an issue, then I'll admit defeat and wait for somebody else to provide a new design. Cheers, -g -- Greg Stein, http://www.lyra.org/ From marangoz at python.inrialpes.fr Sat Dec 4 12:15:53 1999 From: marangoz at python.inrialpes.fr (Vladimir Marangozov) Date: Sat, 4 Dec 1999 12:15:53 +0100 (CET) Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT] In-Reply-To: <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com> from "Fredrik Lundh" at Dec 04, 1999 10:54:42 AM Message-ID: <199912041115.MAA00539@python.inrialpes.fr> Fredrik Lundh wrote: > [snip] > > > > frankly, this "design by committee"... [snip] > ... see you! > > > C'mon /F, it's a battle of ideas and that's the way it works before filtering the good ones from the bad ones, then focusing on the appropriate implementation. I'm in sync with the discussion, although I haven't posted my partial notes on it due to lack of time. But let me say that overall, this discussion is a good thing and the more opinions we get, the better. 
BTW, you just _can't_ leave like this and start playing solitaire at the bar, first, because we need beer too and it's unlikely that you'll find a bar we don't know already, and second, because it was you who revived this discussion with 1 word, repeated 3 times:

> Subject: Re: [Python-Dev] Python 1.6 status
> Date: Wed, 17 Nov 1999 12:46:01 +0100
>
> Guido van Rossum wrote:
> > - suggestions for new issues that maybe ought to be settled in 1.6
>
> three things: imputil, imputil, imputil
>

Thus, with no visible argumentation (so don't shoot at others when they argue instead of you), and with this one word, you pushed Guido to the extreme of suggesting a complete redesign of the import machinery from scratch, based on a "Grand Architecture" :-). Right? -- Right! This is a fact and a fair amount of the credit goes entirely to you!

Since then, however, I haven't really seen your arguments, and I believe that nobody here got exactly your point. I, for one, may well argue against imputil as being just another brick on top of the grand mess. But because I haven't made the time to write up my notes properly, I don't dare to express a partial opinion, nor to blame those who argue well or badly in the meantime, when I'm silent.

So, why are you showing us your back when you clearly have something to say, but like me, you haven't made the time to say it? Please don't waste my time with emotional rants ;-). Everybody here tries to contribute according to their knowledge, experience and availability.

Later,
--
Vladimir MARANGOZOV | Vladimir.Marangozov at inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252

From mal at lemburg.com Sat Dec 4 11:45:52 1999
From: mal at lemburg.com (M.-A.
Lemburg) Date: Sat, 04 Dec 1999 11:45:52 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> Message-ID: <3848F0E0.B8132AD2@lemburg.com> Fredrik Lundh wrote: > > M.-A. Lemburg wrote: > > > given a name, check if you have a > > > module with that name, and install > > > it. if you cannot, fail (in which case > > > python asks the next importer along > > > the path). > > > > > > why do you have to complicate things beyond that? > > > why not just let Python provide a few base classes > > > and mixins for people who want to create custom > > > importers, and be done with it? > > > > Because importing in Python has become *much* more > > complicated over time. There are requests for new > > features which touch subjects such as storage mechanisms, > > lookups, signatures (for trusted code), lazy imports, etc. > > sorry, I still don't understand it. our applications already > use different storage mechanisms, databases, signatures, > lazy importing, version handling, etc, etc. now, if *we* > have managed to build all that on top of an old version > of imputil.py, how come it's not sufficient for the rest > of you? I've tried to get (an older) imputil.py version up and running too. It did work, but only after some considerable tweaking and even with integrated cache mechanisms did not reach the performance of the builtin importer (which doesn't use the kinds of caching strategies I had built into imputil.py). Getting the whole setup to work wasn't easy at all, because of the way imputil importers delegate work and things get even more confusing when it starts to "take over" certain parts of packages by installing temselves as importers for a particular package. 
> > A chain of simple minded importers won't work together > > too well > > why? it sure works for us... An example: A path importer knows how to scan directories and how to use a path to tell the correct order. It can maybe also import .py/.pyc/.pyo files. Now what happens if it finds a shared lib as module... the usual imputil way would be to delegate the request to some other importer which can handle shared libs... but wait: how does the shared lib importer know where to look ? It will have to rescan the directories, etc... > > duplicate work > > avoiding duplicate work is what object oriented design > is all about. and last time I checked, Python had excellent > support for that. See my example above. The agent approach used by imputil does not support OO design too well: even though you can avoid duplicate programming work on the importers by using a few base classes which implement dir scans, shared lib imports, etc. the imputil design does not provide means to avoid duplicate actions taken by the importers. > > and downgrade performance considerably due to the > > many recursive function calls > > now that's what I call premature optimization. and this > scares the hell out of me: if the rest of the python-dev > crowd don't seriously believe that Python is (or can be > made) fast enough to implement things like this, why > the heck are you using Python at all? am I the only > one here who doesn't believe in osterhout's talk about > "the great system vs. scripting language divide"? Looks like you are in ranting mode here ;-) Seriously, I've checked my imputil.py version (with caches enabled) against the builtin importer and noticed a performance downgrade by factor >2. This was enough to convince me of looking for other techniques to handle the problems I had at the time... you know, relative imports and things. 
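The rescanning concern in the example above, and the manager idea proposed as a remedy, can be sketched roughly as follows. This is only an illustration in current Python 3: ImportManager, register and find are hypothetical names, and the handlers merely report what they would load rather than importing anything.

```python
import os
import tempfile


class ImportManager:
    """Hypothetical central manager: scans each directory once and shares the result."""

    def __init__(self, path):
        self.path = path
        self.handlers = {}   # file extension -> handler callable, in registration order
        self.listings = {}   # directory -> list of file names, filled lazily
        self.scan_count = 0  # how many directory scans were actually performed

    def register(self, extension, handler):
        self.handlers[extension] = handler

    def _listing(self, directory):
        # Every handler reuses this cached listing instead of rescanning.
        if directory not in self.listings:
            self.listings[directory] = os.listdir(directory)
            self.scan_count += 1
        return self.listings[directory]

    def find(self, name):
        # Walk the path in order; within a directory, try extensions
        # in the order in which their handlers were registered.
        for directory in self.path:
            names = self._listing(directory)
            for extension, handler in self.handlers.items():
                filename = name + extension
                if filename in names:
                    return handler(os.path.join(directory, filename))
        return None


with tempfile.TemporaryDirectory() as directory:
    for filename in ("spam.py", "eggs.so"):
        open(os.path.join(directory, filename), "w").close()
    manager = ImportManager([directory])
    manager.register(".so", lambda p: ("shared library", os.path.basename(p)))
    manager.register(".py", lambda p: ("source", os.path.basename(p)))
    found_spam = manager.find("spam")
    found_eggs = manager.find("eggs")
    scans = manager.scan_count
```

Both lookups are served from a single directory scan, which is the kind of shared work that independent importers cannot coordinate on their own.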
--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 27 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From mal at lemburg.com Sat Dec 4 12:04:15 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 04 Dec 1999 12:04:15 +0100
Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT]
References: <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com> <3847D766.1E5FFAF3@interet.com> <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com>
Message-ID: <3848F52F.5F5B748F@lemburg.com>

Fredrik Lundh wrote:
>
> frankly, this "design by committee" (or is it "design by
> people who've never even been close to implementing
> something because they thought it was too hard, and
> thus think they're qualified to argue against those of
> us who didn't even realize that it was a hard problem"?)

Huh ? Two points:

1. How can you be sure that people haven't tried implementing their ideas and for various reasons have come to some conclusion about those ideas ?

2. Would you seriously disqualify people from joining a discussion by the simple argument that they have not implemented anything yet ?

Just take the Unicode discussion as an example: it was very lively and resulted in a decent proposal which is now subject to further investigation by the implementors ;-) Many people have joined in even though they did not and/or will not implement anything. Still, their arguments were very useful to show up weaknesses in the proposal.

Now, let's rather have a beer in the pub around the corner than go on ranting :-).

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 27 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From mal at lemburg.com Sat Dec 4 12:53:33 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 04 Dec 1999 12:53:33 +0100
Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order)
References:
Message-ID: <384900BD.D16E72BC@lemburg.com>

Greg Stein wrote:
> > > [me:]
> > > A chain of simple minded importers won't work together
> > > too well
> >
> > why? it sure works for us...
>
> Exactly. "Why?" Please provide an example.

See my reply to Fredrik.

> >...
> > > and downgrade performance considerably due to the
> > > many recursive function calls
> >
> > now that's what I call premature optimization. and this
> > scares the hell out of me: if the rest of the python-dev
> > crowd don't seriously believe that Python is (or can be
> > made) fast enough to implement things like this, why
> > the heck are you using Python at all? am I the only
> > one here who doesn't believe in osterhout's talk about
> > "the great system vs. scripting language divide"?
>
> Don't worry Fredrik... I'm with you on this one. I do not believe there is
> a problem with the speed. Nobody has yet profiled imputil to find out
> where/how the time is being spent. Nobody has tried to speed it up.

Sorry, Greg, but that is simply not true. I've spent a few days trying to get more performance out of it and have succeeded, but in the end it wasn't enough to convince me of the approach.

> Therefore, any claims about its performance are simply FUD.

BTW, did anybody mention that an import manager wouldn't be able to provide an API which is usable for imputil style importers ? I'm not arguing against the possibility to use imputil style importers, just against making it the sole method of adding wisdom to Python imports.

The imputil importers could well benefit from a manager providing logic to do basic things like importing shared libs, checking signatures, downloading modules from the web, etc.
-- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 27 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From gstein at lyra.org Sat Dec 4 13:15:13 1999 From: gstein at lyra.org (Greg Stein) Date: Sat, 4 Dec 1999 04:15:13 -0800 (PST) Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order) In-Reply-To: <384900BD.D16E72BC@lemburg.com> Message-ID: On Sat, 4 Dec 1999, M.-A. Lemburg wrote: >... > > Don't worry Fredrik... I'm with you on this one. I do not believe there is > > a problem with the speed. Nobody has yet profiled imputil to find out > > where/how the time is being spent. Nobody has tried to speed it up. > > Sorry, Greg, but that is simply not true. I've spend a few > days on trying to get more performance out of it and have > succeeded, but in the end it wasn't enough to convince me > of the approach. You sent me your changes... I don't believe that you were aggressive enough. As I've mentioned before, I think it is quite possible to retain the general Importer style and get_code() interface, but to shift some functionality out (to be computed once) to a higher-level mechanism. The patches that you sent me did not do this, so I'm not surprised that you hit a wall. Ack. See? Now I'm getting into discussions about performance and implementation without truly knowing where the timing is spent. Eyeballing it, I have an idea, but it would be best too see a profile output. My mantra is always "90% of the time you're wrong about where 90% of the time is being spent." I am unconcerned about performance, but will work on it so that I don't need to continue this conversation. That burden is on me. > > Therefore, any claims about its performance are simply FUD. > > BTW, did anybody mention that an import manager wouldn't > be able to provide an API which is useable for imputil > style importers ? 
I'm not arguing against the possibility
> to use imputil style importers, just against making it the
> sole method of adding wisdom to Python imports.

Since the core will delegate out to Python (note: current working theory), then it certainly is not the "sole method" (since you can just replace the Python code). But there must be a default mechanism. The ihooks stuff was too complicated. imputil seems to be much easier. I'd love to see a third mechanism.... so I can steal ideas :-)

> The imputil importers could well benefit from a manager
> providing logic to do basic things like importing
> shared libs, checking signatures, downloading modules
> from the web, etc.

For shared libs, yes. For the others: geez... I don't want to see that in the core infrastructure. Shift that out to specialized Importers. The infrastructure ought to be teeny and agnostic about how to map a module name to a module.

Side note to python-dev people: I apologize... I realize that I'm beginning to get a bit defensive here. I'm going to be at XML '99 until Friday, so that should give me a breather. When I get back, I'll skip the talk and do some code.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From gstein at lyra.org Sat Dec 4 13:32:04 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 04:32:04 -0800 (PST)
Subject: [Python-Dev] Re: .DLL vs .PYD search order
In-Reply-To: <3848F0E0.B8132AD2@lemburg.com>
Message-ID:

On Sat, 4 Dec 1999, M.-A. Lemburg wrote:
> Fredrik Lundh wrote:
>...
> > sorry, I still don't understand it. our applications already
> > use different storage mechanisms, databases, signatures,
> > lazy importing, version handling, etc, etc. now, if *we*
> > have managed to build all that on top of an old version
> > of imputil.py, how come it's not sufficient for the rest
> > of you?
>
> I've tried to get (an older) imputil.py version up and running
> too.
It did work, but only after some considerable tweaking > and even with integrated cache mechanisms did not reach > the performance of the builtin importer (which doesn't > use the kinds of caching strategies I had built into > imputil.py). 1) yes, it was an older version and did not have the PathImporter class. As a by product, the DirectoryImporters that it *did* have were much slower. It still did not support builtins, frozen modules, or dynamic loads. All of that is present now, so it works "out of the box" much better. 2) Performance: as I wrote in the other email, I don't believe that is an argument against the design. The imputil approach *will* be slower than the current Python mechanism, but there is some more coding to do to truly see how much. The side benefits (e.g. ZipImporter and caching) may outweigh the result. Time will tell. > Getting the whole setup to work wasn't easy > at all, because of the way imputil importers delegate work > and things get even more confusing when it starts to "take > over" certain parts of packages by installing temselves > as importers for a particular package. I don't understand this. If it is relevant, then please expand. Thx. > > > A chain of simple minded importers won't work together > > > too well > > > > why? it sure works for us... > > An example: > > A path importer knows how to scan directories and how to use > a path to tell the correct order. It can maybe also import > .py/.pyc/.pyo files. Now what happens if it finds a shared > lib as module... the usual imputil way would be to delegate > the request to some other importer which can handle shared > libs... but wait: how does the shared lib importer know > where to look ? It will have to rescan the directories, > etc... No, the "usual imputil way" is that the PathImporter understands searching a path and loading stuff from that path. An Importer is a combination of locating and loading (since they are, typically, tightly bound). 
The next rev will allow user-plugging of support for new file types. > > > duplicate work > > > > avoiding duplicate work is what object oriented design > > is all about. and last time I checked, Python had excellent > > support for that. > > See my example above. > > The agent approach used by imputil does not support > OO design too well: even though you can avoid duplicate > programming work on the importers by using a few > base classes which implement dir scans, shared lib > imports, etc. the imputil design does not provide > means to avoid duplicate actions taken by the importers. There is always a balance to be struck between independence and coupling. I chose to reduce coupling and increase independence. If you shift a bunch of stuff out of the Importers, then you will increase the coupling between the imputil framework and the Importers. That coupling will then close off future possibilities. Within the framework itself (e.g. between _import_hook and get_code), there is a lot of opportunity for change. Since that is behind the covers, it is no big deal to shift functionality around. I plan to do so. >... > Looks like you are in ranting mode here ;-) Seriously, > I've checked my imputil.py version (with caches enabled) > against the builtin importer and noticed a performance > downgrade by factor >2. This was enough to convince me > of looking for other techniques to handle the problems > I had at the time... you know, relative imports and things. I have run a long series of tests. Without doing any performance work on imputil, the ratio is 9 to 13. The 13 may have bumped up to about 15 or 16 when I added some dynamic loading code (I forget). Regardless, it is definitely less than a 2X increase. And that is with zero optimization. *shrug* I'm done. I'll do some code in a couple weeks. 
Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From gstein at lyra.org Sat Dec 4 14:12:32 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 4 Dec 1999 05:12:32 -0800 (PST)
Subject: [Python-Dev] Import redesign [LONG]
In-Reply-To: <199912031439.JAA16524@eric.cnri.reston.va.us>
Message-ID:

On Fri, 3 Dec 1999, Guido van Rossum wrote:
>...
> Great response. I think we know where we each stand. Please go ahead
> with a new design. (That's trust, not carte blanche.)

Accepted gratefully. Thx.

> Just one thought: the more I think about it, the less I like
> sys.importers: functionality which is implemented through
> sys.importers must necessarily be placed either in front of all of
> sys.path or after it. While this is helpful for "canned" apps that
> want *everything* to be imported from a fixed archive, I think that
> for regular Python installations sys.path should remain the point of
> attack. In particular, installing a new package (e.g. PIL) should
> affect sys.path, regardless of the way of delivery of the modules
> (shared libs, .py files, .pyc files, or a zip archive).

Okay. I'll design with respect to this model. To be explicit/clear and to be sure I'm hearing you right: sys.path may contain Importer instances. Given the name FOO, the system will step through sys.path looking for the first occurrence of FOO (looking in a directory or delegating). FOO may be found with any number of (configurable) file extensions, which are ordered (e.g. ".so" before ".py" before ".isl").

> I'm not too worried about code that inspects sys.path and expects
> certain invariants; that code is most likely interfering with the
> import mechanism so should be revisited anyway.

The Benevolent Dictator has spoken. So be it. :-)

> On the lone .pyc issue: I'd like to see this disappear when using the
> filesystem, I see no use for it there if we support .pyc files in zip
> archives.

No problem. This actually creates a simplification in the system, as I'm seeing it now.
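The lookup model restated above (a path whose entries are either directories or Importer instances, searched in order, with ordered file extensions inside a directory) might look roughly like this in current Python 3. This is an illustrative sketch only: ArchiveImporter, find and locate are hypothetical names, and nothing is actually loaded.

```python
import os
import tempfile

# Ordered, configurable extensions; the order follows the example in the message.
EXTENSIONS = [".so", ".py", ".isl"]


class ArchiveImporter:
    """Hypothetical stand-in for an Importer instance placed on the path."""

    def __init__(self, label, names):
        self.label = label
        self.names = set(names)

    def find(self, name):
        return (self.label, name) if name in self.names else None


def locate(name, path, extensions=EXTENSIONS):
    """Return the first match for name, walking the path entries in order."""
    for entry in path:
        if isinstance(entry, str):
            # Directory entry: try each extension in priority order.
            for extension in extensions:
                candidate = os.path.join(entry, name + extension)
                if os.path.exists(candidate):
                    return ("file", candidate)
        else:
            # Importer instance: delegate the lookup to it.
            found = entry.find(name)
            if found is not None:
                return found
    return None


with tempfile.TemporaryDirectory() as directory:
    for filename in ("FOO.py", "FOO.isl"):
        open(os.path.join(directory, filename), "w").close()
    archive = ArchiveImporter("lib.zip", ["FOO", "BAR"])
    search_path = [directory, archive]
    found_foo = locate("FOO", search_path)  # the directory comes first, and .py beats .isl
    found_bar = locate("BAR", search_path)  # only the archive importer knows BAR
    found_baz = locate("BAZ", search_path)  # nobody knows BAZ
```

Because the archive importer is just another path entry, moving it before or after the directory changes which FOO wins, with no separate importer list needed.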
I'm also seeing opportunities for a code reorg which may work towards MAL's issues with performance. I hope to have something in two or three weeks. I also hope people can be patient :-), but I certainly wouldn't mind seeing some alternative code! Cheers, -g -- Greg Stein, http://www.lyra.org/ From gmcm at hypernet.com Sat Dec 4 15:59:44 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Sat, 4 Dec 1999 09:59:44 -0500 Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order) In-Reply-To: <384900BD.D16E72BC@lemburg.com> Message-ID: <1267803104-11215142@hypernet.com> M.-A. Lemburg wrote: > Greg Stein wrote: > > Don't worry Fredrik... I'm with you on this one. I do not > > believe there is a problem with the speed. Nobody has yet > > profiled imputil to find out where/how the time is being spent. > > Nobody has tried to speed it up. > > Sorry, Greg, but that is simply not true. I've spend a few > days on trying to get more performance out of it and have > succeeded, but in the end it wasn't enough to convince me > of the approach. Remember those comparisons of Perl and Python, to which you added cgipython? I've added to the list a version that uses an old version of imputil (probably the one you optimized) and a compressed std lib. Note that my Linux python (1.5.2) is built in the RedHat style - even struct and strop are .so's; so that accounts for the majority of the open calls. This is a full Python (runs code.py if you don't pass it a script name). For lack of a better name, I've called it "pykit". First, the size of log files (in lines), i.e. 
number of system calls:

Solaris Linux IRIX[1]
Perl 88 85 70
Python 425 316 257
cgipython 182
pykit 136

Next, the number of "open" calls:

Solaris Linux IRIX
Perl 16 10 9
Python 107 71 48
cgipython 33
pykit 9

And the number of unsuccessful "open" calls:

Solaris Linux IRIX
Perl 6 1 3
Python 77 49 32
cgipython 28
pykit 2

Number of "mmap" calls:

Solaris Linux IRIX
Perl 25 25 1
Python 36 24 1
cgipython 13
pykit 21

This test would show off more if it went beyond startup. An import
of a standard lib module in my stock Python involves 2 failed stats
and 6 failed opens, then 2 successful opens and 2 fstats before the
module is loaded. None of these occur in pykit. The downside (asking
my Importer for a .so or a module not in the importer) takes no
system calls, and involves a dozen or so lines of Python and a check
of a dictionary.

- Gordon

From tismer at appliedbiometrics.com Sat Dec 4 16:29:03 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sat, 04 Dec 1999 16:29:03 +0100
Subject: [Python-Dev] imputil speed (was: .DLL vs .PYD search order)
References:
Message-ID: <3849333F.1DF2A201@appliedbiometrics.com>

Greg Stein wrote:
...
> My mantra is always "90% of the time you're wrong about where 90%
> of the time is being spent."

What a great sentence!
We all know it, but many of us (especially me) forget
about it during 90% of our coding time.
Much better to spend this on design (as you did).

thanks - chris

--
Christian Tismer :^)
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home

From jim at interet.com Sat Dec 4 18:27:44 1999
From: jim at interet.com (James C.
Ahlstrom) Date: Sat, 04 Dec 1999 12:27:44 -0500 Subject: [Python-Dev] Re: Import redesign [Warning: INCLUDES RANT] References: <199912022043.PAA15108@eric.cnri.reston.va.us> <3846E79A.446EAFD5@equi4.com> <3847D766.1E5FFAF3@interet.com> <011701bf3e3d$97081640$f29b12c2@secret.pythonware.com> Message-ID: <38494F10.C644BA7@interet.com> Fredrik Lundh wrote: > > James C. Ahlstrom wrote: > > IMHO putting shared libs in an archive is a bad idea because the OS Dear Fredrik, I thought the point of Python-Dev was to propose designs and get feedback, right? Well, I got feedback :-). OK, I agree to alter my archive format so it provides the ability to store shared libs and not just *.pyd. I will add the string length and if needed a flag indicating the name is a shared lib. Now the details: > have you tried it? if not, why do you think you should > be allowed to forbid others from doing it? Yes I have tried it, and I am currently on my fourth version of an archive format which is based on formats by Greg Stein and Gordon McMillan. I hope it meets with the favor of the Grand Inquisition, and becomes the standard format. But maybe it won't. Oh well. > bloody installers. and here you are advocating that > we all should be forced to use installers, when python > makes it trivial to write self-installing apps. double-argh! I am not forcing anyone to do anything, only proposing that shared libs are best handled directly by imputil and not the class within imputil which handles archive files. It is just a geeky design issue, nothing more. JimA From jim at interet.com Sat Dec 4 19:31:48 1999 From: jim at interet.com (James C. 
Ahlstrom)
Date: Sat, 04 Dec 1999 13:31:48 -0500
Subject: [Python-Dev] Re: .DLL vs .PYD search order
References: <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> <3848F0E0.B8132AD2@lemburg.com>
Message-ID: <38495E14.9C2FB107@interet.com>

"M.-A. Lemburg" wrote:
> An example:
>
> A path importer knows how to scan directories and how to use
> a path to tell the correct order. It can maybe also import
> .py/.pyc/.pyo files. Now what happens if it finds a shared
> lib as module... the usual imputil way would be to delegate
> the request to some other importer which can handle shared
> libs... but wait: how does the shared lib importer know
> where to look ? It will have to rescan the directories,
> etc...

The above refers to an earlier but still very recent version
of imputil. On that basis it is perfectly accurate. Here is
another example from my own experience almost identical to
the above:

One possible archive file format holds its list of archived
*.pyc file names as keys in a dictionary. This is simple and
efficient, but fails to correctly address the problem of shared
libs (aka DLL's in Windows) with names identical to names of
*.pyc files in the archive. For example, suppose foo.pyc is in the
archive, and foo.dll is in a directory. Suppose sys.path is to be
used to decide whether to load foo.pyc or foo.dll. Then an
"archive importer" will fail to do this. Specifically you can't
see if foo.pyc is in the archive and then check sys.path, nor can
you do the reverse. You must call the "archive importer" repeatedly
for each element of sys.path and search the directory at the same time.

JimA

From jim at interet.com Sat Dec 4 20:51:47 1999
From: jim at interet.com (James C.
Ahlstrom)
Date: Sat, 04 Dec 1999 14:51:47 -0500
Subject: [Python-Dev] Import redesign [LONG]
References:
Message-ID: <384970D3.26A9ECDB@interet.com>

Greg Stein wrote:
>
> On Fri, 3 Dec 1999, Guido van Rossum wrote:
> > attack. In particular, installing a new package (e.g. PIL) should
> > affect sys.path, regardless of the way of delivery of the modules
> > (shared libs, .py files, .pyc files, or a zip archive).

> To be explicit/clear and to be sure I'm hearing you right: sys.path may
> contain Importer instances. Given the name FOO, the system will step
> through sys.path looking for the first occurrence of FOO (looking in a
> directory or delegating). FOO may be found with any number of
> (configurable) file extensions, which are ordered (e.g. ".so" before
> ".py" before ".isl").

This is basically a gripe about this design spec. So if the answer
turns out to be "we need this functionality so shut up" then just
say that and don't flame me.

This spec is painful. Suppose sys.path has 10 elements, and there
are six file extensions. Then the simple algorithm is slow:

    for path in sys.path:   # Yikes, may not be a string!
        for ext in file_extensions:
            name = "%s.%s" % (module_name, ext)
            full_path = os.path.join(path, name)
            if os.path.isfile(full_path):
                # Process file here

And sys.path can contain class instances
which only makes things slower. You could do a readdir() and cache
the results, but maybe that would be slower. A better
algorithm might be faster, but a lot more complicated.

In the context of archive files, it is also painful. It prevents
you from saving a single dictionary of module names. Instead you
must have len(sys.path) dictionaries. You could try to
save in the archive information about whether (say) a foo.dll was
present in the file system, but the list of extensions is extensible.

The above problem only exists to support equally-named modules; that
is, to support a run-time choice of whether to load foo.pyc, foo.dll,
foo.isl, etc.
I claim (without having written it) that the fastest algorithm to solve the unique-name case is much faster than the fastest algorithm to solve the choose-among-equal-names case. Do we really need to support the equal-name case [Jim runs for cover...]? If so, how about inventing a new way to support it. Maybe if equal names exist, these must be pre-loaded from a known location? JimA From gstein at lyra.org Sat Dec 4 22:59:00 1999 From: gstein at lyra.org (Greg Stein) Date: Sat, 4 Dec 1999 13:59:00 -0800 (PST) Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: <384970D3.26A9ECDB@interet.com> Message-ID: On Sat, 4 Dec 1999, James C. Ahlstrom wrote: > Greg Stein wrote: >... > > To be explicit/clear and to be sure I'm hearing you right: sys.path may > > contain Importer instances. Given the name FOO, the system will step > > through sys.path looking for the first occurence of FOO (looking in a > > directory or delegating). FOO may be found with any number of > > (configurable) file extensions, which are ordered (e.g. ".so" before > > ".py" before ".isl"). > > This is basically a gripe about this design spec. So if the answer > turns out to be "we need this functionality so shut up" then just > say that and don't flame me. > > This spec is painful. Suppose sys.path has 10 elements, and there > are six file extensions. Then the simple algorithm is slow: > for path in sys.path: # Yikes, may not be a string! > for ext in file_extensions: > name = "%s.%s" % (module_name, ext) > full_path = os.path.join(path, name) > if os.path.isfile(full_path): > # Process file here This is the algorithm that Python uses today, and my standard Importers follow. > And sys.path can contain class instances > which only makes things slower. IMO, we don't know this, or whether it is significant. > You could do a readdir() and cache > the results, but maybe that would be slower. A better > algorithm might be faster, but a lot more complicated. Who knows. 
BUT: the import process is now in Python -- it makes it *much* easier to run these experiments. We could not really do this when the import process is "hard-coded" in C code. > In the context of archive files, it is also painful. It prevents > you from saving a single dictionary of module names. Instead you > must have len(sys.path) dictionaries. You could try to > save in the archive information about whether (say) a foo.dll was > present in the file system, but the list of extensions is extensible. I am not following this. What/where is the "single dictionary of module names" ? Are you referring to a cache? Or is this about building an archive? An archive would look just like we have now: map a name to a module. It would not need multiple dictionaries. > The above problem only exists to support equally-named modules; that > is, to support a run-time choice of whether to load foo.pyc, foo.dll, > foo.isl, etc. I claim (without having written it) that the fastest > algorithm to solve the unique-name case is much faster than the fastest > algorithm to solve the choose-among-equal-names case. > > Do we really need to support the equal-name case [Jim runs for > cover...]? > If so, how about inventing a new way to support it. Maybe if equal > names exist, these must be pre-loaded from a known location? I don't understand what the problem is. I don't see one. We are still mapping a name to a module. sys.path defines a precedence. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein at lyra.org Sun Dec 5 02:17:57 1999 From: gstein at lyra.org (Greg Stein) Date: Sat, 4 Dec 1999 17:17:57 -0800 (PST) Subject: [Python-Dev] pyc archives (was: .DLL vs .PYD search order) In-Reply-To: <38495E14.9C2FB107@interet.com> Message-ID: On Sat, 4 Dec 1999, James C. Ahlstrom wrote: >... > One possible archive file format holds its list of archived > *.pyc file names as keys in a dictionary. 
This is simple and > efficient, but fails to correctly address the problem of shared > libs (aka DLL's in Windows) with names identical to names of > *.pyc files in the archive. For example, suppose foo.pyc is in the > archive, and foo.dll is in a directory. Suppose sys.path is to be > used to decide whether to load foo.pyc or foo.dll. Then an > "archive importer" will fail to do this. Specifically you can't > see if foo.pyc is in the archive and then check sys.path, nor can > you do the reverse. You must call the "archive importer" repeatedly > for each element of sys.path and search the directory at the same time. What? The archive is independent of each .pyc's original position in sys.path. There is no reason/need to carry that information into an archive. If the archive contains "foo", then you're done. If it doesn't, then move on to the next element of sys.path (directory or Importer instance) and look there. Basically: if you deploy an archive, then all of its files will take precedence over any file found later on sys.path. This is exactly what sys.path is about: establishing precedence. If I understand you correctly, then you're trying to say there is some sort of interleaving that must occur. If so, then I don't understand why. 
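A minimal sketch of the precedence rule described above, written in present-day Python rather than the 1.5.2 of this thread. The names DictArchive, find and locate are hypothetical and are not imputil's API; the extension list is only an example. Each entry on the search path is either a directory name or an importer object, and the first entry that can supply the name wins.

```python
import os


class DictArchive:
    """Hypothetical archive importer: one dictionary mapping names to code."""

    def __init__(self, modules):
        self.modules = dict(modules)

    def find(self, name):
        # Return the stored object, or None if this archive lacks the name.
        return self.modules.get(name)


def locate(name, search_path, extensions=(".so", ".py")):
    """Return (entry, hit) for the first search_path entry providing name.

    String entries are treated as directories and probed once per
    extension, in order. Any other entry is asked through its find()
    method. Nothing is interleaved: an entry either supplies the name
    or the search moves on to the next entry.
    """
    for entry in search_path:
        if isinstance(entry, str):
            for ext in extensions:
                candidate = os.path.join(entry, name + ext)
                if os.path.isfile(candidate):
                    return entry, candidate
        else:
            hit = entry.find(name)
            if hit is not None:
                return entry, hit
    return None, None
```

With this shape, an archive placed early on the path shadows everything after it, and moving the archive later on the path lowers its precedence without touching its dictionary.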
Cheers, -g -- Greg Stein, http://www.lyra.org/ From fredrik at pythonware.com Mon Dec 6 13:20:34 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 6 Dec 1999 13:20:34 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> <3848F0E0.B8132AD2@lemburg.com> <02eb01bf3f4d$c200d790$f29b12c2@secret.pythonware.com> <384B7E32.F7B81D82@lemburg.com> Message-ID: <004401bf3fe4$4cab6ea0$f29b12c2@secret.pythonware.com> > > you obviously attempted to use imputil to implement > > non-standard import behaviour on top of the standard > > storage system -- while we've used it to implement > > standard import behaviour on top of non-standard > > storage systems. > > No, I tried to make the imputil approach work as replacement > for the standard builtin importer. I'm confused. earlier, you said (or rather, I think you said) that you looked at imputil to see if it could "handle the problems you had at the time"... and now you say that you tried to use it as a drop-in replacement for the "standard path importer". I must be missing something here... > After I got that to work, I added some caching > to avoid duplicated stats. The resulting importer was > around twice as slow as the builtin one for the following > imports: > > # the default one Python does at startup, plus: > from mx import HTMLTools,DateTime,ODBC > > This is a pretty common setup for my scripts, so its > preformance is relevant to me. did you try stuffing all your PYC's into an archive file, and running them from there? 
From fredrik at pythonware.com Sun Dec 5 19:22:57 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sun, 5 Dec 1999 19:22:57 +0100 Subject: [Python-Dev] Re: .DLL vs .PYD search order References: <38479573.B2CFDD2B@lemburg.com> <199912031345.IAA16376@eric.cnri.reston.va.us> <015e01bf3d98$a5d3fcc0$f29b12c2@secret.pythonware.com> <3847D284.8CBF2A9C@lemburg.com> <011601bf3e3d$9087f6a0$f29b12c2@secret.pythonware.com> <3848F0E0.B8132AD2@lemburg.com> Message-ID: <02eb01bf3f4d$c200d790$f29b12c2@secret.pythonware.com> > I've checked my imputil.py version (with caches enabled) > against the builtin importer and noticed a performance > downgrade by factor >2. This was enough to convince me > of looking for other techniques to handle the problems > I had at the time... you know, relative imports and things. hmm. I think I see the problem here... you obviously attempted to use imputil to implement non-standard import behaviour on top of the standard storage system -- while we've used it to implement standard import behaviour on top of non-standard storage systems. I don't know if imputil is good enough for the former, and I don't think I care... I've spent too many nights debugging code that relied on clever, non-standard hacks. PS. on the performance side of things, did you know that 're' can be up to ten times slower than 'regex'? but people don't complain -- probably because it allows them to do things they couldn't do before... From jim at interet.com Mon Dec 6 20:40:01 1999 From: jim at interet.com (James C. Ahlstrom) Date: Mon, 06 Dec 1999 14:40:01 -0500 Subject: [Python-Dev] Re: pyc archives (was: .DLL vs .PYD search order) References: Message-ID: <384C1111.92984B5A@interet.com> Greg Stein wrote: > > On Sat, 4 Dec 1999, James C. Ahlstrom wrote: > >... > > One possible archive file format holds its list of archived > > *.pyc file names as keys in a dictionary. This is simple and > > efficient, but fails to correctly address the problem of shared > What? 
The archive is independent of each .pyc's original position in > sys.path. There is no reason/need to carry that information into an > archive. > > If the archive contains "foo", then you're done. If it doesn't, then move > on to the next element of sys.path (directory or Importer instance) and > look there. > > Basically: if you deploy an archive, then all of its files will take > precedence over any file found later on sys.path. This is exactly what > sys.path is about: establishing precedence. Sorry, I am a little slow today. My daughter got me up at 6 am to work on her computer video editor. No disk space, fragmentation, 2 gig limit on AVI files, ........ Are you saying this? If foo is imported, the archive importer is consulted first to see if it can provide foo. If not, sys.path is searched for foo.pyc, foo.pyl etc., and if foo.pyl is found, then its contents are added to the single archive importer dictionary. The order of addition to the archive dictionary is determined by sys.path, and duplicate names are not entered because they lie later on sys.path. But once a file is recognized as in an archive, it effectively precedes all of sys.path. Or this? If foo is imported, sys.path is searched for foo.pyc, foo.pyl, etc., and also all archive files found at each element of sys.path are searched for foo. If "bar" is imported, it may be found in foo.pyl. That is, there is an instance of an archive importer for each element of sys.path. What if the user names an archive file not on sys.path? What order does it have? JimA From jim at interet.com Mon Dec 6 19:34:41 1999 From: jim at interet.com (James C. Ahlstrom) Date: Mon, 06 Dec 1999 13:34:41 -0500 Subject: [Python-Dev] Import redesign [LONG] References: Message-ID: <384C01C1.8D1AFFFF@interet.com> Greg Stein wrote: > > On Sat, 4 Dec 1999, James C. Ahlstrom wrote: > > # Process file here > > This is the algorithm that Python uses today, and my standard Importers > follow. Agreed. 
> > And sys.path can contain class instances > > which only makes things slower. > > IMO, we don't know this, or whether it is significant. Agreed. > > You could do a readdir() and cache > > the results, but maybe that would be slower. A better > > algorithm might be faster, but a lot more complicated. > > Who knows. BUT: the import process is now in Python -- it makes it *much* > easier to run these experiments. We could not really do this when the > import process is "hard-coded" in C code. Agreed. > > In the context of archive files, it is also painful. It prevents > > you from saving a single dictionary of module names. Instead you > > must have len(sys.path) dictionaries. You could try to > > save in the archive information about whether (say) a foo.dll was > > present in the file system, but the list of extensions is extensible. > > I am not following this. What/where is the "single dictionary of module > names" ? Are you referring to a cache? Or is this about building an > archive? > > An archive would look just like we have now: map a name to a module. It > would not need multiple dictionaries. The "single dictionary of names" is in the single archive importer instance and has nothing to do with creating the archive. It is currently programmed this way. Suppose the user specifies by name 12 archive files to be searched. That is, the user hacks site.py to add archive names to the importer. The "single dictionary" means that the archive importer takes the 12 dictionaries in the 12 files and merges them together into one dictionary in order to speed up the search for a name. The good news is you can always just call the archive importer to get a module. The bad news is you can't do that for each entry on sys.path because there is no necessary identity between archive files and sys.path. 
The user specified the archive files by name, and they may or may not be on sys.path, and the user may or may not have specified them in the same order as sys.path even if they are. Suppose archive files must lie on sys.path and are processed in order. Then to find them you must know their name. But IMHO you want to avoid doing a readdir() on each element of sys.path and looking for files *.pyl. Suppose archive file names in general are the known name "lib.pyl" for the Python library, plus the names "package.pyl" where "package" can be the name of a Python package as a single archive file. Then if the user tries to import foo, imputil will search along sys.path looking for foo.pyc, foo.pyl, etc. If it finds foo.pyl, the archive importer will add it to its list of known archive files. But it must not add it to its single dictionary, because that would destroy the information about its position along sys.path. Instead, it must keep a separate dictionary for each element of sys.path and search the separate dictionaries under control of imputil. That is, get_code() needs a new argument for the element of sys.path being searched. Alternatively, you could create a new importer instance for each archive file found, but then you still have multiple dictionaries. They are in the multiple instances. All this is needed only to support import of identically named modules. If there are none, there is no problem because sys.path is being used only to find modules, not to disambiguate them. See also my separate reply to your other post which discusses this same issue. JimA From gstein at lyra.org Tue Dec 7 01:43:21 1999 From: gstein at lyra.org (Greg Stein) Date: Mon, 6 Dec 1999 16:43:21 -0800 (PST) Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: <384C01C1.8D1AFFFF@interet.com> Message-ID: On Mon, 6 Dec 1999, James C. Ahlstrom wrote: > Greg Stein wrote: >... > > I am not following this. What/where is the "single dictionary of module > > names" ? 
Are you referring to a cache? Or is this about building an > > archive? > > > > An archive would look just like we have now: map a name to a module. It > > would not need multiple dictionaries. > > The "single dictionary of names" is in the single archive importer > instance and has nothing to do with creating the archive. It > is currently programmed this way. Ah. There is the problem. In Guido's suggestion for the "next path of inquiry" :-), there is no "single dictionary of names". Instead, you have Importer instances as items in sys.path. Each instance maintains its dictionary, and they are not (necessarily) combined. If we were to combine them, then we would need to maintain the ordering requirements implied by sys.path. However, this would be problematic if sys.path changed -- we would have to detect the situation and rebuild a merged dict. > Suppose the user specifies by name 12 archive files to be searched. > That is, the user hacks site.py to add archive names to the importer. > The "single dictionary" means that the archive importer takes the 12 > dictionaries in the 12 files and merges them together into one > dictionary > in order to speed up the search for a name. The good news is you can > always just call the archive importer to get a module. The bad news is > you can't do that for each entry on sys.path because there is no > necessary identity between archive files and sys.path. The user > specified the archive files by name, and they may or may not be on > sys.path, and the user may or may not have specified them in the > same order as sys.path even if they are. The importer must be inserted into sys.path to establish a precedence. If the user wants to add 12 libraries... fine. But *all* of those modules will fall under a precedence defined by the Importer's position on sys.path. > Suppose archive files must lie on sys.path and are processed in order. > Then to find them you must know their name. 
But IMHO you want to > avoid doing a readdir() on each element of sys.path and looking for > files *.pyl. I do not believe that we will arbitrarily locate and open library files. They must be specified explicitly. > Suppose archive file names in general are the known name "lib.pyl" > for the Python library, plus the names "package.pyl" where "package" > can be the name of a Python package as a single archive file. Then > if the user tries to import foo, imputil will search along sys.path > looking for foo.pyc, foo.pyl, etc. If it finds foo.pyl, the archive > importer will add it to its list of known archive files. But it must > not add it to its single dictionary, because that would destroy the > information about its position along sys.path. Instead, it must keep > a separate dictionary for each element of sys.path and search the > separate dictionaries under control of imputil. That is, get_code() > needs a new argument for the element of sys.path being searched. > Alternatively, you could create a new importer instance for each > archive file found, but then you still have multiple dictionaries. > They are in the multiple instances. If the user installs ".pyl" as a recognized extension (i.e. installs into the PathImporter), then the above scenario is possible. In my in-head-design, I had not imagined any state being retained for extension-recognizer hooks. Of course, state can be retained simply by using a bound-method for the hook function. get_code() would not need to change. The foo.pyl would be consulted at the appropriate time based on where it is found in sys.path. Note that file- extension hooks would definitely have a complete path to the target file. Those are not Importers, however (although they will closely follow the get_code() hook since the extension is called from get_code). 
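The remark about retaining state through a bound method can be sketched as follows. This is an illustration only: ArchiveHook, extension_hooks and dispatch are hypothetical names, not part of imputil, and the hook merely records which files it was asked about instead of reading a real archive.

```python
import os


class ArchiveHook:
    """Hypothetical handler for ".pyl" files that remembers what it opened."""

    def __init__(self):
        self.opened = {}  # full path -> placeholder table of contents

    def load(self, full_path):
        # A real hook would parse the archive here. This sketch only
        # records the visit so that repeated calls reuse the same entry.
        if full_path not in self.opened:
            self.opened[full_path] = {"path": full_path}
        return self.opened[full_path]


hook = ArchiveHook()

# The registry stores plain callables. Registering the bound method
# hook.load carries the instance, and therefore its state, along with it.
extension_hooks = {".pyl": hook.load}


def dispatch(full_path):
    """Call the hook registered for the file's extension, if there is one."""
    ext = os.path.splitext(full_path)[1]
    handler = extension_hooks.get(ext)
    return handler(full_path) if handler is not None else None
```

The caller never sees the instance; it only calls whatever is registered for the extension, always with the complete path to the file.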
From tim_one at email.msn.com Tue Dec 7 06:11:25 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 7 Dec 1999 00:11:25 -0500 Subject: [Python-Dev] Re: .DLL vs .PYD search order In-Reply-To: <02eb01bf3f4d$c200d790$f29b12c2@secret.pythonware.com> Message-ID: <001601bf4071$8278cc20$88a0143f@tim> [/F] > PS. on the performance side of things, did you know > that 're' can be up to ten times slower than 'regex'? > but people don't complain -- probably because it > allows them to do things they couldn't do before... Bad example: people do complain about this. Those who care a lot continue to use regex, temporarily pacified by the promise that re.py will get recoded in C and thus regain a good chunk of regex's speed. Those who care a whale of a lot continue to use Perl <0.9 wink>. From guido at CNRI.Reston.VA.US Tue Dec 7 13:45:25 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 07 Dec 1999 07:45:25 -0500 Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: Your message of "Mon, 06 Dec 1999 16:43:21 PST." References: Message-ID: <199912071245.HAA21596@eric.cnri.reston.va.us> > If we were to combine them, then we would need to maintain the ordering > requirements implied by sys.path. However, this would be problematic if > sys.path changed -- we would have to detect the situation and rebuild a > merged dict. No need to worry about this: just don't merge the caches. Compared to the hundreds of failed open() calls that are done now, it's no big deal to do 12 failed Python dictionary lookups instead of one. 
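A small sketch of the unmerged approach, assuming each archive is represented by nothing more than its own dictionary (the table contents and the name lookup are made up for illustration). Precedence comes from the order of the list, so reordering it needs no rebuild.

```python
def lookup(name, tables):
    """Return (index, value) from the first table that knows name, else None."""
    for index, table in enumerate(tables):
        if name in table:
            return index, table[name]
    return None


# Twelve separate tables, one per archive, kept unmerged.
tables = [{"mod%d" % i: i} for i in range(12)]

# Two archives happen to provide the same name.
tables[3]["shared"] = "from 3"
tables[7]["shared"] = "from 7"
```

A miss costs one failed dictionary lookup per table and no system calls. Reversing or otherwise reordering the list changes which "shared" is found, with no merged structure to invalidate.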
--Guido van Rossum (home page: http://www.python.org/~guido/)

From fredrik at pythonware.com Tue Dec 7 14:25:54 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 7 Dec 1999 14:25:54 +0100
Subject: [Python-Dev] Import redesign [LONG]
References:
Message-ID: <020a01bf40b6$97dafd00$f29b12c2@secret.pythonware.com>

Greg Stein wrote:
> > The "single dictionary of names" is in the single archive importer
> > instance and has nothing to do with creating the archive. It
> > is currently programmed this way.
>
> Ah. There is the problem. In Guido's suggestion for the "next path of
> inquiry" :-), there is no "single dictionary of names". Instead, you have
> Importer instances as items in sys.path. Each instance maintains its
> dictionary, and they are not (necessarily) combined.

so the "sys.path contains importers (or strings)"
strategy is now officially sanctioned? cool!!!

(a quick look in our code base says that this will
cause some trouble, unless os.path.isdir() is modified
to reject non-strings... after all, if it's not a string, it
cannot be a valid directory path, so this does make
some sense ;-)

another aside: can we have a standard mechanism for
listing the contents of a given archive, please?

we have a lot of "path scanning" stuff (PIL and PST,
among others), and it would be great if things didn't
break down if you stuff it all in an archive.

something like:

    for path in sys.path:
        if os.path.isdir(path):
            files = os.listdir(path)
        else:
            try:
                files = path.listdir()
            except AttributeError:
                files = None
        if files is None:
            # no idea what's in here
        else:
            # path provides (at least) these modules

would be really useful.

and yes, it shouldn't have to be mentioned, since squeeze
have done it since early 1997, but archive importers should
provide a standard way to include non-module resources in
the archive, and a standard way to access such resources
as ordinary python streams. e.g:

    file = path.open(name, "rb")

or something...
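The listing and resource-access idea above can be tried out with a small runnable sketch in present-day Python. MemoryArchive and list_entry are hypothetical names; only the two method names, listdir() and open(name, "rb"), are taken from the proposal. String entries are checked first so that os.path.isdir() is never handed a non-string.

```python
import io
import os


class MemoryArchive:
    """Hypothetical archive object that could sit on a search path."""

    def __init__(self, members):
        self.members = dict(members)  # member name -> bytes

    def listdir(self):
        # Names of everything stored, modules and other resources alike.
        return sorted(self.members)

    def open(self, name, mode="rb"):
        # Hand back an ordinary binary stream over the stored bytes.
        return io.BytesIO(self.members[name])


def list_entry(entry):
    """Return the names under one search-path entry, or None if unknown."""
    if isinstance(entry, str):
        if os.path.isdir(entry):
            return sorted(os.listdir(entry))
        return None
    try:
        return entry.listdir()
    except AttributeError:
        # The object offers no listing, so nothing can be said about it.
        return None
```

Path-scanning code written against list_entry() treats directories and archives alike, and falls back to "no idea what's in here" for any object that lacks listdir().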
From jim at interet.com Tue Dec 7 16:20:15 1999 From: jim at interet.com (James C. Ahlstrom) Date: Tue, 07 Dec 1999 10:20:15 -0500 Subject: [Python-Dev] Import redesign [LONG] References: <199912071245.HAA21596@eric.cnri.reston.va.us> Message-ID: <384D25AF.4C4F5107@interet.com> Guido van Rossum wrote: > No need to worry about this: just don't merge the caches. Compared to > the hundreds of failed open() calls that are done now, it's no big > deal to do 12 failed Python dictionary lookups instead of one. Agreed. JimA From jim at interet.com Tue Dec 7 16:31:30 1999 From: jim at interet.com (James C. Ahlstrom) Date: Tue, 07 Dec 1999 10:31:30 -0500 Subject: [Python-Dev] Import redesign [LONG] References: Message-ID: <384D2852.3C36C216@interet.com> Greg Stein wrote: > Ah. There is the problem. In Guido's suggestion for the "next path of > inquiry" :-), there is no "single dictionary of names". Instead, you have > Importer instances as items in sys.path. Each instance maintains its > dictionary, and they are not (necessarily) combined. > [A large number of other design issues] OK, all design issues agreed. I will make needed changes. JimA From jim at interet.com Tue Dec 7 16:37:36 1999 From: jim at interet.com (James C. Ahlstrom) Date: Tue, 07 Dec 1999 10:37:36 -0500 Subject: [Python-Dev] Import redesign [LONG] References: <020a01bf40b6$97dafd00$f29b12c2@secret.pythonware.com> Message-ID: <384D29C0.3D3A2194@interet.com> Fredrik Lundh wrote: > another aside: can we have a standard mechanism for > listing the contents of a given archive, please? I will add this. > and yes, it shouldn't have to be mentioned, since squeeze > have done it since early 1997, but archive importers should > provide a standard way to include non-module resources in > the archive, and a standard way to access such resources > as ordinary python streams. I will add this. 
JimA From gstein at lyra.org Tue Dec 7 17:53:49 1999 From: gstein at lyra.org (Greg Stein) Date: Tue, 7 Dec 1999 08:53:49 -0800 (PST) Subject: [Python-Dev] Import redesign [LONG] In-Reply-To: <199912071245.HAA21596@eric.cnri.reston.va.us> Message-ID: On Tue, 7 Dec 1999, Guido van Rossum wrote: > > If we were to combine them, then we would need to maintain the ordering > > requirements implied by sys.path. However, this would be problematic if > > sys.path changed -- we would have to detect the situation and rebuild a > > merged dict. > > No need to worry about this: just don't merge the caches. Compared to > the hundreds of failed open() calls that are done now, it's no big > deal to do 12 failed Python dictionary lookups instead of one. Have no fear... I wasn't planning on this... complicates too much stuff for too little gain. Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido at CNRI.Reston.VA.US Wed Dec 8 13:07:31 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 08 Dec 1999 07:07:31 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Wed, 08 Dec 1999 02:46:02 EST." <000201bf4150$46749da0$5aa2143f@tim> References: <000201bf4150$46749da0$5aa2143f@tim> Message-ID: <199912081207.HAA00040@eric.cnri.reston.va.us> [Great analysis, Tim!] > 4) The audience is Python end-users "in general", and the product is pure > Python. I think this is the most important one for Distutils to address, > and compilation isn't a part of it. So far, though, what Gordon is doing > seems more appropriate than what Distutils has been up to. I hope his work > gets folded into this. I'm not sure what stuff by which Gordon you're referring to. I am only familiar with his installer, which I thought is win32 only (but I may be mistaken) and is an installer for a whole application, not just a bunch of modules. Please correct me if I'm wrong. 
But this reminds me of a different issue, which Jim Ahlstrom has been hammering about before: there's a completely separate set of cases where what you are distributing is a stand-alone application, and the target consists of end users who are entirely uninterested in whether it's written in Python, C or Elvish. (And then there's still the distinction between Win32, Unix or both.) The current distutils tools don't deal with this at all. I think it should though, and I think its framework is powerful enough to be able to add this, e.g. as a new "appdist" command. --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm at hypernet.com Wed Dec 8 15:16:07 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Wed, 8 Dec 1999 09:16:07 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <199912081207.HAA00040@eric.cnri.reston.va.us> References: Your message of "Wed, 08 Dec 1999 02:46:02 EST." <000201bf4150$46749da0$5aa2143f@tim> Message-ID: <1267460464-31845181@hypernet.com> Guido wrote: > [Great analysis, Tim!] > > > 4) The audience is Python end-users "in general", and the > > product is pure Python. I think this is the most important one > > for Distutils to address, and compilation isn't a part of it. > > So far, though, what Gordon is doing seems more appropriate > > than what Distutils has been up to. I hope his work gets > > folded into this. > > I'm not sure what stuff by which Gordon you're referring to. I > am only familiar with his installer, which I thought is win32 > only (but I may be mistaken) and is an installer for a whole > application, not just a bunch of modules. Please correct me if > I'm wrong. It needed a name. I hate the word "Installer", but it expresses in one word the most common use of my stuff. I'll be releasing a beta for Linux real soon. Only some of the tricks are Windows only (such as self-extracting executables, which is only culturally appropriate on Windows, anyway).
But more importantly it's not just for installing. The Python I use (interactively) on my wife's machine is 1 directory with about 6 files in it. On my Linux box I've been using the std lib in a .pyz for about a month now. Someone distributing a pure Python package could instead ship 3 files (imputil.py, archive.py and .pyz) with the "install" consisting of adding one line to site.py in the user's perfectly normal Python installation. And yeah, I solved the "manifest" problem, too. Mine predates Distutils, so don't accuse me of duplicate effort, (I pointed them to it a couple times). It uses ConfigParser and a config file, so it allows finer control. While .pyz's are completely cross-platform, I have yet to work out endianness issues in the other archive I use (which should probably be zip format - it can hold anything). And at the "Installer" end, I have yet to work out how things should work on non-ELF/COFF platforms (where I can't append the archive to the executable). But there aren't any technical issues involved; just lack of time. So no, it's not just for Windows; and no, it's not just for creating standalones (though that's what almost everyone uses it for). - Gordon From guido at CNRI.Reston.VA.US Wed Dec 8 15:56:42 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 08 Dec 1999 09:56:42 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Wed, 08 Dec 1999 09:16:07 EST." <1267460464-31845181@hypernet.com> References: Your message of "Wed, 08 Dec 1999 02:46:02 EST." <000201bf4150$46749da0$5aa2143f@tim> <1267460464-31845181@hypernet.com> Message-ID: <199912081456.JAA00200@eric.cnri.reston.va.us> > It needed a name. I hate the word "Installer", but it expresses > in one word the most common use of my stuff. > > I'll be releasing a beta for Linux real soon. 
Only some of the > tricks are Windows only (such as self-extracting executables, > which is only culturally appropriate on Windows, anyway). > > But more importantly it's not just for installing. The Python I > use (interactively) on my wife's machine is 1 directory with > about 6 files in it. On my Linux box I've been using the std lib > in a .pyz for about a month now. Someone distributing a pure > Python package could instead ship 3 files (imputil.py, > archive.py and .pyz) with the "install" consisting of > adding one line to site.py in the user's perfectly normal Python > installation. > > And yeah, I solved the "manifest" problem, too. Mine predates > Distutils, so don't accuse me of duplicate effort, (I pointed > them to it a couple times). It uses ConfigParser and a config > file, so it allows finer control. > > While .pyz's are completely cross-platform, I have yet to work > out endianness issues in the other archive I use (which should > probably be zip format - it can hold anything). And at the > "Installer" end, I have yet to work out how things should work > on non-ELF/COFF platforms (where I can't append the archive > to the executable). But there aren't any technical issues > involved; just lack of time. > > So no, it's not just for Windows; and no, it's not just for > creating standalones (though that's what almost everyone > uses it for). Gordon, I'm sorry, but from this description I still have no idea what your stuff is (and I forgot the URL so I can't look it up). For example, if it's not (just) for installing, what *is* it for? What is the ``"manifest" problem'' and how did you solve it? Also, note that editing site.py is a no-no! You can create/edit sitecustomize.py, but you should leave site.py alone! 
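The advice above (leave site.py alone, customize through sitecustomize.py) might look like the following minimal sketch. The location in `EXTRA_ENTRY` and the helper name `add_path_once` are made up for illustration; only the fact that site.py imports a module named sitecustomize at startup, if one exists, comes from the discussion.

```python
# sitecustomize.py -- picked up automatically by site.py at startup, if present.
import sys

# Hypothetical location of a locally installed package directory or archive.
EXTRA_ENTRY = "/opt/myapp/lib"


def add_path_once(path_list, entry):
    """Append entry to path_list unless already present; return True if added."""
    if entry in path_list:
        return False
    path_list.append(entry)
    return True


# The whole "install" step: one extra sys.path entry, site.py untouched.
add_path_once(sys.path, EXTRA_ENTRY)
```

Because the change lives in a separate file, upgrading Python (which replaces site.py) does not silently undo it.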
--Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm at hypernet.com Wed Dec 8 17:17:03 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Wed, 8 Dec 1999 11:17:03 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <199912081456.JAA00200@eric.cnri.reston.va.us> References: Your message of "Wed, 08 Dec 1999 09:16:07 EST." <1267460464-31845181@hypernet.com> Message-ID: <1267453215-32281635@hypernet.com> Guido, > Gordon, I'm sorry, but from this description I still have no idea > what your stuff is (and I forgot the URL so I can't look it up). http://starship.python.org/crew/gmcm/installer.html The Linux stuff has a couple alpha testers and will probably get announced in a week or two. > For example, if it's not (just) for installing, what *is* it for? At the bottom level, it's a bunch of tools using freeze's modulefinder, imputil.py and 2 kinds of archives. There's at least 2 layers above that, with "Installer" being the top. There's a clean separation between the layers, so you can break in wherever you like. > What is the ``"manifest" problem'' and how did you solve it? The problem is specifying a set of resources, hopefully without having to list them explicitly. I solve this with a config file that lets you specify packages, directories, directory trees.. with filters that can work from paths, names, extensions, regular expressions... > Also, note that editing site.py is a no-no! You can create/edit > sitecustomize.py, but you should leave site.py alone! That would work fine. One of the standalone configurations will write a site.py, but that's for a completely self-contained installation (ie, one which will have no conflicts with another Python installation). I'd also note that, for Windows at least, the path-expanding mechanism created by site.py has not caught on. I've got lots installed, and no site-python, site-packages or sitecustomize. 
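As a rough illustration of the "manifest" idea described above (a config file that selects resources by directory tree plus filters, rather than listing every file), here is a small sketch. The section layout and the option names `tree`, `include` and `exclude` are invented for this example and are not the Installer's actual config format; it uses today's `configparser` module and shell-style patterns only.

```python
import configparser
import fnmatch
import os
import tempfile

# Hypothetical manifest: one section per group of resources.
MANIFEST = """\
[code]
tree = pkg
include = *.py *.txt
exclude = readme*
"""


def collect(config_text, root):
    """Return sorted relative paths under root selected by the manifest text."""
    cfg = configparser.ConfigParser()
    cfg.read_string(config_text)
    selected = set()
    for section in cfg.sections():
        tree = os.path.join(root, cfg.get(section, "tree"))
        include = cfg.get(section, "include", fallback="*").split()
        exclude = cfg.get(section, "exclude", fallback="").split()
        for dirpath, _dirs, files in os.walk(tree):
            for name in files:
                # Keep a file only if some include pattern matches...
                if not any(fnmatch.fnmatch(name, p) for p in include):
                    continue
                # ...and no exclude pattern does.
                if any(fnmatch.fnmatch(name, p) for p in exclude):
                    continue
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                selected.add(rel.replace(os.sep, "/"))
    return sorted(selected)


def demo():
    """Build a throwaway tree and run the manifest against it."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "pkg", "sub"))
        for rel in ("pkg/a.py", "pkg/a.pyc", "pkg/sub/b.py", "pkg/readme.txt"):
            with open(os.path.join(root, *rel.split("/")), "w"):
                pass
        return collect(MANIFEST, root)
```

Filters on full paths or regular expressions, which the message also mentions, could be added the same way as extra options per section.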
- Gordon From guido at CNRI.Reston.VA.US Wed Dec 8 17:23:34 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 08 Dec 1999 11:23:34 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Wed, 08 Dec 1999 11:17:03 EST." <1267453215-32281635@hypernet.com> References: Your message of "Wed, 08 Dec 1999 09:16:07 EST." <1267460464-31845181@hypernet.com> <1267453215-32281635@hypernet.com> Message-ID: <199912081623.LAA04119@eric.cnri.reston.va.us> [me] > > Also, note that editing site.py is a no-no! You can create/edit > > sitecustomize.py, but you should leave site.py alone! [Gordon] > That would work fine. One of the standalone configurations will > write a site.py, but that's for a completely self-contained > installation (ie, one which will have no conflicts with another > Python installation). > > I'd also note that, for Windows at least, the path-expanding > mechanism created by site.py has not caught on. I've got lots > installed, and no site-python, site-packages or sitecustomize. You shouldn't see site-python or site-packages, they only exist on Unix. On Windows, everything is installed in the top Python directory. However you should see .pth files there, which is what site.py looks for. I believe NumPy and PIL use those. --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm at hypernet.com Wed Dec 8 17:55:51 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Wed, 8 Dec 1999 11:55:51 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <199912081623.LAA04119@eric.cnri.reston.va.us> References: Your message of "Wed, 08 Dec 1999 11:17:03 EST." <1267453215-32281635@hypernet.com> Message-ID: <1267450887-32421651@hypernet.com> > [Gordon] > > That would work fine. 
One of the standalone configurations will > > write a site.py, but that's for a completely self-contained > > installation (ie, one which will have no conflicts with another > > Python installation). > > > > I'd also note that, for Windows at least, the path-expanding > > mechanism created by site.py has not caught on. I've got lots > > installed, and no site-python, site-packages or sitecustomize. [Guido] > You shouldn't see site-python or site-packages, they only exist > on Unix. You mean "they only exist _for_ Unix", (site.py looks for them on Windows). I don't like that. For one thing, modulo a few platform differences, the same mechanism should work for multi-user Unix and Windows LAN installations. And single- user Windows (I know, redundant, even on NT) should be a degenerate case of the above. > On Windows, everything is installed in the top Python > directory. However you should see .pth files there, which is > what site.py looks for. I believe NumPy and PIL use those. No NumPy, no PIL, no .pth files. 99% of everything out there just says "unzip this somewhere on your Python path". In this case, Jim Ahlstrom may be right - there are too many options, or at least an insufficiently emphasized "proper" method. Until I worked out my own way of installing stuff, I used to lose a large number of packages whenever I upgraded my Windows Python. Much as I love Mark's stuff (and hesitate to criticize crazy Aussies), I wish there weren't so much special casing here for Windows. And no, I don't have any solutions to this, I'm just griping... - Gordon From guido at CNRI.Reston.VA.US Wed Dec 8 18:07:30 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 08 Dec 1999 12:07:30 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Wed, 08 Dec 1999 11:55:51 EST." <1267450887-32421651@hypernet.com> References: Your message of "Wed, 08 Dec 1999 11:17:03 EST." 
<1267453215-32281635@hypernet.com> <1267450887-32421651@hypernet.com> Message-ID: <199912081707.MAA04242@eric.cnri.reston.va.us> > [Guido] > > You shouldn't see site-python or site-packages, they only exist > > on Unix. [Gordon] > You mean "they only exist _for_ Unix", (site.py looks for them > on Windows). No it doesn't. The code in site.py only adds site-packages and site-python when os.sep is '/'. RTSL. > I don't like that. For one thing, modulo a few > platform differences, the same mechanism should work for > multi-user Unix and Windows LAN installations. And single- > user Windows (I know, redundant, even on NT) should be a > degenerate case of the above. What do you mean by "the same mechanism should work"? The same mechanism for what? Are you talking about sharing the installed files somehow? > > On Windows, everything is installed in the top Python > > directory. However you should see .pth files there, which is > > what site.py looks for. I believe NumPy and PIL use those. > > No NumPy, no PIL, no .pth files. 99% of everything out there > just says "unzip this somewhere on your Python path". Fair enough. Of course I know about .pth files so I unzipped them elsewhere and added a .pth file pointing there... > In this case, Jim Ahlstrom may be right - there are too many > options, or at least an insufficiently emphasized "proper" > method. Until I worked out my own way of installing stuff, I > used to lose a large number of packages whenever I upgraded > my Windows Python. The .pth files are designed for this. Maybe they haven't been explained as well as they should. > Much as I love Mark's stuff (and hesitate to criticize crazy > Aussies), I wish there weren't so much special casing here for > Windows. It's not Mark's fault, it's Microsoft's fault. If you don't do things the way MS wants you to, experienced Windows users will gripe, misunderstand what you do, etc. > And no, I don't have any solutions to this, I'm just griping... Ditto. 
Understanding the problems is half of the solution though. The problems seem pretty complex! --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm at hypernet.com Wed Dec 8 19:25:50 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Wed, 8 Dec 1999 13:25:50 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <199912081707.MAA04242@eric.cnri.reston.va.us> References: Your message of "Wed, 08 Dec 1999 11:55:51 EST." <1267450887-32421651@hypernet.com> Message-ID: <1267445488-32746429@hypernet.com> [Guido] > No it doesn't. The code in site.py only adds site-packages and > site-python when os.sep is '/'. RTSL. Oops. Missed that. > > I don't like that. For one thing, modulo a few > > platform differences, the same mechanism should work for > > multi-user Unix and Windows LAN installations. And single- user > > Windows (I know, redundant, even on NT) should be a degenerate > > case of the above. > > What do you mean by "the same mechanism should work"? The same > mechanism for what? Are you talking about sharing the installed > files somehow? In the above, "mechanism" basically meant that which creates sys.path. Basically, this came up for me because in standalone configurations (my Installer again), I have to take complete control of sys.path. After doing so differently on Windows and Linux, I finally realized that I can do it the same way on both. Which makes me question why they are so different. > The .pth files are designed for this. Maybe they haven't been > explained as well as they should. I'd say "badgered" or "browbeaten" instead of "explained" ;-). > > Much as I love Mark's stuff (and hesitate to criticize crazy > > Aussies), I wish there weren't so much special casing here for > > Windows. > > It's not Mark's fault, it's Microsoft's fault. If you don't do > things the way MS wants you to, experienced Windows users will > gripe, misunderstand what you do, etc. 
Even MS doesn't do things the way MS says they want you to. I find MS users equally divided between those who scream bloody murder if you touch the registry, and those who scream if you don't. It's not like *nixen suffer from an excessive degree of conformity in preferred installation procedures, but somehow Python survives there... > > And no, I don't have any solutions to this, I'm just griping... > > Ditto. Understanding the problems is half of the solution > though. The problems seem pretty complex! Grumpily agreed ;-). - Gordon From jim at interet.com Wed Dec 8 19:33:51 1999 From: jim at interet.com (James C. Ahlstrom) Date: Wed, 08 Dec 1999 13:33:51 -0500 Subject: [Python-Dev] Linux Journal confirms evil rumor References: Your message of "Wed, 08 Dec 1999 11:17:03 EST." <1267453215-32281635@hypernet.com> <1267450887-32421651@hypernet.com> <199912081707.MAA04242@eric.cnri.reston.va.us> Message-ID: <384EA48F.F5190180@interet.com> I finally got around to reading the current Linux Journal (which just keeps getting better and better) and lo! there was a picture of a familiar face I just couldn't quite.... Oh no! Could it be true? I heard rumors but I refused to believe them until now. The glasses are gone! Guido now looks like an investment banker! The sky is falling! Next will probably be a Python 1.6 as a 27 Meg DLL, and a Python IPO. Well, maybe not. Now that I look more closely, he is wearing a black and white and mustard (??MUSTARD) T-shirt which says "You Need Python". At least we ought to make him wear a name tag at IPC8. JimA From fdrake at acm.org Wed Dec 8 19:37:44 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Wed, 8 Dec 1999 13:37:44 -0500 (EST) Subject: [Python-Dev] Linux Journal confirms evil rumor In-Reply-To: <384EA48F.F5190180@interet.com> References: <1267453215-32281635@hypernet.com> <1267450887-32421651@hypernet.com> <199912081707.MAA04242@eric.cnri.reston.va.us> <384EA48F.F5190180@interet.com> Message-ID: <14414.42360.309237.967766@weyr.cnri.reston.va.us> James C. Ahlstrom writes: > Oh no! Could it be true? I heard rumors but I refused to > believe them until now. The glasses are gone! Guido now > looks like an investment banker! The sky is falling! I'm afraid this non-distinctive look was introduced at IPC7... it's too bad we can't tell people Python was invented by the guy with the glasses anymore. > Next will probably be a Python 1.6 as a 27 Meg DLL, and > a Python IPO. Well, maybe not. Now that I look more > closely, he is wearing a black and white and mustard > (??MUSTARD) T-shirt which says "You Need Python". It's really the blue & white & orange IPC7 shirt. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From bwarsaw at cnri.reston.va.us Wed Dec 8 19:41:51 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Wed, 8 Dec 1999 13:41:51 -0500 (EST) Subject: [Python-Dev] Linux Journal confirms evil rumor References: <1267453215-32281635@hypernet.com> <1267450887-32421651@hypernet.com> <199912081707.MAA04242@eric.cnri.reston.va.us> <384EA48F.F5190180@interet.com> Message-ID: <14414.42607.701538.783684@anthem.cnri.reston.va.us> >>>>> "JCA" == James C Ahlstrom writes: JCA> Oh no! Could it be true? I heard rumors but I refused to JCA> believe them until now. The glasses are gone! Guido now JCA> looks like an investment banker! The sky is falling! He's not the only one who's, like, "gone corporate", but I won't mention any names, so as to protect the guilty. 
From jim at digicool.com Wed Dec 8 20:03:42 1999 From: jim at digicool.com (Jim Fulton) Date: Wed, 08 Dec 1999 14:03:42 -0500 Subject: [Python-Dev] Linux Journal confirms evil rumor References: <1267453215-32281635@hypernet.com> <1267450887-32421651@hypernet.com> <199912081707.MAA04242@eric.cnri.reston.va.us> <384EA48F.F5190180@interet.com> <14414.42607.701538.783684@anthem.cnri.reston.va.us> Message-ID: <384EAB8E.EBA595B5@digicool.com> "Barry A. Warsaw" wrote: > > He's not the only one who's, like, "gone corporate", but I won't > mention any names, so as to protect the guilty. OK, Buzz. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From tim_one at email.msn.com Thu Dec 9 06:31:52 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 9 Dec 1999 00:31:52 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <199912081207.HAA00040@eric.cnri.reston.va.us> Message-ID: <000301bf4206$b39e5b80$36a2143f@tim> [Guido] > [Great analysis, Tim!] I beg to differ: it's internally inconsistent and should have identified at least 3 axes and hence at least 8 cases. Still, you got more than you paid for . >> 4) The audience is Python end-users "in general", and the >> product is pure Python. I think this is the most important one >> for Distutils to address, and compilation isn't a part of it. >> So far, though, what Gordon is doing seems more appropriate >> than what Distutils has been up to. I hope his work gets folded >> into this. > I'm not sure what stuff by which Gordon you're referring to. You guessed right! 
> I am only familiar with his installer, which I thought is win32 > only (but I may be mistaken) and is an installer for a whole > application, not just a bunch of modules. Please correct me if > I'm wrong. If it can install a whole app, what makes you suspect it couldn't install just a bunch of modules <0.5 wink>? It started life as Windows-only, and I believe it's been virtually ignored by non-Windows folk because of that. Bad blind spot. It supplies already-working approaches to many of the issues that are still being *talked* about on Distutils (at least archive formats, code to manipulate same, manifest files (how do you tell the tool which files to package?), and transparently bundling a Python interpreter when needed). > But this reminds me of a different issue, which Jim Ahlstrom has > been hammering about before: there's a completely separate set of > cases where what you are distributing is a stand-alone application, > and the target consists of end users who are entirely uninterested > in whether it's written in Python, C or Elvish. I include part of that in my case #4 above, where the app happens to be written in Pure Python -- but the user doesn't have to know that. Gordon is addressing at least that part of it. AFAIK he can't deal with transparently compiling C or exorcising Elvish on the target platform, but if you're just distributing the binaries I expect his work is directly usable already. > (And then there's still the distinction between Win32, Unix or > both.) I vote "both". The world really doesn't need another Win32-only (or Unix-only) installer, archive format, compression format, or distribution model. Jim seems mostly interested in Win32-only to me, and his concerns haven't been about the mechanics of distribution but about how-- regardless of tool --to create a bulletproof Python installation by hook or by crook. 
Last time we went thru this, it was concluded that one couldn't without patching the Python Windows binary with a resource editor (to point to its own infernal <0.5 wink> registry entries). Distutils hasn't talked about that at all (that I've seen, anyway); if there were a less radical approach to that, I suspect Jim would be delighted to use one of the commercial Win32 installation pkgs (and if that's what his customers expect, delighted or not that's what he'll do). > The current distutils tools don't deal with this at all. That's why I said I thought what Gordon is doing seems more appropriate to case #4 than what Distutils has been doing. > I think it should though, Ditto. > and I think its framework is powerful enough to be able to > add this, e.g. as a new "appdist" command. I cordially invite (since Gordon will uncordially browbeat) people to look seriously at what he's done. Best I can tell, for apps that don't need compilation "on the other end", it's mostly "there" already! give-the-man-a-hand-ly y'rs - tim From tim_one at email.msn.com Thu Dec 9 06:52:23 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 9 Dec 1999 00:52:23 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <1267453215-32281635@hypernet.com> Message-ID: <000601bf4209$90a90c80$36a2143f@tim> > http://starship.python.org/crew/gmcm/installer.html Eh? Doesn't work for me. This does: http://starship.python.net/crew/gmcm/distribute.html From tim_one at email.msn.com Thu Dec 9 07:38:54 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 9 Dec 1999 01:38:54 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <199912081707.MAA04242@eric.cnri.reston.va.us> Message-ID: <000701bf4210$10925a40$36a2143f@tim> [Gordon] >> Much as I love Mark's stuff (and hesitate to criticize crazy >> Aussies), I wish there weren't so much special casing here for >> Windows.
[Guido] > It's not Mark's fault, it's Microsoft's fault. If you don't do > things the way MS wants you to, experienced Windows users will > gripe, misunderstand what you do, etc. Something just occurred to me: MS's guidelines aren't arbitrary, they actually have very good reasons. In the case of putting all an app's crucial info in the Registry, it's the only way to allow a site administrator to set policy and site options remotely (an admin can fiddle other machines' registries remotely). This works very well indeed when there's only "one copy" of an app on a machine (or at most one copy "per user"). What just occurred to me is that JimA is concerned with *not* letting any info from a previously-installed Python affect the app he's installing. Similarly, Gordon's Win32 "standalone installer" modifies python.exe and pythonw.exe to use a PYTHONPATH he forces, leaving the registry out of it. Similarly, the woes I've had in trying to sell Python as a general Win32 scripting tool at work mostly boil down to that there's no effortless way to do it that doesn't risk picking up info from-- or forcing info onto --pre-existing or future distinct Python installations (in contrast, Perl "just works" in this respect). IOW, the three of us find getting path info out of the registry intolerable because we are in fact trying to do the opposite of what the registry mechanism was *designed* for: we want perfect isolation, not perfect sharing. This has come up on Python-Help a few times too, in the guise of someone installing a product that in turn installs an older version of Python, which in turn confuses another product that relies on features in a newer version of Python. So while the traditional Windows .ini file (like Unix this-or-that.rc file) model was replaced by the registry for excellent reasons, those reasons don't apply to the way we're using Python! The .ini file model was exactly right for what most of us seem to want to do, and the registry model is exactly wrong. 
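The ".ini file model" argued for above can be sketched as follows: each application carries its own small config file and derives its module search path from that alone, so nothing is shared with (or picked up from) other Python installations on the machine. The section name `[python]`, the option `path`, and the `;` separator are assumptions made for this sketch, not an existing convention.

```python
import configparser
import os


def paths_from_ini(ini_text, app_dir):
    """Resolve the path entries of an app-local ini file against app_dir."""
    cfg = configparser.ConfigParser()
    cfg.read_string(ini_text)
    # A missing section or option simply means "no extra entries".
    raw = cfg.get("python", "path", fallback="")
    entries = [e.strip() for e in raw.split(";") if e.strip()]
    # Entries are relative to the application directory, never machine-wide.
    return [os.path.normpath(os.path.join(app_dir, e)) for e in entries]
```

An application would read the ini file sitting next to its own executable and assign the result to sys.path, which gives the isolation wanted here; the registry model gives the opposite, one shared setting per machine or per user.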
just-thought-i'd-cheer-you-up-ly y'rs - tim From skip at mojam.com Thu Dec 9 08:38:36 1999 From: skip at mojam.com (Skip Montanaro) Date: Thu, 9 Dec 1999 01:38:36 -0600 (CST) Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <000701bf4210$10925a40$36a2143f@tim> References: <199912081707.MAA04242@eric.cnri.reston.va.us> <000701bf4210$10925a40$36a2143f@tim> Message-ID: <14415.23676.775163.786028@dolphin.mojam.com> Tim> So while the traditional Windows .ini file (like Unix Tim> this-or-that.rc file) model was replaced by the registry for Tim> excellent reasons, those reasons don't apply to the way we're using Tim> Python! The .ini file model was exactly right for what most of us Tim> seem to want to do, and the registry model is exactly wrong. Alright! Now I understand what all the hubbub is about! My eyes have mostly been glazing over trying to follow all this Windows registry/path/ini stuff. MS believes that Python is the application. Those of us writing Python programs view those programs as the applications, not the Python interpreter per se. Is there some way that people writing applications in Python can set up registry entries that are specific to their application (e.g. tabnanny.py) instead of only specific to the Python interpreter? Skip Montanaro | http://www.mojam.com/ skip at mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From gmcm at hypernet.com Thu Dec 9 15:17:27 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Thu, 9 Dec 1999 09:17:27 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <000701bf4210$10925a40$36a2143f@tim> References: <199912081707.MAA04242@eric.cnri.reston.va.us> Message-ID: <1267374045-37047016@hypernet.com> [Guido] > > It's not Mark's fault, it's Microsoft's fault. If you don't do > > things the way MS wants you to, experienced Windows users will > > gripe, misunderstand what you do, etc. 
[Tim] > Something just occurred to me: MS's guidelines aren't arbitrary, > they actually have very good reasons. In the case of putting all > an app's crucial info in the Registry, it's the only way to allow > a site administrator to set policy and site options remotely (an > admin can fiddle other machines' registries remotely). This > works very well indeed when there's only "one copy" of an app on > a machine (or at most one copy "per user"). And actually, the business about separate subtrees for the machine's configuration and the user's configuration is pretty clever. MS doesn't explain it well, and it gets misused, but when done right, it's a lot simpler than the maze of .xxxrc files you sometimes find in other OSes. > What just occurred to me is that JimA is concerned with *not* > letting any info from a previously-installed Python affect the > app he's installing. Similarly, Gordon's Win32 "standalone > installer" modifies python.exe and pythonw.exe to use a > PYTHONPATH he forces, leaving the registry out of it. Similarly, > the woes I've had in trying to sell Python as a general Win32 > scripting tool at work mostly boil down to that there's no > effortless way to do it that doesn't risk picking up info from-- > or forcing info onto --pre-existing or future distinct Python > installations (in contrast, Perl "just works" in this respect). In my Linux version, I went to the heart of the matter - getpath.c. It occurs to me that getpath.c might do better to follow a normal bootstrap process - ie, create the absolute minimal sys.path required to go to the next step. Then the rest of what goes on in getpath.c could be written in Python. Maybe that Python code needs to get frozen in (to prevent bozos from destroying an installation by stepping on getpath.py), but it would make it a lot easier to create independent installations, and also reduce the variations between platforms at the C level. (Then again, I've never heard of anyone stepping on exceptions.py.) 
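The two-stage bootstrap suggested above for getpath.c could look roughly like this in Python. This is a sketch of the idea only: the function names, the `lib` directory next to the executable, and the use of PYTHONPATH as the sole stage-2 input are all assumptions, and none of it reflects what getpath.c actually does.

```python
import os


def minimal_path(executable):
    """Stage 1: the smallest sys.path needed to import the stage-2 code."""
    home = os.path.dirname(os.path.abspath(executable))
    return [os.path.join(home, "lib")]


def full_path(executable, environ):
    """Stage 2, written in Python: extend the minimal path from the environment."""
    path = []
    # User-supplied entries come first, with duplicates dropped.
    for entry in environ.get("PYTHONPATH", "").split(os.pathsep):
        if entry and entry not in path:
            path.append(entry)
    # The bootstrap entries always remain available at the end.
    for entry in minimal_path(executable):
        if entry not in path:
            path.append(entry)
    return path
```

Only stage 1 would need to stay in C; a standalone installation could then replace stage 2 with its own policy without touching platform-specific code.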
If some registry manipulation primitives were exposed (say, through ntpath) that would mean that Windows developers could (if they wanted) play by the MS rules with at least the option of not stepping on each other. - Gordon From jim at interet.com Thu Dec 9 16:02:18 1999 From: jim at interet.com (James C. Ahlstrom) Date: Thu, 09 Dec 1999 10:02:18 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000301bf4206$b39e5b80$36a2143f@tim> Message-ID: <384FC47A.BB4DA517@interet.com> Tim Peters wrote: > Jim seems mostly interested in Win32-only to me, and his concerns haven't > been about the mechanics of distribution but about how-- regardless of > tool --to create a bulletproof Python installation by hook or by crook. Not exactly. I am interested in how to create a bullet-proof installation. But I am equally interested in Unix (especially Linux) and dislike the current dichotomy in the code base. Lately I have been more active in distribution via archive files. Part of the solution is an archive file format which is identical on Unix and Windows, and which can hold the Python library and packages as single files. For my own efforts on this see: ftp://ftp.interet.com/pub/pylib.html This is an archive file format similar to Gordon's format, although Gordon's work goes well beyond just file formats. I currently have fifth generation code for this format, and am adding features as suggested by Fredrik Lundh. I hope it gets considered as a candidate for a Python standard format. > Distutils hasn't talked about that at all (that I've seen, anyway); Gordon, Greg Stein and I have discussed file formats before. I think it was on distutils. Anyway that was months ago.
JimA From guido at CNRI.Reston.VA.US Thu Dec 9 17:17:18 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 11:17:18 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Thu, 09 Dec 1999 09:17:27 EST." <1267374045-37047016@hypernet.com> References: <199912081707.MAA04242@eric.cnri.reston.va.us> <1267374045-37047016@hypernet.com> Message-ID: <199912091617.LAA05742@eric.cnri.reston.va.us> > [Guido] > > > It's not Mark's fault, it's Microsoft's fault. If you don't do > > > things the way MS wants you to, experienced Windows users will > > > gripe, misunderstand what you do, etc. > [Tim] > > Something just occurred to me: MS's guidelines aren't arbitrary, > > they actually have very good reasons. In the case of putting all > > an app's crucial info in the Registry, it's the only way to allow > > a site administrator to set policy and site options remotely (an > > admin can fiddle other machines' registries remotely). This > > works very well indeed when there's only "one copy" of an app on > > a machine (or at most one copy "per user"). [Gordon] > And actually, the business about separate subtrees for the > machine's configuration and the user's configuration is pretty > clever. MS doesn't explain it well, and it gets misused, but > when done right, it's a lot simpler than the maze of .xxxrc files > you sometimes find in other OSes. I agree. And I am guilty of not even trying to find MS' explanation -- I just looked in the registry at what other apps did and tried to mimic that (plus what Mark had already done), without really knowing what I was doing. I now know a little better -- see the end of this message. > In my Linux version, I went to the heart of the matter - > getpath.c. It occurs to me that getpath.c might do better to > follow a normal bootstrap process - ie, create the absolute > minimal sys.path required to go to the next step.
Then the > rest of what goes on in getpath.c could be written in Python. > Maybe that Python code needs to get frozen in (to prevent > bozos from destroying an installation by stepping on > getpath.py), but it would make it a lot easier to create > independent installations, and also reduce the variations > between platforms at the C level. (Then again, I've never heard > of anyone stepping on exceptions.py.) Yes, this is exactly what was proposed in the thread on the Big Import Rewrite. > If some registry manipulation primitives were exposed (say, > through ntpath) that would mean that Windows developers > could (if they wanted) play by the MS rules with at least the > option of not stepping on each other. That's a good idea. These functions are already available through Mark's win32api extension -- much of which will eventually (I hope before 1.6 is out!) become part of the core distribution. In the mean time, I've been thinking a bit more about how Python should be using the Windows registry. (It's clear to me that Python should use the registry -- those who disagree can go build their own Python distribution.) The basic ideas of Python's current registry usage are sound: there's a resource built into the DLL which is part of the key into the registry used for all information. The problem lies in which key is used. All versions of Python 1.5.x (1.5, 1.5.1, 1.5.2) use the same key! This is a main cause of trouble, because it means that different versions cannot peacefully live together even if the user installs them into different directories -- they will all use the registry keys of the last version installed. This, in turn, means that someone who writes a Python application that has a dependency on a particular Python version (and which application worth distributing doesn't :-) cannot trust that if a Python installation is present, it is the right one. 
But they also cannot simply bundle the standard installer for the correct Python version with their program, because its installation would overwrite an existing Python application, thus breaking some *other* Python apps that the user might already have installed. (There's a solution for app builders who are willing to do a lot of work -- you can change the registry key resource in the DLL. For example, Alice comes with its own version of Python 1.5.1 and it uses "1.5.1-alice" as its registry key. The Alice installer installs Python in a subdirectory of the Alice installation directory and points the 1.5.1-alice registry entries there. The problem is that this is a lot of work for the average app builder.) I thought a bit about how VB solves this. I think that when you wrap up a VB app, all the support code (mostly a big DLL) is wrapped with it. When the user runs the installer, the DLL is installed (probably in the WINDOWS directory). If a user installs several VB apps built with the same VB version, they all attempt to install the exact same DLL; of course the installers notice this and optimize it away, keeping a reference count. (Ignoring for now the fact that those reference counts don't always work!) If an app built with a different VB version is installed, it has a DLL with a different name, and that is installed separately. Other support files, I presume, are dealt with in much the same way. Voila, there's the theory. How can we do something similar for Python? An app written in Python should need to install only three or four files: - a driver EXE to start the app - a copy of the Python DLL - the Python library in an archive - the app code in an archive The latter two could be combined into a single archive, but I propose that we use two archives so that the DLL and the Python library archive can be shared between installations of independent Python apps as long as they use the exact same Python version and don't need additional 3rd party packages.
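As an illustration of the two-archive layout, here is a minimal sketch that puts a "library" archive and an "app" archive on sys.path. It relies on the zip import support in present-day Python, which did not exist when this message was written. The archive names pylib.zip and app.zip and the module names are made up for the example, and the archives hold source files rather than PYC files to keep it short.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def build_archive(path, modules):
    """Write {module_name: source} into a zip archive at `path`."""
    with zipfile.ZipFile(path, "w") as zf:
        for name, source in modules.items():
            zf.writestr(name + ".py", source)


def run_demo():
    workdir = tempfile.mkdtemp()
    lib_zip = os.path.join(workdir, "pylib.zip")  # shared "library" archive
    app_zip = os.path.join(workdir, "app.zip")    # per-application archive

    build_archive(lib_zip, {
        "demo_sharedlib": "def greet(who):\n    return 'hello ' + who\n",
    })
    build_archive(app_zip, {
        "demo_appmain": (
            "import demo_sharedlib\n\n"
            "def main():\n"
            "    return demo_sharedlib.greet('app')\n"
        ),
    })

    # The driver's only job: put both archives on the path, then import the app.
    sys.path[:0] = [app_zip, lib_zip]
    importlib.invalidate_caches()
    app = importlib.import_module("demo_appmain")
    return app.main()


if __name__ == "__main__":
    print(run_demo())
```

Keeping the library archive separate means a second app could reuse the same pylib.zip and ship only its own app.zip, which is the sharing argument made above.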
(I believe that Jim A's proposal combines the archives with the EXE and the DLL, reducing the number of files to two. That's fine too.) Is there a use for the registry here at all? Maybe not. (I notice that VB seems to have a single registry entry, pointing to a DLL; all other VB files also seem to live there.) Complications: - Some apps may need a custom extension module, which has to be installed as a PYD file. So it seems that there needs to be a directory per app, and perhaps per version of the app (if the app distributor cares). - Some apps need other, non-pyc files (e.g. data tables or help files); it would be handy if these could be stored in the archives as well. - Some standard extension modules are in their own PYD files; these also need to be installed. They aren't typically marked with a version, so perhaps a path directory per version of Python (if not per installed app) is wise. - How to distribute an app that needs 3rd party stuff, e.g. Tcl/Tk, or PIL, or NumPy? Their Python code can easily be wrapped up in another archive with a standard name incorporating a version number; but the required PYD and DLL files are a separate story. (E.g. for Tkinter, you need _tkinter.pyd which links against tcl80.dll.) Basically the same solution as for standard PYD files can work; the needed DLL files can be installed either systemwide (if they have a reliable version number in their name, like tcl80.dll) or in the per-app or per-package directory (like NumPy). - Presumably, the archives will contain PYC files only. This means that tracebacks will not show source code, only line numbers. For Jim A, this is probably exactly what he wants (if the user gets a traceback, his "robust app" has miserably failed, and he takes it in pride that this doesn't happen). But for some others, access to the sources could be essential. 
For example, I might want to distribute IDLE using this mechanism; users of IDLE who are curious about the standard library (or about IDLE itself) should be able to open the source for an arbitrary module (and maybe even edit it, although that's not a priority and perhaps should even be discouraged). Library source access is an important feature of the IDLE debugger as well. A way out for IDLE is to install a classic distribution of the Python library sources, into the filesystem at an IDLE specific location. Other apps, with only the need for source code in tracebacks, might choose to have the PY files in the archives sitting next to the PYC files, and somehow the traceback mechanism should be accessing the archive to get a hold of the source. And yes, I realize that Jim A's latest offering solves most of these problems to a large extent -- well done. (Jim, would you care to comment on the issues that you don't address? Will you address them in a future version?) Final notes: There are two different problems here. One is how to distribute Python apps robustly to end users who don't particularly care about Python. This is Jim A's problem (and he has a solution that works for him). In general the solutions here try to isolate the installed app from other Python installations. I'm proposing that at least the DLL and the Python library archive can probably be shared between apps without reducing robustness if we keep track more carefully of version numbers. The other problem is how to distribute packages of Python and extension modules for use by Python users. These typically need to drop into some existing Python installation. This is Paul Dubois' problem with NumPy (amongst others) and is the current focus of the distutils SIG. However I believe that there could be a lot of common infrastructure that would help us create better solutions for both problems. For package distribution, common infrastructure (a.k.a. standards) is essential.
For app distribution, common infrastructure isn't so important (since the solutions strive for total isolation, there's no problem if different apps use different solutions). However, this changes when app creators want to distribute robust self-sufficient apps that use 3rd party packages -- then the 3rd party packages must allow being packaged up using the app distribution creator of choice. Solving this compound problem (creating package distributions that can be redistributed easily as part of robust Python app distributions) should be an important goal for the infrastructure we're building here. The Big Import Rewrite ought to add this to its list of objectives if it isn't already on it. My guess is that the solution for this compound problem will increase the dependency of app distribution tools on the package distribution infrastructure; which to me seems like a Good Thing because it would lead to more code sharing. --Guido van Rossum (home page: http://www.python.org/~guido/) From jim at interet.com Thu Dec 9 17:24:40 1999 From: jim at interet.com (James C. Ahlstrom) Date: Thu, 09 Dec 1999 11:24:40 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000701bf4210$10925a40$36a2143f@tim> Message-ID: <384FD7C8.12832BF1@interet.com> Tim Peters wrote: > Something just occurred to me: MS's guidelines aren't arbitrary, they > actually have very good reasons. In the case of putting all an app's > crucial info in the Registry, it's the only way to allow a site > administrator to set policy and site options remotely (an admin can fiddle > other machines' registries remotely). This works very well indeed when > there's only "one copy" of an app on a machine (or at most one copy "per > user"). The registry is still a bad idea because it lumps critical and app data into single files and brings up the ugly problem of protecting individual registry entries instead of just files.
Microsoft should have put all app config into the app directory and provided for remote admin of that. But that is not really your point (just ranting about the registry again). > IOW, the three of us find getting path info out of the registry intolerable > because we are in fact trying to do the opposite of what the registry > mechanism was *designed* for: we want perfect isolation, not perfect > sharing. > > This has come up on Python-Help a few times too, in the guise of someone > installing a product that in turn installs an older version of Python, which > in turn confuses another product that relies on features in a newer version > of Python. Or, in other words, no isolation is possible if critical info depends on global data like PYTHONPATH or a _common_ registry entry. We could have different registry entries, but this is confusing and not documented. I think we can solve this with archive files in a way compatible with Unix without going off on a Windows-only wavelength. If the archive file contains everything, and it is in the dir of the app, and the app looks there and finds it, then it Just Works. See also my reply to Skip. JimA From akuchlin at mems-exchange.org Thu Dec 9 17:32:08 1999 From: akuchlin at mems-exchange.org (Andrew M. Kuchling) Date: Thu, 9 Dec 1999 11:32:08 -0500 (EST) Subject: [Python-Dev] Missing POSIX functions: the list Message-ID: <199912091632.LAA09236@amarok.cnri.reston.va.us> After poking around in the O'Reilly POSIX book, here's a list of POSIX functions that don't seem to be available in Python. Not all of them seem worth supporting. Ironically, Greg Ward's daemonize() Perl subroutine, which started me on this, doesn't actually seem to need anything that Python doesn't have. I'm looking for corrections to the list; are there other POSIX functions I've missed, or are some of them actually in Python? I think implementing most of these functions is straightforward, with the exception of opendir/readdir/closedir. Worth adding? 
============= opendir(), readdir(), closedir() -- most of their functionality is available through os.listdir(), but it might be useful to have a direct interface. Downside is that this would require a new extension type for the C DIR struct. My (lazy) inclination is to not bother. Worth adding: ============= abort() -- used in Py_FatalError(), but not accessible to Python code ctermid(), ctermid_r() -- returns the terminal pathname -- probably just add ctermid(), but use ctermid_r() for thread-safety fpathconf(fd, name) -- Get configuration limit for a file -- would need constants from unistd.h getlogin() -- returns user's login name -- could do something similar with pwd.getpwuid( os.getuid() )[0], but getlogin() apparently looks in utmp getgroups(gidsetsize, grouplist) -- Gets supplementary group IDs pathconf(path, name) -- Gets config variables for a path -- would need constants from unistd.h sysconf(int name) -- Gets system configuration information -- would need constants from unistd.h Not worth adding: ================= clearerr() -- looks like fileobjects call clearerr() before raising errors cuserid() -- returns user's login name -- ORA book says "Do not use this function" -- removed in 1990 POSIX difftime -- seems only required in C "because no addition properties are defined for time_t" (Solaris man page) tmpfile(), tmpnam() -- Create temp file, generate temp filename -- Similar functionality available in tempfile.py mblen(), mbstowcs(), mbtowc(), wcstombs(), wctomb() -- Multi-byte character functions: -- Don't bother; wait for the Unicode type. -- A.M. Kuchling http://starship.python.net/crew/amk/ I'm sorry I became abusive just now ... calling you worms... I was just speaking relatively, you understand. -- Dekko, in ZOT! 
#3 From jcw at equi4.com Thu Dec 9 17:38:13 1999 From: jcw at equi4.com (Jean-Claude Wippler) Date: Thu, 09 Dec 1999 17:38:13 +0100 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> Message-ID: <384FDAF5.C25C447C@equi4.com> "James C. Ahlstrom" wrote: [...] > ftp://ftp.interet.com/pub/pylib.html Ouch - what's wrong with zip archives? There are utilities to convert to/from zip, to re-pack, to mount zip transparently so it's entries look like regular files, FTP servers, etc. Both Java (jar) and Tcl (Jan Nijtman's "Wrap") have adopted this format. Zips would seem natural with JPython. And suppose that scripting ever starts to consolidate to a common scripting kernel (yah, well), do you really want a system which is closing all doors to cross-fertilization? Zip has an advantage over .tar.gz in that its table of contents is available without having to decompress the whole kaboodle. Your format has no checksum, which for deployment and long-term storage can be important. If you want a marshalled TOC, then why not add a manifest entry for it, sort of like what ranlib does with ar? You designed the format so archives can be concatenated without any tool (other than "cat"), but this works just as well with zip files, as the Tcl Wrap approach demonstrates. Allow me to very, very loosely paraphrase Guido here: sure, everyone can design an archive format, but they are likely to make the same mistakes all over again - so why not adopt a format which is tried and tested? With all due respect - I sincerely hope you will reconsider and alter your code to work with zip files. It's probably a small adjustment? Unless your *intent* is to create a diverging standard, of course... -- Jean-Claude From jim at interet.com Thu Dec 9 17:46:35 1999 From: jim at interet.com (James C. 
Ahlstrom) Date: Thu, 09 Dec 1999 11:46:35 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <199912081707.MAA04242@eric.cnri.reston.va.us> <000701bf4210$10925a40$36a2143f@tim> <14415.23676.775163.786028@dolphin.mojam.com> Message-ID: <384FDCEB.2226C1C1@interet.com> Skip Montanaro wrote: > MS believes that Python is the application. Those of us writing > Python programs view those programs as the applications, not the Python > interpreter per se. I think this is a good point. Windows app programmers (mostly) view Python as part of their app and try to install it in their app directory. Unix installs Python as a system app in multiple versions and users use PATH to pick a version. Unix users view the Python interpreter as a system service which is needed for running their app. I think this is because a Windows app is a visual program, and the Python release compiles to a console app (not really a visual program). So all (?most) Windows Python apps are custom mains with Python as a component, but the stock python.exe is not the main. This makes it difficult to document a way to install Python in the Unix fashion, since all apps need their own binary main and python15.dll is the only thing in common. IMHO archive files can solve this a lot more simply. JimA From guido at CNRI.Reston.VA.US Thu Dec 9 17:55:40 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 11:55:40 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Thu, 09 Dec 1999 17:38:13 +0100." <384FDAF5.C25C447C@equi4.com> References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> Message-ID: <199912091655.LAA05928@eric.cnri.reston.va.us> > "James C. Ahlstrom" wrote: > > [...] > > ftp://ftp.interet.com/pub/pylib.html Jean-Claude Wippler replied: > Ouch - what's wrong with zip archives?
> > There are utilities to convert to/from zip, to re-pack, to mount zip > transparently so it's entries look like regular files, FTP servers, etc. > > Both Java (jar) and Tcl (Jan Nijtman's "Wrap") have adopted this format. > > Zips would seem natural with JPython. And suppose that scripting ever > starts to consolidate to a common scripting kernel (yah, well), do you > really want a system which is closing all doors to cross-fertilization? > > Zip has an advantage over .tar.gz in that its table of contents is > available without having to decompress the whole kaboodle. > > Your format has no checksum, which for deployment and long-term storage > can be important. > > If you want a marshalled TOC, then why not add a manifest entry for it, > sort of like what ranlib does with ar? > > You designed the format so archives can be concatenated without any tool > (other than "cat"), but this works just as well with zip files, as the > Tcl Wrap approach demonstrates. > > Allow me to very, very loosely paraphrase Guido here: sure, everyone can > design an archive format, but they are likely to make the same mistakes > all over again - so why not adopt a format which is tried and tested? > > With all due respect - I sincerely hope you will reconsider and alter > your code to work with zip files. It's probably a small adjustment? > > Unless your *intent* is to create a diverging standard, of course... Exactly my sentiments. We have rough Python code to deal with zip files; it's very rough because we got kind of carried away adding features and ended up with spaghetti code :-( But it's working code nevertheless and we're offering it up for anyone in this group to clean up (we could do that ourselves but it's not high on our current priority list). I don't know anything about Tcl Wrap. I do know a great deal about the ZIP format, but apparently I missed the concatenation feature. How does this work? Does that work for all zip tools, or just for the ZIP reader in Wrap? 
(I looked up how Jim A does it -- his central directory at the end of the file contains the total size of the data covered by that directory, so he seeks back to the beginning of it and sees if another magic number precedes it; and so on. Very simple.) I quickly looked at the Wrap page; it shows how to access data files stored in the archive. Question: does the wrap::open code go out to the regular filesystem if it finds there's no wrap archive? That would be handy so you can test the code in its unwrapped form without change. Python needs this too. --Guido van Rossum (home page: http://www.python.org/~guido/) From gward at cnri.reston.va.us Thu Dec 9 18:12:00 1999 From: gward at cnri.reston.va.us (Greg Ward) Date: Thu, 9 Dec 1999 12:12:00 -0500 Subject: [Python-Dev] Missing POSIX functions: the list In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Thu, Dec 09, 1999 at 11:32:08AM -0500 References: <199912091632.LAA09236@amarok.cnri.reston.va.us> Message-ID: <19991209121159.B20179@cnri.reston.va.us> On 09 December 1999, Andrew M. Kuchling said: > After poking around in the O'Reilly POSIX book, here's a list of POSIX > functions that don't seem to be available in Python. Not all of them > seem worth supporting. Ironically, Greg Ward's daemonize() Perl > subroutine, which started me on this, doesn't actually seem to need > anything that Python doesn't have. I think I already pointed this your way, but don't forget the man page for Perl's POSIX module: "perldoc POSIX". I suspect POSIX functions that don't make sense in Perl also don't make sense in Python. I agree with all your assessments about what's worth adding and what's not, and that {close,read,open}dir() are questionable and probably not worth the bother. Random thoughts: > abort() -- used in Py_FatalError(), but not accessible to Python code Would this do the same as in C, ie. terminate the process and dump core? 
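For reference, present-day Python does expose this call as os.abort(), which was not available when this was written. A small sketch, assuming a POSIX system, that runs it in a child interpreter so the parent can observe the result. Whether a core file is actually written depends on the system's core dump limits, so the example only checks how the child died.

```python
import signal
import subprocess
import sys


def abort_in_child():
    """Run os.abort() in a child interpreter and return the child's exit status."""
    proc = subprocess.run(
        [sys.executable, "-c", "import os; os.abort()"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return proc.returncode


if __name__ == "__main__":
    status = abort_in_child()
    # On POSIX, a negative status means the child was killed by that signal number.
    print("killed by SIGABRT:", status == -signal.SIGABRT)
```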
> getlogin() -- returns user's login name > -- could do something similar with pwd.getpwuid( os.getuid() )[0], but > getlogin() apparently looks in utmp With a documentation proviso that utmp is very old-fashioned, and you really should do the getuid() thing unless you definitely want to get the login ID from utmp. Perhaps an alternate "getlogin" (different name?) that does the getuid() thing could be provided. Greg From guido at CNRI.Reston.VA.US Thu Dec 9 18:16:03 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 12:16:03 -0500 Subject: [Python-Dev] Missing POSIX functions: the list In-Reply-To: Your message of "Thu, 09 Dec 1999 12:12:00 EST." <19991209121159.B20179@cnri.reston.va.us> References: <199912091632.LAA09236@amarok.cnri.reston.va.us> <19991209121159.B20179@cnri.reston.va.us> Message-ID: <199912091716.MAA06063@eric.cnri.reston.va.us> > > getlogin() -- returns user's login name > > -- could do something similar with pwd.getpwuid( os.getuid() )[0], but > > getlogin() apparently looks in utmp > > With a documentation proviso that utmp is very old-fashioned, and you > really should do the getuid() thing unless you definitely want to get > the login ID from utmp. Perhaps an alternate "getlogin" (different > name?) that does the getuid() thing could be provided. There's the getpass module which has a getuser() function that looks in various env vars and if all else fails uses getuid() and pwd. If the goal is to get the user ID without being fooled, using os.getuid() or os.geteuid() directly seems to be the right thing to do; I don't see the need for a shorthand for pwd.getpwuid(os.getuid())[0] (which is what getuser() uses). 
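The difference between the two lookups can be sketched as follows, assuming a Unix system (the pwd module is not available elsewhere). The helper names login_from_uid and login_from_env_or_uid are made up for the example; the point is only that getpass.getuser() can be steered by environment variables while the uid-based lookup cannot.

```python
import getpass
import os
import pwd  # Unix only


def login_from_uid():
    """Name from the password database for the real uid; env vars cannot override it."""
    try:
        return pwd.getpwuid(os.getuid())[0]
    except KeyError:
        return None  # this uid has no entry in the password database


def login_from_env_or_uid():
    """getpass.getuser() checks LOGNAME, USER, etc. first, then falls back to pwd."""
    try:
        return getpass.getuser()
    except (KeyError, OSError):
        return None


if __name__ == "__main__":
    print("from uid:", login_from_uid())
    print("from env or uid:", login_from_env_or_uid())
```

If the goal is not to be fooled, the first form is the one to use; the second is a convenience for friendly contexts such as default prompts.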
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido at CNRI.Reston.VA.US Thu Dec 9 18:18:10 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 12:18:10 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Thu, 09 Dec 1999 10:02:18 EST." <384FC47A.BB4DA517@interet.com> References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> Message-ID: <199912091718.MAA06087@eric.cnri.reston.va.us> [Jim A] > Lately I have been more active in distribution via archive files. > Part of the solution is an archive file format which is identical on > Unix and Windows, and which can hold the Python library and packages > as single files. For my own efforts on this see: > > ftp://ftp.interet.com/pub/pylib.html Apart from agreeing with Jean-Claude's rant about inventing a new archive format, I think this is a good proposal because it is very clear about the problem it tries to solve and doesn't get distracted by other issues. I also commend Jim for building upon Greg Stein's imputil (like Gordon did). I wish I could present a solution this simple as The Standard Way, but (as explained in my long post earlier today) there just are so many wrinkles that I'd rather hold out for the Right Solution... But I've taken good notice of Jim's solution. --Guido van Rossum (home page: http://www.python.org/~guido/) From beazley at cs.uchicago.edu Thu Dec 9 18:16:57 1999 From: beazley at cs.uchicago.edu (David Beazley) Date: Thu, 9 Dec 1999 11:16:57 -0600 (CST) Subject: [Python-Dev] Missing POSIX functions: the list References: <199912091632.LAA09236@amarok.cnri.reston.va.us> <19991209121159.B20179@cnri.reston.va.us> Message-ID: <199912091716.LAA15624@gargoyle.cs.uchicago.edu> Greg Ward writes: > > I think I already pointed this your way, but don't forget the man page > for Perl's POSIX module: "perldoc POSIX". 
I suspect POSIX functions > that don't make sense in Perl also don't make sense in Python. > > I agree with all your assessments about what's worth adding and what's > not, and that {close,read,open}dir() are questionable and probably not > worth the bother. Random thoughts: > I disagree. I think that the POSIX module should strive to be as complete as possible--even if certain functions are closely related to other functionality in the library (tmpfile for instance). I suspect that this sort of thing is probably the cause of the missing functionality in the current library (as in, "why would anyone want to do that?" when in fact there may be a perfectly good reason in certain situations). > > abort() -- used in Py_FatalError(), but not accessible to Python code > > Would this do the same as in C, ie. terminate the process and dump core? > Sure, why not? This might be a useful thing to do every so often---when trying to figure out what's wrong with a C extension module for instance. Cheers, Dave From jim at interet.com Thu Dec 9 18:43:57 1999 From: jim at interet.com (James C. Ahlstrom) Date: Thu, 09 Dec 1999 12:43:57 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> Message-ID: <384FEA5D.A07F23EC@interet.com> Jean-Claude Wippler wrote: > Ouch - what's wrong with zip archives? Thanks very much for looking over the format. In general Zip archives store whole branches of a file system. A Python ./Lib zip archive would contain: N:/python/Python-1.5.2/Lib/string.pyc N:/python/Python-1.5.2/Lib/os.pyc N:/python/Python-1.5.2/Lib/copy.pyc N:/python/Python-1.5.2/Lib/test/testall.pyc Zip archives are isomorphic to branches of a file system. That means there must be a sys.path for each zip archive file. How would this be specified? The archive format stores modules as dotted names, just as they appear in the import statement.
The search path is "." in every archive file by definition. The import statement "import foo" just results in a dictionary lookup for key "foo", not a search through a zip directory along a local search path for "foo.something" where "something" can be pyc, pyo, py, etc. The intent was to link the archives to the import statement, not re-create a directory tree. It borrowed this feature from the archive formats of Greg and Gordon. > There are utilities to convert to/from zip, to re-pack, to mount zip > transparently so it's entries look like regular files, FTP servers, etc. Basic operations (to, from, repack) are easy in Python. > Both Java (jar) and Tcl (Jan Nijtman's "Wrap") have adopted this format. Hmmm.... > Your format has no checksum, which for deployment and long-term storage > can be important. Actually the pylib.py "dir()" method reads all *.pyc with marshal, and I am depending on marshal to object to bad data and also out-of-date magic numbers. But this is a good point. > If you want a marshalled TOC, then why not add a manifest entry for it, > sort of like what ranlib does with ar? Sorry, I don't understand. Please explain. > You designed the format so archives can be concatenated without any tool > (other than "cat"), but this works just as well with zip files, as the > Tcl Wrap approach demonstrates. Are you saying that cat zip1.zip zip2.zip > myzip.zip works? An important feature is the ability to concatenate to a binary: cat python.exe zip1.zip > myapp.exe Searching for this isn't fast unless magic numbers are at the end. Are zip files recognizable from the end (I don't know)? > Allow me to very, very loosely paraphrase Guido here: sure, everyone can > design an archive format, but they are likely to make the same mistakes > all over again - so why not adopt a format which is tried and tested? > > With all due respect - I sincerely hope you will reconsider and alter > your code to work with zip files. It's probably a small adjustment? 
> > Unless your *intent* is to create a diverging standard, of course... The intent is to create a standard but not a diverging standard. Are there any zip experts out there? Can zip files satisfy all the design requirements I listed in pylib.html? Is there zip code available? All my code is in Python. JimA From jcw at equi4.com Thu Dec 9 18:57:33 1999 From: jcw at equi4.com (Jean-Claude Wippler) Date: Thu, 09 Dec 1999 18:57:33 +0100 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us> Message-ID: <384FED8D.3C535D38@equi4.com> Guido van Rossum wrote: > > [... my not-really-meant-as-rant about adopting zip as format ...] > [zip concatenation feature] > How does this work? Does that work for all zip tools, or just for the > ZIP reader in Wrap? (I looked up how Jim A does it -- his central > directory at the end of the file contains the total size of the data > covered by that directory, so he seeks back to the beginning of it and > sees if another magic number precedes it; and so on. Very simple.) Same for Wrap. Standard tools would not see the preceding ZIP groups. In terms of maintenance, I'd avoid this trick. I merely wanted to point out that zip archives can be stacked, if the reader is set up to it. > Question: does the wrap::open code go out to the regular filesystem > if it finds there's no wrap archive? That would be handy so you can > test the code in its unwrapped form without change. IIRC, Wrap overrides "open" for embedded entries as "file.zip/abc.py". There's more being developed in this area: a "virtual file system" which lets you mount archives and such (VFS by Matt Newman, mentioned with his permission), so that the file-system model can be extended to navigate into a lot more things than real file systems. 
Andrew Kuchling's post hints at another tangent: opendir/readdir is of course simply an enumeration. There's a lot of "genericity" lurking in scanning across file systems, trees, networks, and resources in general. The filesystem <-> OO dichotomy needs a review. > Python needs this too. Concepts like these have a lot to offer - and would make even more sense if they were done in a way which benefits multiple scripting languages. Feel free to reply by email if you ever want to further discuss this. -- Jean-Claude From fdrake at acm.org Thu Dec 9 19:10:44 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu, 9 Dec 1999 13:10:44 -0500 (EST) Subject: [Python-Dev] Missing POSIX functions: the list In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us> References: <199912091632.LAA09236@amarok.cnri.reston.va.us> Message-ID: <14415.61604.415084.520092@weyr.cnri.reston.va.us> Andrew M. Kuchling writes: > After poking around in the O'Reilly POSIX book, here's a list of POSIX > functions that don't seem to be available in Python. Not all of them > seem worth supporting. Ironically, Greg Ward's daemonize() Perl I think your assessment is reasonable. I looked at posixmodule.c and note also that the functions use PyArg_Parse() and PyArg_NoArgs() instead of using PyArg_ParseTuple(). The advantage of PyArg_ParseTuple() is that the name of the function can be specified for inclusion in TypeError messages when the arguments are not of the right type. I'm doing some work to correct this now. I've also added ctermid(), and will try to add at least a few more before I check in the changes. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From bwarsaw at cnri.reston.va.us Thu Dec 9 19:17:35 1999 From: bwarsaw at cnri.reston.va.us (Barry A. 
Warsaw)
Date: Thu, 9 Dec 1999 13:17:35 -0500 (EST)
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us> <384FED8D.3C535D38@equi4.com>
Message-ID: <14415.62015.856931.750279@anthem.cnri.reston.va.us>

>>>>> "JW" == Jean-Claude Wippler writes:

JW> Same for Wrap. Standard tools would not see the preceding ZIP
JW> groups.

JW> In terms of maintenance, I'd avoid this trick. I merely
JW> wanted to point out that zip archives can be stacked, if the
JW> reader is set up to it.

I agree. I can't recall the details now, but I had a lot of problems with zip concatenation in JPython. I think at least some of the older Java tools for grokking zips don't work with concatenation.

-Barry

From guido at CNRI.Reston.VA.US Thu Dec 9 19:21:42 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 13:21:42 -0500
Subject: [Python-Dev] Virtual filesystem APIs
In-Reply-To: Your message of "Thu, 09 Dec 1999 18:57:33 +0100." <384FED8D.3C535D38@equi4.com>
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us> <384FED8D.3C535D38@equi4.com>
Message-ID: <199912091821.NAA06209@eric.cnri.reston.va.us>

Jean-Claude Wippler:
> There's more being developed in this area: a "virtual file system" which
> lets you mount archives and such (VFS by Matt Newman, mentioned with his
> permission), so that the file-system model can be extended to navigate
> into a lot more things than real file systems.

I agree. We have experimented with this a bunch in the Knowbot software, where we have some code that wants to look at a "filesystem" but could be talking to some kind of filesystem emulation across an RPC connection or alternatively could be accessing a zip file.
Our conclusion is that a convenient interface is modeled after (a subset of) the os and os.path functionality. In fact, the only thing you would need to add to the os module would be a function to open a file object; I've proposed to add os.fopen() as an alias for the built-in open(). The idea that you could mount one VFS inside another is nice, although I'm not sure how practical it is. For one thing, in our fs code, os.path.sep and friends (e.g. os.path.normcase behavior) were set per filesystem; what would happen if you mounted a Unix filesystem in an NT tree? Doing the translations is hard too; e.g. on a Mac fs, the separator is ':' and a '/' can be part of a filename -- do you simply swap them? What if a Mac file has both '/' and '\' and you mount it on a Windows FS? I'd rather stay away from this. On the other hand the VFS concept could be used as a totally different solution to the sys.importers vs. sys.path > Andrew Kuchling's post hints at another tangent: opendir/readdir is of > course simply an enumeration. There's a lot of "genericity" lurking in > scanning across file systems, trees, networks, and resources in general. I'd still rather see listdir() (which our sample virtual FS API supported). I don't think it necessarily makes sense to do this on a more generic basis -- other trees and graphs have sufficiently different semantics that using a FS like API doesn't necessarily cut it. Take for example the Windows registry -- looks a lot like a filesystem, doesn't it? Yet it has one fundamental property that a typical FS doesn't: directory nodes can have data *and* children... I've written a tree widget and found that it's remarkably hard to come up with a workable API to talk to trees *in general*. Trees are a universal concept, but code sharing is still elusive... Perhaps because the concept is so simple? > The filesystem <-> OO dichotomy needs a review. I think that my proposal above should cover this. 
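A minimal sketch of what such an interface might look like, assuming a zip archive as the backing store and using the present-day zipfile module. The class name ZipFilesystem and its methods are hypothetical illustrations modeled on a subset of os and os.path; they are not the Knowbot code and not an existing library API. The fopen() method stands in for the proposed os.fopen().

```python
import io
import posixpath
import zipfile


class ZipFilesystem:
    """Hypothetical read-only filesystem view of a zip archive.

    Method names mirror a small subset of os / os.path; this is an
    illustration only, not a standard-library interface.
    """

    sep = "/"  # the separator is a property of each filesystem object

    def __init__(self, archive):
        self._zip = zipfile.ZipFile(archive)
        # Keep only real members; explicit directory entries end in "/".
        self._files = {n for n in self._zip.namelist() if not n.endswith("/")}

    def _norm(self, path):
        # Collapse "a//b", "./a", trailing slashes; "" means the archive root.
        path = posixpath.normpath(path).strip("/")
        return "" if path == "." else path

    def isfile(self, path):
        return self._norm(path) in self._files

    def isdir(self, path):
        prefix = self._norm(path)
        if prefix == "":
            return True
        return any(name.startswith(prefix + "/") for name in self._files)

    def exists(self, path):
        return self.isfile(path) or self.isdir(path)

    def listdir(self, path=""):
        prefix = self._norm(path)
        if not self.isdir(prefix):
            raise FileNotFoundError(path)
        start = prefix + "/" if prefix else ""
        # First path component below the requested directory, de-duplicated.
        entries = {n[len(start):].split("/", 1)[0]
                   for n in self._files if n.startswith(start)}
        return sorted(entries)

    def fopen(self, path):
        """Return a binary file object for a member of the archive."""
        return self._zip.open(self._norm(path))


def build_demo_archive():
    """Create a small in-memory archive so the sketch is self-contained."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("string.py", "# string module\n")
        archive.writestr("test/__init__.py", "")
        archive.writestr("test/pystone.py", "# benchmark\n")
    buffer.seek(0)
    return buffer


fs = ZipFilesystem(build_demo_archive())
print(fs.listdir(""))
print(fs.listdir("test"))
print(fs.fopen("string.py").read())
```

Code written against listdir(), isdir() and fopen() on such an object would not need to know whether it is talking to a real directory, an RPC emulation, or an archive, which is the point of keeping the interface close to os and os.path.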
(We looked briefly at doing a similar thing for Java, and found that it's actually harder there -- they have all these nice objects representing paths, but it's not easily subclassable to represent paths in some virtual filesystem.)

> Concepts like these have a lot to offer - and would make even more sense
> if they were done in a way which benefits multiple scripting languages.
> Feel free to reply by email if you ever want to further discuss this.

I see only very little hope for this point of view, but I will refrain from commenting more.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From gmcm at hypernet.com Thu Dec 9 19:23:14 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Thu, 9 Dec 1999 13:23:14 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <384FEA5D.A07F23EC@interet.com>
Message-ID: <1267359311-37934097@hypernet.com>

James C. Ahlstrom wrote:
> Jean-Claude Wippler wrote:
>
> > Ouch - what's wrong with zip archives?

> In general Zip archives store whole branches of a file
> system.

> The archive format stores modules as dotted names, just as they
> appear in the import statement. The search path is "." in every
> archive file by definition. The import statement "import foo"
> just results in a dictionary lookup for key "foo", not a search
> through a zip directory along a local search path for
> "foo.something" where "something" can be pyc, pyo, py, etc.
>
> The intent was to link the archives to the import statement, not
> re-create a directory tree. It borrowed this feature from the
> archive formats of Greg and Gordon.

As I've stated before, I have 2 archive formats. This may seem a needless complication, but my suspicion is that sooner or later, people will want 2 different kinds.
One is a .pyz format, which corresponds closely to Jim's .pyl format (with a number of minor differences: it's compressed, the archive as a whole has the Python magic number, instead of each entry, and it's not designed for concatenation).

The other is like a zip, and probably should be zip format. It's designed to hold _anything_, and can be manipulated from C and from Python. It can be concatenated and / or embedded (and the inner one opened without extraction). Its table of contents is more file-system like. Importing from one is slower, but that's not really what it's for. It's for packaging up arbitrary resources. Like .pyz's, or Tcl/Tk for Tkinter apps, or configuration files.

Jim is correct that a good importer (which can say "No, it's not mine" as quickly as possible) is better satisfied by a simple dictionary lookup than fooling with file extensions and directories (virtual or real).

> > If you want a marshalled TOC, then why not add a manifest entry
> > for it, sort of like what ranlib does with ar?
>
> Sorry, I don't understand. Please explain.

The table of contents is just another entry.

> An important feature is the ability to concatenate to a binary:
>     cat python.exe zip1.zip > myapp.exe
> Searching for this isn't fast unless magic numbers are at the
> end. Are zip files recognizable from the end (I don't know)?

Where do you think we got this idea?

> Are there any zip experts out there? Can zip files satisfy all
> the design requirements I listed in pylib.html? Is there zip
> code available? All my code is in Python.

Hmm. My bookmark appears to be dead (I was there not long ago):
http://www.cubic.org/source/archive/fileform/packers/appnote.txt

There have been several references on this list to Guido et al having some Python / zip code.
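To make the "recognizable from the end" point concrete, here is a minimal sketch, assuming the present-day zipfile module rather than any code mentioned in this thread. A zip archive ends with an end-of-central-directory record whose signature is the four bytes PK\x05\x06, so a reader can find the archive by scanning backwards even when arbitrary data has been prepended. The helper names and the fake "executable" prefix are made up for illustration.

```python
import io
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"  # marks the end-of-central-directory record


def make_zip(files):
    """Return the bytes of an in-memory zip built from a name -> text mapping."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, text in files.items():
            archive.writestr(name, text)
    return buffer.getvalue()


def find_eocd(data):
    """Return the offset of the last end-of-central-directory record, or -1."""
    return data.rfind(EOCD_SIGNATURE)


# Stand-in for "cat python.exe zip1.zip > myapp.exe"; the prefix is fake data.
fake_executable = b"\x7fNOT-A-REAL-BINARY" * 64
payload = make_zip({"main.py": "print('hello')\n"})
combined = fake_executable + payload

offset = find_eocd(combined)
print("EOCD record found", len(combined) - offset, "bytes from the end")

# zipfile locates the directory from the end and compensates for the prefix.
with zipfile.ZipFile(io.BytesIO(combined)) as archive:
    names = archive.namelist()
    source = archive.read("main.py")
print(names, source)
```

With no archive comment the record is the final 22 bytes of the file, so the backwards search is cheap. Readers that walk the local headers from the front, instead of starting at the central directory, would not cope with the prepended data.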
- Gordon

From guido at CNRI.Reston.VA.US Thu Dec 9 19:23:27 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 13:23:27 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Thu, 09 Dec 1999 13:17:35 EST." <14415.62015.856931.750279@anthem.cnri.reston.va.us>
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us> <384FED8D.3C535D38@equi4.com> <14415.62015.856931.750279@anthem.cnri.reston.va.us>
Message-ID: <199912091823.NAA06243@eric.cnri.reston.va.us>

> I agree. I can't recall the details now, but I had a lot of problems
> with zip concatenation in JPython. I think at least some of the older
> Java tools for grokking zips don't work with concatenation.

The Java "jar" tool mostly ignores the central directory -- it seems to read the archive from the front, using the local header records (of course it writes a central directory when it creates an archive).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at CNRI.Reston.VA.US Thu Dec 9 19:32:15 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 09 Dec 1999 13:32:15 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: Your message of "Thu, 09 Dec 1999 12:43:57 EST." <384FEA5D.A07F23EC@interet.com>
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <384FEA5D.A07F23EC@interet.com>
Message-ID: <199912091832.NAA06287@eric.cnri.reston.va.us>

> In general Zip archives store whole branches of a file
> system. A Python ./Lib zip archive would contain:
>
> N:/python/Python-1.5.2/Lib/string.pyc
> N:/python/Python-1.5.2/Lib/os.pyc
> N:/python/Python-1.5.2/Lib/copy.pyc
> N:/python/Python-1.5.2/Lib/test/testall.pyc
>
> Zip archives are isomorphic to branches of a file system.
> That means there must be a sys.path for each zip archive file.
> How would this be specified?

Not true. It's easy (using the proper Zip tools) to create an archive containing this instead:

    string.pyc
    os.pyc
    copy.pyc
    testall.pyc

Thus the entire archive is considered the directory. The Java "jar" tool uses this approach. It's also easy to have packages in there (again this is what Java does):

    test/
    test/__init__.pyc
    test/pystone.pyc
    test_support.pyc
    (etc.)

> The archive format stores modules as dotted names, just as they
> appear in the import statement. The search path is "." in every
> archive file by definition. The import statement "import foo"
> just results in a dictionary lookup for key "foo", not a search
> through a zip directory along a local search path for "foo.something"
> where "something" can be pyc, pyo, py, etc.
>
> The intent was to link the archives to the import statement, not
> re-create a directory tree. It borrowed this feature from
> the archive formats of Greg and Gordon.

Maybe you've gone overboard. The time it takes to translate the dots into slashes really isn't the big deal.

> Are there any zip experts out there? Can zip files satisfy all the
> design requirements I listed in pylib.html? Is there zip code
> available? All my code is in Python.

Yes (all of us here at CNRI), yes, yes (we have the spaghetti code).

While zip files support compression, they support uncompressed files as well and we could go either way. Their most popular compression format is gzip compatible and can be read and written with the zlib module, which is in the standard Python distribution (even on Windows) -- though to build it you need the zlib C library which is of course external (but solid open source).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From fdrake at acm.org Thu Dec 9 19:41:22 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 9 Dec 1999 13:41:22 -0500 (EST) Subject: [Python-Dev] Virtual filesystem APIs In-Reply-To: <199912091821.NAA06209@eric.cnri.reston.va.us> References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us> <384FED8D.3C535D38@equi4.com> <199912091821.NAA06209@eric.cnri.reston.va.us> Message-ID: <14415.63442.92911.748132@weyr.cnri.reston.va.us> Guido van Rossum writes: > os.path.sep and friends (e.g. os.path.normcase behavior) were set per Hah! Caught you in public! "sep" & friends are defined in the os module; this is where the separation breaks down. I think these should be located in os.path, and os can just pick them up from there to be backward compatible. os.pathsep is a problem, somewhat; it is related to os.sep, but is very different in many ways. I don't think there's a good way to deal with it. > filesystem; what would happen if you mounted a Unix filesystem in an > NT tree? Doing the translations is hard too; e.g. on a Mac fs, the > separator is ':' and a '/' can be part of a filename -- do you simply > swap them? What if a Mac file has both '/' and '\' and you mount it > on a Windows FS? I'd rather stay away from this. And this is tightly related to the sep/pathsep problem as well. I agree, we should stay away from it. > I think that my proposal above should cover this. (We looked briefly > at doing a similar thing for Java, and found that it's actually harder > there -- they have all these nice objects representing paths, but it's > not easily subclassable to represent paths in some virtual But it was easy to create a set of interfaces with a reasonable API; getting back to the "typical" Java classes was what really changed the most. 
For those of us not working on the KOE: I set up Filesystem and FSFile interfaces; the Filesystem represented the entire filesystem and the FSFile was very similar to the java.io.File class, but had additional methods to get input and output stream objects (of the standard Java flavor); all the buffering and such could be wrapped on top of that just like any other Java I/O. The specific application was to provide access to an isolated directory structure which untrusted code "owned", but ensured that parent directories were unreachable. Additional security checks can be worked into such a structure as applicable. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From fdrake at acm.org Thu Dec 9 20:06:32 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu, 9 Dec 1999 14:06:32 -0500 (EST) Subject: [Python-Dev] posix module test suite Message-ID: <14415.64952.780974.8124@weyr.cnri.reston.va.us> There's not a test for the posix or os modules; if anyone would like to contribute one, this would be a good time! ;-) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From jcw at equi4.com Thu Dec 9 21:51:11 1999 From: jcw at equi4.com (Jean-Claude Wippler) Date: Thu, 09 Dec 1999 21:51:11 +0100 Subject: [Python-Dev] Virtual filesystem APIs References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <199912091655.LAA05928@eric.cnri.reston.va.us> <384FED8D.3C535D38@equi4.com> <199912091821.NAA06209@eric.cnri.reston.va.us> Message-ID: <3850163F.80BDCB75@equi4.com> Guido van Rossum wrote: > [... horrors of cross-OS mounts and ":\/" separators ...] I agree, this has some very hairy sides to it. But VFS is really more about mounting non-FS things in a "root" FS (presumably the real one). > On the other hand the VFS concept could be used as a totally different > solution to the sys.importers vs. 
sys.path Heck, I'll be the "enfant terrible" once more: yes, and this stuff could well be implemented generically across scripting languages. Of course the act of "importing" is a very Pythonic issue - but FS/VFS traversal and the actual shared library load need not be. Anyway, enough of that. > Take for example the Windows registry -- looks a lot like a > filesystem, doesn't it? Yet it has one fundamental property that a > typical FS doesn't: directory nodes can have data *and* children... What you're saying is that dir = set-of-subdirs + set-of-files, and that this is a more general requirement than plain FS's. Doesn't that simply mean that the more general model is needed as basis to handle both? > Trees are a universal concept, but code sharing is still elusive... Ah, but think of the implications: archives, networks, XML, the world! -- Jean-Claude From fdrake at acm.org Thu Dec 9 22:16:00 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu, 9 Dec 1999 16:16:00 -0500 (EST) Subject: [Python-Dev] forwarded message from Fred L. Drake Message-ID: <14416.7184.255000.342231@weyr.cnri.reston.va.us> OK, I've checked in some changes to the posix module to add support for a few of the POSIX interfaces Andrew expressed interest in seeing (and some he said weren't such a good idea, or at least not necessary, but about which I decided I disagreed after all). For those of you who aren't on the checkins list (??), I've attached the message so you'll know what functions were added. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives -------------- next part -------------- An embedded message was scrubbed... From: "Fred L. 
Drake" Subject: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.115,2.116 Date: Thu, 9 Dec 1999 16:13:10 -0500 (EST) Size: 3800 URL: From guido at CNRI.Reston.VA.US Thu Dec 9 22:19:57 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 16:19:57 -0500 Subject: [Python-Dev] forwarded message from Fred L. Drake In-Reply-To: Your message of "Thu, 09 Dec 1999 16:16:00 EST." <14416.7184.255000.342231@weyr.cnri.reston.va.us> References: <14416.7184.255000.342231@weyr.cnri.reston.va.us> Message-ID: <199912092119.QAA06731@eric.cnri.reston.va.us> > OK, I've checked in some changes to the posix module to add support > for a few of the POSIX interfaces Andrew expressed interest in seeing > (and some he said weren't such a good idea, or at least not necessary, > but about which I decided I disagreed after all). I wish you'd made your disagreement public before checking it in... But it's not too late... --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin at mems-exchange.org Thu Dec 9 22:32:26 1999 From: akuchlin at mems-exchange.org (Andrew M. Kuchling) Date: Thu, 9 Dec 1999 16:32:26 -0500 (EST) Subject: [Python-Dev] forwarded message from Fred L. Drake In-Reply-To: <14416.7184.255000.342231@weyr.cnri.reston.va.us> References: <14416.7184.255000.342231@weyr.cnri.reston.va.us> Message-ID: <14416.8170.18298.33796@amarok.cnri.reston.va.us> Fred L. Drake, Jr. writes (in a CVS checkin): >Added support for abort(), ctermid(), tmpfile(), tempnam(), tmpnam(), >and TMP_MAX. For those of you following along, the tmpfile(), tempnam(), tmpnam() functions were ones I listed as probably not worth adding. On the other hand, David Beazley wrote: > I think that the POSIX module should strive to be as >complete as possible--even if certain functions are closely related >other functionality in the library (tmpfile for instance). I suspect ... and that's a good point, too. 
The POSIX functions may provide adaptability that a Python analog doesn't; for example, you could read /etc/passwd in pure Python, but that wouldn't handle NIS or shadow passwords. So I guess I'll vote for completeness over lack of overlap; leave tmpfile() & friends in. -- A.M. Kuchling http://starship.python.net/crew/amk/ This supports reflection, which is the 90s way of writing self-modifying code. -- John Aycock at IPC7, during his parsing talk From guido at CNRI.Reston.VA.US Thu Dec 9 22:38:42 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 16:38:42 -0500 Subject: [Python-Dev] forwarded message from Fred L. Drake In-Reply-To: Your message of "Thu, 09 Dec 1999 16:32:26 EST." <14416.8170.18298.33796@amarok.cnri.reston.va.us> References: <14416.7184.255000.342231@weyr.cnri.reston.va.us> <14416.8170.18298.33796@amarok.cnri.reston.va.us> Message-ID: <199912092138.QAA06790@eric.cnri.reston.va.us> > ... and that's a good point, too. The POSIX functions may provide > adaptability that a Python analog doesn't; for example, you could read > /etc/passwd in pure Python, but that wouldn't handle NIS or shadow > passwords. So I guess I'll vote for completeness over lack of > overlap; leave tmpfile() & friends in. OK, I agree now. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake at acm.org Thu Dec 9 23:30:52 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu, 9 Dec 1999 17:30:52 -0500 (EST) Subject: [Python-Dev] Missing POSIX functions: the list In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us> References: <199912091632.LAA09236@amarok.cnri.reston.va.us> Message-ID: <14416.11676.888918.511932@weyr.cnri.reston.va.us> Andrew M. Kuchling writes: > After poking around in the O'Reilly POSIX book, here's a list of POSIX Ok, here's my comments on the remainder of these. > Worth adding? 
> =============
> opendir(), readdir(), closedir() --
>     most of their functionality is available through
>     os.listdir(), but it might be useful to have a direct
>     interface. Downside is that this would require a new
>     extension type for the C DIR struct. My (lazy) inclination
>     is to not bother.

[rewinddir() and seekdir() should be considered as well, where supported.]

There's more tedium than anything in implementing a new C type. I'm a little concerned that there might not be any real value here, but it's hard to be sure about that. Is there any real reason not to use os.listdir()?

> Worth adding:
> =============
...
> fpathconf(fd, name) -- Get configuration limit for a file
>     -- would need constants from unistd.h

This is mostly a matter of setting up the constants; not hard, just more distracting than I want to deal with right now.

> getlogin() -- returns user's login name
>     -- could do something similar with pwd.getpwuid( os.getuid() )[0], but
>     getlogin() apparently looks in utmp

Per Guido's comments, I'm not sure how valuable it is. It may make sense strictly for completeness, but I've never heard of utmp being considered reliable in any way. Maybe I'm too new at all this.

> getgroups(gidsetsize, grouplist) -- Gets supplementary group IDs

This should be easy enough.

> pathconf(path, name) -- Gets config variables for a path
>     -- would need constants from unistd.h

(Same as for fpathconf().)

> sysconf(int name) -- Gets system configuration information
>     -- would need constants from unistd.h
>
> Not worth adding:
> =================

Aside from the ones I've already added, I agree. ;)

-Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From jim at digicool.com Fri Dec 10 00:31:40 1999
From: jim at digicool.com (Jim Fulton)
Date: Thu, 09 Dec 1999 18:31:40 -0500
Subject: [Python-Dev] Thankyou for fsync :)
Message-ID: <38503BDC.CB91FB29@digicool.com>

I found recently that I needed fsync and was pleasantly surprised to find that it is provided in the posix module, where available.

Can I count on it staying in the posix module, when available, for the foreseeable future?

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!
Technical Director   (888) 344-4332            http://www.python.org
Digital Creations    http://www.digicool.com   http://www.zope.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list without my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.

From gstein at lyra.org Fri Dec 10 01:32:33 1999
From: gstein at lyra.org (Greg Stein)
Date: Thu, 9 Dec 1999 16:32:33 -0800 (PST)
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To: <14416.11676.888918.511932@weyr.cnri.reston.va.us>
Message-ID:

On Thu, 9 Dec 1999, Fred L. Drake, Jr. wrote:
> Andrew M. Kuchling writes:
>...
> > opendir(), readdir(), closedir() --
> >     most of their functionality is available through
> >     os.listdir(), but it might be useful to have a direct
> >     interface. Downside is that this would require a new
> >     extension type for the C DIR struct. My (lazy) inclination
> >     is to not bother.
>
> [rewinddir() and seekdir() should be considered as well, where
> supported.]
>
> There's more tedium than anything in implementing a new C type. I'm
> a little concerned that there might not be any real value here, but
> it's hard to be sure about that. Is there any real reason not to use
> os.listdir()?

No need to do a new type. Just wrap the DIR* into a PyCObject.
Add a magic number if you're worried about mixing CObjects. Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido at CNRI.Reston.VA.US Fri Dec 10 03:03:04 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 09 Dec 1999 21:03:04 -0500 Subject: [Python-Dev] Thankyou for fsync :) In-Reply-To: Your message of "Thu, 09 Dec 1999 18:31:40 EST." <38503BDC.CB91FB29@digicool.com> References: <38503BDC.CB91FB29@digicool.com> Message-ID: <199912100203.VAA07410@eric.cnri.reston.va.us> > I found recently that I needed fsync and was pleasantly surprized > to find that it is provided in the posix module, where available. > > Can I count on it staying in the posix module, when available, > for the forseeable future? Since we seem to be on an adding spree, I don't see why not -- as long as POSIX keeps it available :) --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at mojam.com Fri Dec 10 07:28:56 1999 From: skip at mojam.com (Skip Montanaro) Date: Fri, 10 Dec 1999 00:28:56 -0600 (CST) Subject: [Python-Dev] posix module test suite In-Reply-To: <14415.64952.780974.8124@weyr.cnri.reston.va.us> References: <14415.64952.780974.8124@weyr.cnri.reston.va.us> Message-ID: <14416.40360.611743.143624@dolphin.mojam.com> Fred> There's not a test for the posix or os modules; if anyone would Fred> like to contribute one, this would be a good time! ;-) Not having ever written any tests for the core Python modules, it seems natural to ask if there are any guidelines for the construction of such tests or the test equivalent of the Modules/xxmodule.c file. Are there standard behaviors expected for passing and failing a test? Skip Montanaro | http://www.mojam.com/ skip at mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... 
From tim_one at email.msn.com Fri Dec 10 09:48:59 1999 From: tim_one at email.msn.com (Tim Peters) Date: Fri, 10 Dec 1999 03:48:59 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <14415.23676.775163.786028@dolphin.mojam.com> Message-ID: <000501bf42eb$66529860$412d153f@tim> [Skip Montanaro] > Alright! Now I understand what all the hubbub is about! My eyes have > mostly been glazing over trying to follow all this Windows > registry/path/ini stuff. MS believes that Python is the application. > Those of us writing Python programs view those programs as the > applications, not the Python interpreter per se. Eww -- that's a helpful and insightful way to put it, Skip! Now maybe *I* can understand what the hubbub is about . > Is there some way that people writing applications in Python can set > up registry entries that are specific to their application (e.g. > tabnanny.py) instead of only specific to the Python interpreter? Yes, but they can't get Python to look at those before it's too late. I spent a whole evening a month or two ago just trying to figure out where all the cruft in my Windows sys.path *came* from. This is out-of-the-box; I haven't added anything myself: ['', 'D:\\Python\\win32', 'D:\\Python\\win32\\lib', 'D:\\Python', 'D:\\Python\\Pythonwin', 'D:\\Python\\Lib\\plat-win', 'D:\\Python\\Lib', 'D:\\Python\\DLLs', 'D:\\Python\\Lib\\lib-tk', 'D:\\PYTHON\\DLLs', 'D:\\PYTHON\\lib', 'D:\\PYTHON\\lib\\plat-win', 'D:\\PYTHON\\lib\\lib-tk', 'D:\\PYTHON'] That's bizarre on the face of it, and tracking it all down was draining. I've forgotten the details. I do remember concluding that it was impossible to do what I wanted to do without changing the implementation, though, and nobody on Python-Dev disputed that at the time. In a pragmatic crunch, I wrote the little app I needed to distribute at the time in Perl instead, meaning to come back to this. I haven't had time. 
IIRC, the ultimate problem wasn't really that Python looked at the registry to get *some* path info, it was a combination of A) It looked at the registry so early that it was impossible to stop it from executing whatever site.py the registry pointed at (well, I could with the -S option -- but then there was no way to get it to do the site.py that was *wanted* instead). B) No way to override what was in the registry; e.g., I was greatly surprised to discover that setting a PYTHONPATH envar didn't override anything, it simply plunked the PYTHONPATH entries into sys.path along with everything else -- and too late to stop anything anyway. In a long msg I haven't yet read all the way thru, Guido at least suggested associating different registry path info with different Python versions. That would address a number of otherwise currently intractable problems. I suspect it still wouldn't help with the problem I was facing, though. That is, I wanted to be able to tell people to run \\dragres01\mrec\reduce\python \\dragres01\mrec\reduce\reduce.py which is just a Windows way of saying "run a Python executable from a shared network location". When they tried that, though, the network Python looked in *their* individual registries for its Python path info, and some of the hackers with mondo customized Python setups on their own machines watched things go down in flames. This certainly can't be a common problem, but it speaks to an unforgiving rigidity in the current approach. There seemed to be nothing I could do to guarantee this would work, short of telling users to edit their registries before running this tool (that's a non-starter on Windows -- editing the registry is dangerous) or putting a customized Python on the network pointing to a bogus registry key (it was faster to write the app in Perl! Perl doesn't *try* to be so infernally helpful , so doesn't get in the way either). 
I'm left wondering what purpose putting Python library path info into the Windows registry serves. Is there anyone on Windows who *doesn't* have their Python Lib/ etc as direct subdirectories of the directory containing python.exe? Not that I've seen. Python puts *those* in sys.path too -- but only after it (in the normal case; see my sys.path above) pulls identically redundant paths out of the registry first, or (in the cases we're griping about) pulls irrelevant or downright harmful paths out of the registry first (paths appropriate to the last Python you *installed*, not to the Python that's *running*!). Perhaps all this cruft is needed to support embedded Python, though (something I've never done). Regardless, I expect it would have been enough for me if PYTHONPATH simply worked the way I mistakenly assumed it would (that is, this is sys.path, and that's *it*; feel free to prepend the current directory when initialization is complete, but before then looking at any file not reached from PYTHONPATH is verboten). the-cleverer-the-code-the-more-vital-that-there-be-a-way-to- short-circuit-it-ly y'rs - tim From jim at interet.com Fri Dec 10 13:16:31 1999 From: jim at interet.com (James C. Ahlstrom) Date: Fri, 10 Dec 1999 07:16:31 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000501bf42eb$66529860$412d153f@tim> Message-ID: <3850EF1F.158445B6@interet.com> Tim Peters wrote: > > [Skip Montanaro] > > Is there some way that people writing applications in Python can set > > Yes, but they can't get Python to look at those before it's too late. I > spent a whole evening a month or two ago just trying to figure out where all > the cruft in my Windows sys.path *came* from. This is out-of-the-box; I > ..... Excellent discussion Tim! > I suspect it still wouldn't help with the problem I was facing, though. 
> That is, I wanted to be able to tell people to run
>
>     \\dragres01\mrec\reduce\python \\dragres01\mrec\reduce\reduce.py
>
> which is just a Windows way of saying "run a Python executable from a shared
> network location". When they tried that, though, the network Python looked
> in *their* individual registries for its Python path info, and some of the
> hackers with mondo customized Python setups on their own machines watched
> things go down in flames.

I think a sensible way to run little apps is to put everything
in an archive file including the main.py. On Windows you
concatenate that to python.exe, and it Just Works.

> Windows registry serves. Is there anyone on Windows who *doesn't* have
> their Python Lib/ etc as direct subdirectories of the directory containing
> python.exe? Not that I've seen.

Point on the curve. We don't. We freeze everything except
the main.py.

JimA

From jim at interet.com Fri Dec 10 14:38:28 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 10 Dec 1999 08:38:28 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com>
Message-ID: <38510254.ED15D32B@interet.com>

Jean-Claude Wippler wrote:
> Ouch - what's wrong with zip archives?

> With all due respect - I sincerely hope you will reconsider and alter
> your code to work with zip files. It's probably a small adjustment?

OK, you talked me into it. Ya, small adjustment, no problem ;-)

JimA

From jack at oratrix.nl Fri Dec 10 14:51:10 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Fri, 10 Dec 1999 14:51:10 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: Message by "James C. Ahlstrom" , Thu, 09 Dec 1999 12:43:57 -0500 , <384FEA5D.A07F23EC@interet.com>
Message-ID: <19991210135111.2F83C370CF2@snelboot.oratrix.nl>

Is it possible nowadays to have two files with the same name but different
paths (i.e.
foo/bar.py and foo/spam/bar.py) in the same archive?

That's the one thing that always struck me as very very silly about zipfiles.
--
Jack Jansen                | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack       | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm

From gmcm at hypernet.com Fri Dec 10 15:28:51 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Fri, 10 Dec 1999 09:28:51 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: <19991210135111.2F83C370CF2@snelboot.oratrix.nl>
References: Message by "James C. Ahlstrom" , Thu, 09 Dec 1999 12:43:57 -0500 , <384FEA5D.A07F23EC@interet.com>
Message-ID: <1267287023-386248@hypernet.com>

Jack Jansen asks:

> Is it possible nowadays to have two files with the same name but
> different paths (i.e. foo/bar.py and foo/spam/bar.py) in the same
> archive?

Depends on how you do it.

If the user imports foo.spam.bar, an importer will be asked for:

  foo           (return foo.__init__)
  foo.spam      (return foo.spam.__init__)
  foo.spam.bar  (return foo.spam.bar)

But the API allows lots of variations. This is another possible
interaction:

  foo                (return None)
  foo.__init__       (return foo.__init__)
  foo.spam           (return None)
  foo.spam.__init__  (return foo.spam.__init__)
  foo.spam.bar       (return foo.spam.bar)

Or, by looking at different args to get_code, you could look at
the requests as:

  foo in context of None
  spam in context of foo
  bar in context of foo.spam

With another variation where the request for __init__ becomes
explicit.

The first way seems the natural way for archives, and makes it
easy to keep foo.spam.bar distinct from foo.bar.

> That's the one thing that always struck me as very very silly
> about zipfiles.

Huh?
- Gordon

From guido at CNRI.Reston.VA.US Fri Dec 10 15:51:39 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 10 Dec 1999 09:51:39 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: Your message of "Fri, 10 Dec 1999 14:51:10 +0100." <19991210135111.2F83C370CF2@snelboot.oratrix.nl>
References: <19991210135111.2F83C370CF2@snelboot.oratrix.nl>
Message-ID: <199912101451.JAA07786@eric.cnri.reston.va.us>

> Is it possible nowadays to have two files with the same name but different
> paths (i.e. foo/bar.py and foo/spam/bar.py) in the same archive?
>
> That's the one thing that always struck me as very very silly about zipfiles.

Zip files contain the full path, there's no problem with that. Was
there ever?

--Guido van Rossum (home page: http://www.python.org/~guido/)

From jack at oratrix.nl Fri Dec 10 15:52:26 1999
From: jack at oratrix.nl (Jack Jansen)
Date: Fri, 10 Dec 1999 15:52:26 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: Message by "Gordon McMillan" , Fri, 10 Dec 1999 09:28:51 -0500 , <1267287023-386248@hypernet.com>
Message-ID: <19991210145227.01F99370CF2@snelboot.oratrix.nl>

> Jack Jansen asks:
>
> > Is it possible nowadays to have two files with the same name but
> > different paths (i.e. foo/bar.py and foo/spam/bar.py) in the same
> > archive?
>
> Depends on how you do it.

Apparently I misphrased my question, I'll try again.

When people suggested using zip format as the standard Python archive format
I was a bit worried, because I've had it happen to me various times that I was
unable to create a ZIP archive with two files with the same name but different
paths (i.e. create an archive of a directory that contains both a foo/bar.py
and a foo/spam/bar.py).
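[Illustration, not part of the original message.] The statement that zip files store the full path can be checked directly. The sketch below assumes a current Python 3 with the standard zipfile module, which the 1999 interpreters under discussion cannot be assumed to have; `build_archive` and `list_names` are made-up helper names.

```python
import io
import zipfile


def build_archive():
    """Write two members that share the base name bar.py into one archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("foo/bar.py", "print('top-level bar')\n")
        zf.writestr("foo/spam/bar.py", "print('nested bar')\n")
    return buf.getvalue()


def list_names(data):
    """Return the member names exactly as recorded in the archive."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.namelist()
```

Each member is recorded under its full path, so the two bar.py entries do not collide; a tool that flattens names on display or extraction is a property of that tool, not of the format.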
So, my question was: has this happened to me because the WinZip I used was
braindead, or is there possibly a problem with the ZIP file format that
disallows two files with the same name in one archive? Most zip programs I've
seen also seem to present filenames as the primary metaphor, with full
pathnames somewhat "tacked on".

If the latter is the case I wonder whether zip is the right format to use...
--
Jack Jansen                | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack       | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm

From guido at CNRI.Reston.VA.US Fri Dec 10 16:00:51 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 10 Dec 1999 10:00:51 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
In-Reply-To: Your message of "Fri, 10 Dec 1999 15:52:26 +0100." <19991210145227.01F99370CF2@snelboot.oratrix.nl>
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl>
Message-ID: <199912101500.KAA07863@eric.cnri.reston.va.us>

Again, the zip format does not have this problem. Some zip tools may
-- then we don't use those.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From fdrake at acm.org Fri Dec 10 16:40:21 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 10 Dec 1999 10:40:21 -0500 (EST)
Subject: [Python-Dev] Missing POSIX functions: the list
In-Reply-To:
References: <14416.11676.888918.511932@weyr.cnri.reston.va.us>
Message-ID: <14417.7909.511437.230915@weyr.cnri.reston.va.us>

Greg Stein writes:
 > No need to do a new type. Just wrap the DIR* into a PyCObject. Add a magic
 > number if you're worried about mixing CObjects.

  That's certainly one option, but I would have made readdir(),
seekdir(), rewinddir() and closedir() into the methods read(), seek(),
rewind() and close().
So it's a question of what interface you prefer;
functions with magically interpreted token parameters (kind of like
file descriptors, hey!), or something that is more recognizably
object-oriented. I know my preference. ;-)

  -Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From mal at lemburg.com Fri Dec 10 16:55:02 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 10 Dec 1999 16:55:02 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl>
Message-ID: <38512256.F9287E24@lemburg.com>

Jack Jansen wrote:
>
> > Jack Jansen asks:
> >
> > > Is it possible nowadays to have two files with the same name but
> > > different paths (i.e. foo/bar.py and foo/spam/bar.py) in the same
> > > archive?
> >
> > Depends on how you do it.
>
> Apparently I misphrased my question, I'll try again.
>
> When people suggested using zip format as the standard Python archive format
> I was a bit worried, because I've had it happen to me various times that I was
> unable to create a ZIP archive with two files with the same name but different
> paths (i.e. create an archive of a directory that contains both a foo/bar.py
> and a foo/spam/bar.py).
>
> So, my question was: has this happened to me because the WinZip I used was
> braindead, or is there possibly a problem with the ZIP file format that
> disallows two files with the same name in one archive? Most zip programs I've
> seen also seem to present filenames as the primary metaphor, with full
> pathnames somewhat "tacked on".
>
> If the latter is the case I wonder whether zip is the right format to use...

Hmm, I've been doing the above for years now... never had
a problem with it (I use Info-ZIP's tools, BTW), e.g.
/home/lemburg> unzip -l projects/distribution/mxODBC-1.1.1.zip Archive: projects/distribution/mxODBC-1.1.1.zip Length Date Time Name -------- ---- ---- ---- 131316 06-09-99 14:10 ODBC/EasySoft/mxODBC.c 131316 06-09-99 14:10 ODBC/Informix/mxODBC.c ... Would be cool if I could use my packages as ZIP files :-) So here's another vote for using the ZIP format. BTW, wouldn't it make sense to include the zlib code in the core distribution much like the pcre stuff is now ? AFAIK, it is public domain and including it would remedy many of the compatibility issues with the different zlib versions around. Ok, that was my wish for Xmas :-) The rest is up to Mr. Clause... -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 21 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido at CNRI.Reston.VA.US Fri Dec 10 17:04:24 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 10 Dec 1999 11:04:24 -0500 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: Your message of "Fri, 10 Dec 1999 16:55:02 +0100." <38512256.F9287E24@lemburg.com> References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> Message-ID: <199912101604.LAA14100@eric.cnri.reston.va.us> > BTW, wouldn't it make sense to include the zlib code > in the core distribution much like the pcre stuff is now ? > AFAIK, it is public domain and including it would remedy many of the > compatibility issues with the different zlib versions around. What compatibility issues? Note that the Win32 distri already comes with zlib statically linked into zlib.pyd. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at lemburg.com Fri Dec 10 17:15:48 1999 From: mal at lemburg.com (M.-A. 
Lemburg) Date: Fri, 10 Dec 1999 17:15:48 +0100 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> Message-ID: <38512734.CF6E4489@lemburg.com> Guido van Rossum wrote: > > > BTW, wouldn't it make sense to include the zlib code > > in the core distribution much like the pcre stuff is now ? > > AFAIK, it is public domain and including it would remedy many of the > > compatibility issues with the different zlib versions around. > > What compatibility issues? Note that the Win32 distri already comes > with zlib statically linked into zlib.pyd. There were issues with zlib 1.0.4 and later ones. Also, many Linux distributions don't have the zlib header files installed. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 21 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido at CNRI.Reston.VA.US Fri Dec 10 17:19:47 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 10 Dec 1999 11:19:47 -0500 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: Your message of "Fri, 10 Dec 1999 17:15:48 +0100." <38512734.CF6E4489@lemburg.com> References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> Message-ID: <199912101619.LAA14174@eric.cnri.reston.va.us> > There were issues with zlib 1.0.4 and later ones. Also, many > Linux distributions don't have the zlib header files installed. Hm. I don't recall having any problems reported to me. I'd rather not include the entire zlib distri in the Python distri -- zlib is rather big. Adding only the Unix source would be cheating. 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at CNRI.Reston.VA.US Fri Dec 10 17:25:23 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 10 Dec 1999 11:25:23 -0500
Subject: [Python-Dev] dbm clone with serious specs wanted
Message-ID: <199912101625.LAA14216@eric.cnri.reston.va.us>

Someone has asked me for a dbm clone that can store 16M keys of 350
bytes each, and runs on Linux, HPUX, and NT. That's 5.6 Gigabyte in
keys alone! I presume most classic approaches won't cut it since
total file size is typically limited by the seek system call, internal
data structures and/or file index format to 2Gb (signed longs) or 4Gb
(unsigned longs).

Does anyone have an idea where to start looking? Would a Python
extension already exist?

--Guido van Rossum (home page: http://www.python.org/~guido/)

From petrilli at amber.org Fri Dec 10 17:29:27 1999
From: petrilli at amber.org (Christopher Petrilli)
Date: Fri, 10 Dec 1999 11:29:27 -0500
Subject: [Python-Dev] dbm clone with serious specs wanted
In-Reply-To: <199912101625.LAA14216@eric.cnri.reston.va.us>; from guido@CNRI.Reston.VA.US on Fri, Dec 10, 1999 at 11:25:23AM -0500
References: <199912101625.LAA14216@eric.cnri.reston.va.us>
Message-ID: <19991210112927.A14102@trump.amber.org>

Guido van Rossum [guido at CNRI.Reston.VA.US] wrote:
> Someone has asked me for a dbm clone that can store 16M keys of 350
> bytes each, and runs on Linux, HPUX, and NT. That's 5.6 Gigabyte in
> keys alone! I presume most classic approaches won't cut it since
> total file size is typically limited by the seek system call, internal
> data structures and/or file index format to 2Gb (signed longs) or 4Gb
> (unsigned longs).
>
> Does anyone have an idea where to start looking? Would a Python
> extension already exist?

Assuming you mean an interface to a dbm-style situation, you could easily
use Berkeley DB, I believe it is limited in the 4TB range...
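[Illustration, not part of the original message.] The size estimate quoted above can be checked with a few lines of arithmetic. This sketch reads "16M" as 16 million (an assumption; 16 * 2**20 would give a slightly larger figure) and takes the 2Gb/4Gb limits to be the largest values of signed and unsigned 32-bit offsets. The names are invented for the example.

```python
KEYS = 16 * 10**6   # "16M keys", read here as 16 million (assumption)
KEY_BYTES = 350     # bytes per key, as stated in the message

# Space taken by the keys alone, ignoring values and index overhead.
total_key_bytes = KEYS * KEY_BYTES

SIGNED_32_MAX = 2**31 - 1     # largest offset in a signed 32-bit long
UNSIGNED_32_MAX = 2**32 - 1   # largest offset in an unsigned 32-bit long


def fits(total, limit):
    """True when a file of `total` bytes is addressable under `limit`."""
    return total <= limit


print(total_key_bytes)                         # 5600000000
print(fits(total_key_bytes, SIGNED_32_MAX))    # False
print(fits(total_key_bytes, UNSIGNED_32_MAX))  # False
```

Under these assumptions the keys alone exceed both 32-bit limits, which is why a format with 64-bit offsets (or one that addresses pages rather than bytes) is needed.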
Chris -- | Christopher Petrilli | petrilli at amber.org From mal at lemburg.com Fri Dec 10 17:26:10 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 10 Dec 1999 17:26:10 +0100 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> Message-ID: <385129A2.6FAF4E81@lemburg.com> Guido van Rossum wrote: > > > There were issues with zlib 1.0.4 and later ones. Also, many > > Linux distributions don't have the zlib header files installed. > > Hm. I don't recall having any problems reported to me. I'd rather > not include the entire zlib distri in the Python distri -- zlib > is rather big. Adding only the Unix source would be cheating. How about only adding those parts which would be needed to at least deflate the ZIP archive contents ? If the ZIP archive format becomes the standard for Python, we'd have to ensure that all Python users can read them. Well, at least that's what I would expect from a standard format :-) -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 21 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido at CNRI.Reston.VA.US Fri Dec 10 17:29:36 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 10 Dec 1999 11:29:36 -0500 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: Your message of "Fri, 10 Dec 1999 17:26:10 +0100." 
<385129A2.6FAF4E81@lemburg.com> References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <385129A2.6FAF4E81@lemburg.com> Message-ID: <199912101629.LAA14274@eric.cnri.reston.va.us> > How about only adding those parts which would be needed to > at least deflate the ZIP archive contents ? Ditto -- still lots of portability issues I bet. > If the ZIP archive format becomes the standard for Python, we'd > have to ensure that all Python users can read them. Well, at > least that's what I would expect from a standard format :-) There's a simple solution: don't use compression. With current disk prices it's really not worth it. Let the installer do the decompression (installers travel across networks where compression *is* worth it). --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin at mems-exchange.org Fri Dec 10 17:34:09 1999 From: akuchlin at mems-exchange.org (Andrew M. Kuchling) Date: Fri, 10 Dec 1999 11:34:09 -0500 (EST) Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: <38512734.CF6E4489@lemburg.com> References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> Message-ID: <14417.11137.562474.99270@amarok.cnri.reston.va.us> M.-A. Lemburg writes: >There were issues with zlib 1.0.4 and later ones. Also, many >Linux distributions don't have the zlib header files installed. For example, on RH6.0, zlib.h and zlib.a are in zlib-devel.XXX.rpm, and zlib.XXX.rpm only contains libz.so. On the other hand, anyone who's compiling Python should really have the various -devel RPMs installed. I'd argue against including it, because it might cause odd versioning problems. 
For example, what if I have PIL compiled against zlib1.1.2 (zlib is used for writing PNGs) and the Python binary includes zlib1.1.3? There might be hard-to-debug problems caused by calling the wrong symbol. PCRE is a special case, because we've actually hacked the code a lot; it's not the PCRE code as Philip Hazel distributes it. Just received Guido's email suggesting skipping compression in archives; not a bad idea. You'd use less CPU, but might do more I/O because you're reading more sectors off disk. There probably isn't much need for compression when the archive is on-disk; Java needed it because of applets. -- A.M. Kuchling http://starship.python.net/crew/amk/ The NSA response was, "Well, that was interesting, but there aren't any ciphers like that." -- Gus Simmons, "The History of Subliminal Channels" From petrilli at amber.org Fri Dec 10 17:39:44 1999 From: petrilli at amber.org (Christopher Petrilli) Date: Fri, 10 Dec 1999 11:39:44 -0500 Subject: [Python-Dev] dbm clone with serious specs wanted In-Reply-To: <19991210112927.A14102@trump.amber.org>; from petrilli@amber.org on Fri, Dec 10, 1999 at 11:29:27AM -0500 References: <199912101625.LAA14216@eric.cnri.reston.va.us> <19991210112927.A14102@trump.amber.org> Message-ID: <19991210113944.B14102@trump.amber.org> Christopher Petrilli [petrilli at amber.org] wrote: > Guido van Rossum [guido at CNRI.Reston.VA.US] wrote: > > Does anyone have an idea where to start looking? Would a Python > > extension already exist? > > Assuming you mean an interface to a ddbm-style situation, you could easily > use berkeley DB, I belive it is limited in the 4TB range... I just did some checking... first Robin Dunn has an interface, but it's not currently compatible with BerkeleyDB 3.x, which just came out... it shouldn't be hard to retrofit. Anyway, the limits are based on page size... 512b page: 2TB 64K page: 256TB It uses 32bit numbers for pages, so I assume that is also a reflection of the number of keys allowed... 
given I believe one key must use a minimum of one page.

I know that I've pushed earlier releases to around 50Gb without trouble, but
you might see issues related to the number of keys. I'd ask Sleepycat
directly, as they're amazingly responsive.

Chris
--
| Christopher Petrilli
| petrilli at amber.org

From mal at lemburg.com Fri Dec 10 17:37:30 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 10 Dec 1999 17:37:30 +0100
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <385129A2.6FAF4E81@lemburg.com> <199912101629.LAA14274@eric.cnri.reston.va.us>
Message-ID: <38512C4A.ADB63C2B@lemburg.com>

Guido van Rossum wrote:
>
> > How about only adding those parts which would be needed to
> > at least deflate the ZIP archive contents ?
>
> Ditto -- still lots of portability issues I bet.

Hmm, not sure: zlib is pretty portable. It's the interface changes
that can break code, not so much the zlib portability.

> > If the ZIP archive format becomes the standard for Python, we'd
> > have to ensure that all Python users can read them. Well, at
> > least that's what I would expect from a standard format :-)
>
> There's a simple solution: don't use compression. With current disk
> prices it's really not worth it. Let the installer do the
> decompression (installers travel across networks where compression
> *is* worth it).

That's a possibility, right. It would still let us use the
many ZIP tools while not adding complexity to the core.

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 21 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From mal at lemburg.com Fri Dec 10 17:43:11 1999
From: mal at lemburg.com (M.-A.
Lemburg)
Date: Fri, 10 Dec 1999 17:43:11 +0100
Subject: [Python-Dev] dbm clone with serious specs wanted
References: <199912101625.LAA14216@eric.cnri.reston.va.us>
Message-ID: <38512D9F.2AE9DC8B@lemburg.com>

Guido van Rossum wrote:
>
> Someone has asked me for a dbm clone that can store 16M keys of 350
> bytes each, and runs on Linux, HPUX, and NT. That's 5.6 Gigabyte in
> keys alone! I presume most classic approaches won't cut it since
> total file size is typically limited by the seek system call, internal
> data structures and/or file index format to 2Gb (signed longs) or 4Gb
> (unsigned longs).
>
> Does anyone have an idea where to start looking? Would a Python
> extension already exist?

I'd suggest using a dbm style wrapper around the DB-API and
then trying out the many cross-platform databases. IBM DB2 comes
to mind... it can certainly handle these sizes given the right
hardware.

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 21 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From fdrake at acm.org Fri Dec 10 18:35:01 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 10 Dec 1999 12:35:01 -0500 (EST)
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: <199912100203.VAA07410@eric.cnri.reston.va.us>
References: <38503BDC.CB91FB29@digicool.com> <199912100203.VAA07410@eric.cnri.reston.va.us>
Message-ID: <14417.14789.306365.439782@weyr.cnri.reston.va.us>

Guido van Rossum writes:
 > Since we seem to be on an adding spree, I don't see why not -- as long
 > as POSIX keeps it available :)

  fsync() isn't listed in O'Reilly's POSIX book, so it's probably not
in the POSIX spec. Neither is the tempnam() function I added in
yesterday's spree, though tmpfile() and tmpnam() are.

  -Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From jim at digicool.com Fri Dec 10 19:37:53 1999
From: jim at digicool.com (Jim Fulton)
Date: Fri, 10 Dec 1999 18:37:53 +0000
Subject: [Python-Dev] Thankyou for fsync :)
References: <38503BDC.CB91FB29@digicool.com> <199912100203.VAA07410@eric.cnri.reston.va.us> <14417.14789.306365.439782@weyr.cnri.reston.va.us>
Message-ID: <38514881.5C124E36@digicool.com>

"Fred L. Drake, Jr." wrote:
>
> Guido van Rossum writes:
>  > Since we seem to be on an adding spree, I don't see why not -- as long
>  > as POSIX keeps it available :)
>
>   fsync() isn't listed in O'Reilly's POSIX book, so it's probably not
> in the POSIX spec.

It's not, it's in XPG3 (sp?), but I wasn't going to bring that up. ;)
I'd still like it to stay, where available. :)

Jim

--
Jim Fulton           mailto:jim at digicool.com
Technical Director   (888) 344-4332            Python Powered!
Digital Creations    http://www.digicool.com   http://www.python.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list without my
permission. Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.

From fdrake at acm.org Fri Dec 10 19:36:44 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 10 Dec 1999 13:36:44 -0500 (EST)
Subject: [Python-Dev] Thankyou for fsync :)
In-Reply-To: <38514881.5C124E36@digicool.com>
References: <38503BDC.CB91FB29@digicool.com> <199912100203.VAA07410@eric.cnri.reston.va.us> <14417.14789.306365.439782@weyr.cnri.reston.va.us> <38514881.5C124E36@digicool.com>
Message-ID: <14417.18492.932392.608912@weyr.cnri.reston.va.us>

Jim Fulton writes:
 > It's not, it's in XPG3 (sp?), but I wasn't going to bring that up. ;)

  I don't have that one, but I certainly don't have any plans on
ripping out fsync(). Not today, at any rate. ;)

  -Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives From jim at interet.com Fri Dec 10 19:37:50 1999 From: jim at interet.com (James C. Ahlstrom) Date: Fri, 10 Dec 1999 13:37:50 -0500 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) References: <19991210135111.2F83C370CF2@snelboot.oratrix.nl> Message-ID: <3851487E.F610BE17@interet.com> Jack Jansen wrote: > > Is it possible nowadays to have two files with the same name but different > paths (i.e. foo/bar.py and foo/spam/bar.py) in the same archive? Yes, I just made one with WinZip. JimA From gmcm at hypernet.com Fri Dec 10 19:41:56 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Fri, 10 Dec 1999 13:41:56 -0500 Subject: [Python-Dev] Thankyou for fsync :) In-Reply-To: <38514881.5C124E36@digicool.com> Message-ID: <1267271840-1299809@hypernet.com> Fred L. Drake, Jr. wrote: > > Guido van Rossum writes: > > Since we seem to be on an adding spree, I don't see why not > > -- as long as POSIX keeps it available :) > > fsync() isn't listed in O'Reilly's POSIX book, so it's > probably not > in the POSIX spec. > It's in the other O'Reilly POSIX book, p 348 of POSIX.4. - Gordon From fdrake at acm.org Fri Dec 10 19:43:56 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri, 10 Dec 1999 13:43:56 -0500 (EST) Subject: [Python-Dev] Thankyou for fsync :) In-Reply-To: <1267271840-1299809@hypernet.com> References: <38514881.5C124E36@digicool.com> <1267271840-1299809@hypernet.com> Message-ID: <14417.18924.461115.906914@weyr.cnri.reston.va.us> Gordon McMillan writes: > It's in the other O'Reilly POSIX book, p 348 of POSIX.4. Ah, I don't have that either. I thought POSIX.4 was real-time stuff. (If anyone wants to send a copy along, I'd be glad to consider adding reasonable interfaces for Python. ;) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From jim at interet.com Fri Dec 10 19:43:18 1999 From: jim at interet.com (James C. 
Ahlstrom)
Date: Fri, 10 Dec 1999 13:43:18 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl>
Message-ID: <385149C6.DF942F36@interet.com>

Jack Jansen wrote:

> When people suggested using zip format as the standard Python archive format
> I was a bit worried, because I've had it happen to me various times that I was
> unable to create a ZIP archive with two files with the same name but different
> paths (i.e. create an archive of a directory that contains both a foo/bar.py
> and a foo/spam/bar.py).

No problem. But most zip tools will create an archive with
either no path (file name is "bar.py") or full path (file name
"foo/bar.py"). If the paths are different, OK; I'm not sure about
duplicate bare names. The difference is an option and has nothing
to do with how the file name is specified to the utility.

JimA

From jim at interet.com Fri Dec 10 19:48:47 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Fri, 10 Dec 1999 13:48:47 -0500
Subject: [Python-Dev] Zip format (was: Questions about distutils strategy )
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <385129A2.6FAF4E81@lemburg.com>
Message-ID: <38514B0F.84A546C6@interet.com>

"M.-A. Lemburg" wrote:
> How about only adding those parts which would be needed to
> at least deflate the ZIP archive contents ?
>
> If the ZIP archive format becomes the standard for Python, we'd
> have to ensure that all Python users can read them. Well, at
> least that's what I would expect from a standard format :-)

I think that for now we will need to create archives with
compression method zero: no compression. That is a valid
compression method all ZIP utilities support. The point is
that zlib just isn't part of Python.
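[Illustration, not part of the original message.] "Compression method zero" can be seen directly in an archive's metadata. The sketch below assumes a current Python 3 and its standard zipfile module, where the constant `zipfile.ZIP_STORED` has the value 0; `stored_archive` and `entry_report` are made-up helper names.

```python
import io
import zipfile


def stored_archive(files):
    """Build an in-memory archive whose members are stored uncompressed."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        for name, text in files.items():
            zf.writestr(name, text)
    return buf.getvalue()


def entry_report(data):
    """Return (name, method, stored size, original size) for each member."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return [
            (info.filename, info.compress_type, info.compress_size, info.file_size)
            for info in zf.infolist()
        ]
```

For a stored member the two sizes are equal and the file's bytes appear verbatim inside the archive, so a reader needs no decompression code at all.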
Jim From jcw at equi4.com Fri Dec 10 19:57:00 1999 From: jcw at equi4.com (Jean-Claude Wippler) Date: Fri, 10 Dec 1999 19:57:00 +0100 Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <385129A2.6FAF4E81@lemburg.com> <38514B0F.84A546C6@interet.com> Message-ID: <38514CFC.47C8A8E0@equi4.com> "James C. Ahlstrom" wrote: [...] > I think that for now we will need to create archives with > compression method zero: no compression. That is a valid > compression method all ZIP utilities support. Sounds good. This is also exactly how Java started out with jar. -jcw From gmcm at hypernet.com Fri Dec 10 20:06:59 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Fri, 10 Dec 1999 14:06:59 -0500 Subject: [Python-Dev] Thankyou for fsync :) In-Reply-To: <14417.18924.461115.906914@weyr.cnri.reston.va.us> References: <1267271840-1299809@hypernet.com> Message-ID: <1267270337-1390160@hypernet.com> Fred wrote: > Gordon McMillan writes: > > It's in the other O'Reilly POSIX book, p 348 of POSIX.4. > > Ah, I don't have that either. I thought POSIX.4 was real-time > stuff. Well, it says it is, but having done some stuff with automated warehouses, I'm always amazed at how people will use the term "real-time". I'd say "pretty likely to be responsive" ;-). > (If anyone wants to send a copy along, I'd be glad to consider > adding reasonable interfaces for Python. ;) Only around 70 documented functions, but many of them appear to be tweaks, or redocumenting stuff in view of new kernel behaviors. - Gordon From fdrake at acm.org Fri Dec 10 20:18:16 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Fri, 10 Dec 1999 14:18:16 -0500 (EST) Subject: [Python-Dev] Thankyou for fsync :) In-Reply-To: <1267270337-1390160@hypernet.com> References: <1267271840-1299809@hypernet.com> <1267270337-1390160@hypernet.com> Message-ID: <14417.20984.151867.630871@weyr.cnri.reston.va.us> Gordon McMillan writes: > Well, it says it is, but having done some stuff with automated > warehouses, I'm always amazed at how people will use the > term "real-time". I'd say "pretty likely to be responsive" ;-). Oh, a manager's interpretation of real-time: "I want this by close of business next Wednesday!" > Only around 70 documented functions, but many of them > appear to be tweaks, or redocumenting stuff in view of new > kernel behaviors. Anything that should be added anywhere? Failing all else, I can probably read the man pages if I know what to look for. ;-) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From fdrake at acm.org Fri Dec 10 22:40:29 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri, 10 Dec 1999 16:40:29 -0500 (EST) Subject: [Python-Dev] Missing POSIX functions: the list In-Reply-To: <199912091632.LAA09236@amarok.cnri.reston.va.us> References: <199912091632.LAA09236@amarok.cnri.reston.va.us> Message-ID: <14417.29517.238124.767279@weyr.cnri.reston.va.us> Andrew M. Kuchling writes: > fpathconf(fd, name) -- Get configuration limit for a file ... > pathconf(path, name) -- Gets config variables for a path ... > sysconf(int name) -- Gets system configuration information > -- would need constants from unistd.h I'm almost done with these, and also confstr (from POSIX.2). I don't have time to finish them today; I'll check them in next week. -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives From skip at mojam.com Sat Dec 11 00:20:21 1999 From: skip at mojam.com (Skip Montanaro) Date: Fri, 10 Dec 1999 17:20:21 -0600 (CST) Subject: [Python-Dev] Thankyou for fsync :) In-Reply-To: <14417.18924.461115.906914@weyr.cnri.reston.va.us> References: <38514881.5C124E36@digicool.com> <1267271840-1299809@hypernet.com> <14417.18924.461115.906914@weyr.cnri.reston.va.us> Message-ID: <14417.35509.284749.924066@dolphin.mojam.com> Fred> I thought POSIX.4 was real-time stuff. This all seems to be happening in real-time to me... ;-) Skip Montanaro | http://www.mojam.com/ skip at mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From andy at robanal.demon.co.uk Sat Dec 11 01:11:28 1999 From: andy at robanal.demon.co.uk (Andy Robinson) Date: Sat, 11 Dec 1999 00:11:28 GMT Subject: [Python-Dev] Zip format (was: Questions about distutils strategy ) In-Reply-To: <199912101619.LAA14174@eric.cnri.reston.va.us> References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> Message-ID: <38519531.15439641@post.demon.co.uk> On Fri, 10 Dec 1999 11:19:47 -0500, you wrote: >> There were issues with zlib 1.0.4 and later ones. Also, many >> Linux distributions don't have the zlib header files installed. > >Hm. I don't recall having any problems reported to me. I'd rather >not include the entire zlib distri in the Python distri -- zlib >is rather big. Adding only the Unix source would be cheating. > Minor data point on the importance of zlib. I spent a long time figuring out what Adobe PDF's "flate filter" was before I discovered it was the inverse of "deflate" (yes, there were loud sounds of head-slapping when I clicked) and discovered that zlib.compress() was EXACTLY what you need to create compressed streams in PDF documents. 
Being a Windows person, I naively assumed zlib was in the standard distribution everywhere, and subsequently discovered Mac and Unix users were not so happy. So if you want to make PDFs, having zlib around is very useful indeed... - Andy From akuchlin at mems-exchange.org Sat Dec 11 01:35:58 1999 From: akuchlin at mems-exchange.org (Andrew M. Kuchling) Date: Fri, 10 Dec 1999 19:35:58 -0500 (EST) Subject: [Python-Dev] Enabling more modules by default In-Reply-To: <38519531.15439641@post.demon.co.uk> References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <38519531.15439641@post.demon.co.uk> Message-ID: <14417.40046.850655.491684@amarok.cnri.reston.va.us> Andy Robinson writes: >... So if you want to make PDFs, having zlib >around is very useful indeed... This raises a good point, though I still dislike the idea of including the zlib library. It would be nice if Setup.in would be autogenerated to compile all the modules it can -- bsddb if it finds libdb, zlib if it finds libz.a. I vaguely recall once working on a Python script that would generate a customized Setup.in file, though I can't find it at the moment. Given that someone has already suggested automatically enabling threads on those platforms that support it, why not go all the way? (But a Python script that generates a Setup.in isn't going to work, unless we compile a minipython first and then create a more complete Setup file.) -- A.M. Kuchling http://starship.python.net/crew/amk/ The most merciful thing in the world... is the inability of the human mind to correlate all its contents. -- H.P. 
Lovecraft

From petrilli at amber.org Sat Dec 11 06:54:41 1999
From: petrilli at amber.org (Christopher Petrilli)
Date: Sat, 11 Dec 1999 00:54:41 -0500
Subject: [Python-Dev] Enabling more modules by default
In-Reply-To: <14417.40046.850655.491684@amarok.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Fri, Dec 10, 1999 at 07:35:58PM -0500
References: <19991210145227.01F99370CF2@snelboot.oratrix.nl> <38512256.F9287E24@lemburg.com> <199912101604.LAA14100@eric.cnri.reston.va.us> <38512734.CF6E4489@lemburg.com> <199912101619.LAA14174@eric.cnri.reston.va.us> <38519531.15439641@post.demon.co.uk> <14417.40046.850655.491684@amarok.cnri.reston.va.us>
Message-ID: <19991211005441.A20923@trump.amber.org>

Andrew M. Kuchling [akuchlin at mems-exchange.org] wrote:
> Andy Robinson writes:
> >... So if you want to make PDFs, having zlib
> >around is very useful indeed...
>
> This raises a good point, though I still dislike the idea of including
> the zlib library. It would be nice if Setup.in would be autogenerated
> to compile all the modules it can -- bsddb if it finds libdb, zlib if
> it finds libz.a. I vaguely recall once working on a Python script that
> would generate a customized Setup.in file, though I can't find it at
> the moment. Given that someone has already suggested automatically
> enabling threads on those platforms that support it, why not go all
> the way?

Well, one warning about BSDdb: it comes in 3 incarnations that
all might be -ldb :-):

    1.85
    2.x
    3.x

and they are NOT compatible with each other. 1.85 has serious brain
damage, and honestly I'd love to see Robin Dunn's 2.x work rolled in to
replace it, but I'm not sure how viable that is---people might actually
want the 1.85 breakage.
Chris -- | Christopher Petrilli | petrilli at amber.org From gstein at lyra.org Sat Dec 11 12:23:30 1999 From: gstein at lyra.org (Greg Stein) Date: Sat, 11 Dec 1999 03:23:30 -0800 (PST) Subject: [Python-Dev] Zip format In-Reply-To: <1267287023-386248@hypernet.com> Message-ID: On Fri, 10 Dec 1999, Gordon McMillan wrote: >... > If the user imports foo.spam.bar, an importer will be asked for: > foo (return foo.__init__) > foo.spam (return foo.bar.__init__) ^^^ foo.spam.__init__ > foo.spam.bar (return foo.spam.bar) The above sequence is what currently happens. > But the API allows lots of variations. This is another possible > interaction: > foo (return None) > foo.__init__ (return foo.__init__) > foo.spam (return None) > foo.bar.__init__ (return foo.bar.__init__) > foo.spam.bar (return foo.spam.bar) The core of imputil has no knowledge of the __init__ thingy. That is specific to the filesystem-based stuff. So in this sense, "possible" means "imputil could be changed to do this". I would argue against the change, however :-) > Or, by looking at different args to get_code, you could look at > the requests as: > foo in context of None > spam in context of foo > bar in context of foo.spam Bing! Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein at lyra.org Sat Dec 11 12:26:59 1999 From: gstein at lyra.org (Greg Stein) Date: Sat, 11 Dec 1999 03:26:59 -0800 (PST) Subject: [Python-Dev] Zip format In-Reply-To: <14417.11137.562474.99270@amarok.cnri.reston.va.us> Message-ID: On Fri, 10 Dec 1999, Andrew M. Kuchling wrote: > M.-A. Lemburg writes: > >There were issues with zlib 1.0.4 and later ones. Also, many > >Linux distributions don't have the zlib header files installed. > > For example, on RH6.0, zlib.h and zlib.a are in zlib-devel.XXX.rpm, > and zlib.XXX.rpm only contains libz.so. On the other hand, anyone > who's compiling Python should really have the various -devel RPMs Exactly. The distro's *have* the headers -- it all depends on what you installed. 
I happen to have the headers on my system (because I installed zlib-devel, as AMK mentions). > installed. I'd argue against including it, because it might cause odd > versioning problems. For example, what if I have PIL compiled against > zlib1.1.2 (zlib is used for writing PNGs) and the Python binary > includes zlib1.1.3? There might be hard-to-debug problems > caused by calling the wrong symbol. I totally agree. >... > Just received Guido's email suggesting skipping compression in > archives; not a bad idea. You'd use less CPU, but might do > more I/O because you're reading more sectors off disk. There > probably isn't much need for compression when the archive is on-disk; > Java needed it because of applets. There are all kinds of things that we can do here. Consider mmap'ing the archive into a shared memory segment, used by all the Python processes on the system... woo! :-) IMO, the standard distro can use zip files, and just bail if they are compressed, but Python cannot load zlib. Obvious failure with an obvious remedy. No big deal. As Guido also mentions, an installer can just bring along zlib if they want to use a compressed archive. i.e. their choice. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein at lyra.org Sat Dec 11 12:33:47 1999 From: gstein at lyra.org (Greg Stein) Date: Sat, 11 Dec 1999 03:33:47 -0800 (PST) Subject: [Python-Dev] Missing POSIX functions: the list In-Reply-To: <14417.7909.511437.230915@weyr.cnri.reston.va.us> Message-ID: On Fri, 10 Dec 1999, Fred L. Drake, Jr. wrote: > Greg Stein writes: > > No need to do a new type. Just wrap the DIR* into a PyCObject. Add a magic > > number if you're worried about mixing CObjects. > > That's certainly one option, but I would have made readdir(), > seekdir(), rewinddir() and closedir() into the methods read(), seek(), > rewind() and close(). 
So it's a question of what interface you > prefer; functions with magically interpreted token parameters (kind of > like file descriptors, hey!), or something that is more recognizably > object-oriented. > I know my preference. ;-) Well, I know my preference of those two alternatives, too :-), but if we're going with the Pythonic minimalism, then I'd think you would expose the functions "as close as possible." Would I argue if you went with a method-based approach? No :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From fredrik at pythonware.com Sat Dec 11 14:07:08 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat, 11 Dec 1999 14:07:08 +0100 Subject: [Python-Dev] Zip format References: Message-ID: <005f01bf43d8$a22b0af0$f29b12c2@secret.pythonware.com> Greg Stein wrote: > There are all kinds of things that we can do here. Consider mmap'ing the > archive into a shared memory segment, used by all the Python processes on > the system... woo! :-) it doesn't really look like this, but I hope we're defining interfaces here, and not just "one true solution". I'd be very annoyed if it turned out that we couldn't use works' archives with the new standard importer... > As Guido also mentions, an installer can just bring along zlib if they > want to use a compressed archive. i.e. their choice. in the pythonworks universe, the installer and the application is the same thing... From fredrik at pythonware.com Sat Dec 11 14:12:12 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat, 11 Dec 1999 14:12:12 +0100 Subject: [Python-Dev] Thankyou for fsync :) References: <38503BDC.CB91FB29@digicool.com><199912100203.VAA07410@eric.cnri.reston.va.us> <14417.14789.306365.439782@weyr.cnri.reston.va.us> Message-ID: <006c01bf43d9$57bc0f90$f29b12c2@secret.pythonware.com> Fred L. Drake, Jr. wrote: > fsync() isn't listed in O'Reilly's POSIX book, so it's probably not > in the POSIX spec. 
Neither is the tempnam() function I added in > yesterdays spree, though tmpfile() and tmpnam() are. instead of guessing, you can get a complete list from: http://www.unix-systems.org/apis.html reading up on the "single unix specification" should also help: http://www.unix-systems.org/online.html (registration required; contains complete man pages for all functions covered by the UNIX95 and UNIX98 specification) From gstein at lyra.org Sat Dec 11 14:10:00 1999 From: gstein at lyra.org (Greg Stein) Date: Sat, 11 Dec 1999 05:10:00 -0800 (PST) Subject: [Python-Dev] Zip format In-Reply-To: <005f01bf43d8$a22b0af0$f29b12c2@secret.pythonware.com> Message-ID: On Sat, 11 Dec 1999, Fredrik Lundh wrote: > Greg Stein wrote: > > There are all kinds of things that we can do here. Consider mmap'ing the > > archive into a shared memory segment, used by all the Python processes on > > the system... woo! :-) > > it doesn't really look like this, but I hope we're defining > interfaces here, and not just "one true solution". I'd be Oh, I was just having fun there :-). I don't see "one true solution" at all. Just some standards. > very annoyed if it turned out that we couldn't use works' > archives with the new standard importer... get_code() and its processing is not going anywhere. Some stuff will change under the covers, and we'll be using sys.path (typically) rather than chaining (although chaining will still exist!). I would think that your Importer subclass would be directly usable, but the installation could/would be a bit different. Heck, worst case, nothing is going to invalidate your archive format -- feel free to berate me if I ever break that! Cheers, -g -- Greg Stein, http://www.lyra.org/ From jim at interet.com Mon Dec 13 15:50:11 1999 From: jim at interet.com (James C. 
Ahlstrom)
Date: Mon, 13 Dec 1999 09:50:11 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000301bf4206$b39e5b80$36a2143f@tim> <384FC47A.BB4DA517@interet.com> <384FDAF5.C25C447C@equi4.com> <38510254.ED15D32B@interet.com>
Message-ID: <385507A3.9F6AAF0F@interet.com>

> Jean-Claude Wippler wrote:
>
> > Ouch - what's wrong with zip archives?
>
> > With all due respect - I sincerely hope you will reconsider and alter
> > your code to work with zip files. It's probably a small adjustment?

OK, I now have a new module "zipfile" which reads and
writes ZIP files. It is written in Python and has been tested
on Windows and Linux. I tested it with WinZip and found that
the files it creates are read OK with WinZip, and WinZip
files are read OK with zipfile. So I am withdrawing my
Python archive file format, and re-writing all my stuff
using zipfile. It should all be done in a week.

Basically everything works fine. But there are some problems.

Python seems to lack a CRC-32 function, so I wrote one
in Python. It is slow. We need to add a CRC-32 function
to some Python built-in module that is always present, like
md5 or binascii. The zlib module is not necessarily present.

I can't seem to get WinZip to record a partial path. That is,
I want the ./Lib/test package to have these ZIP paths:
    test/__init__.pyc
    test/testall.pyc
    ...
but WinZip creates files with either no path at all or the
fully specified path. Am I missing something? Do all
other ZIP tools do this too?
JimA

From: Guido van Rossum
Date: Mon, 13 Dec 1999 10:22:12 -0500
To: "James C. Ahlstrom"
Subject: Re: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-reply-to: Your message of "Mon, 13 Dec 1999 09:50:11 EST." <385507A3.9F6AAF0F at interet.com>
References: <000301bf4206$b39e5b80$36a2143f at tim> <384FC47A.BB4DA517 at interet.com> <384FDAF5.C25C447C at equi4.com> <38510254.ED15D32B at interet.com> <385507A3.9F6AAF0F at interet.com>
Message-Id: <199912131522.KAA18858 at eric.cnri.reston.va.us>

> OK, I now have a new module "zipfile" which reads and
> writes ZIP files. It is written in Python and has been tested
> on Windows and Linux.
I tested it with WinZip and found that
> the files it creates are read OK with WinZip, and WinZip
> files are read OK with zipfile. So I am withdrawing my
> Python archive file format, and re-writing all my stuff
> using zipfile. It should all be done in a week.

Ah, good! (This saves me the trouble of cleaning up our own zip code :-)

> Basically everything works fine. But there are some problems.
>
> Python seems to lack a CRC-32 function, so I wrote one
> in Python. It is slow. We need to add a CRC-32 function
> to some Python built-in module that is always present, like
> md5 or binascii. The zlib module is not necessarily present.
>
> I can't seem to get WinZip to record a partial path. That is,
> I want the ./Lib/test package to have these ZIP paths:
>     test/__init__.pyc
>     test/testall.pyc
>     ...
> but WinZip creates files with either no path at all or the
> fully specified path. Am I missing something? Do all
> other ZIP tools do this too?

Unclick the "Save Extra Folder Info" and then drag the *parent* folder
into the archive.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From fdrake at acm.org Mon Dec 13 18:00:26 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 13 Dec 1999 12:00:26 -0500 (EST)
Subject: [Python-Dev] confstr(), fpathconf(), pathconf(), sysconf()
Message-ID: <14421.9770.623399.673010@weyr.cnri.reston.va.us>

I've just checked in bindings for these POSIX.1 and POSIX.2
functions, and thought I'd explain the interfaces for those who don't
want to read the diffs. ;)

These functions expect a "name" parameter (that's how it's described
in the man pages and the O'Reilly book). The value for "name" is an
integer that's defined in the system headers. The constants all have
the form _XX_SOME_NAME where XX is PC for fpathconf()- and
pathconf()-related names, SC for sysconf()-related names, and CS for
confstr()-related names. Some names are defined by the standards, but
additional names are defined by implementations (there are a *lot* of
sysconf() names under Solaris!).

We don't want to expose enormous numbers of constants in the module's
interface, however, as there are already a lot of names in the posix
module. That would also slow down module initialization. We also
don't want to force callers to use magic numbers in code that uses
these functions, especially since the values may be system-specific.

The best way to call these functions, then, is to use a *string* that
corresponds to the name of the C #define symbol with the leading
underscore stripped off. For example, to get the length of the
arguments to exec(), you could say:

    num_args = os.sysconf("SC_ARG_MAX")

The string will be mapped to the appropriate numeric value defined in
an internal table. If the name isn't defined for the platform, a
ValueError will be raised.

    >>> num_args = os.sysconf("FOO_BAR")
    Traceback (innermost last):
      File "<stdin>", line 1, in ?
    ValueError: unrecognized configuration name

To allow retrieval of platform-dependent configuration information,
integers can also be passed in. On Solaris, this is equivalent to
using "SC_ARG_MAX":

    num_args = os.sysconf(1)

(Ignoring the portability and readability issues, ha!)

There are three separate tables used for this: one for confstr(),
one for sysconf(), and one shared by fpathconf() and pathconf(). The
names used to build the tables come from Linux and Solaris; we can add
other names as needed. To add names, I'd need the names to add and
how to test for their existence at compile time (#ifdef, etc.).

-Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From fdrake at acm.org Mon Dec 13 19:35:49 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 13 Dec 1999 13:35:49 -0500 (EST) Subject: [Python-Dev] CVS: python/dist/src/Modules posixmodule.c,2.116,2.117 In-Reply-To: References: <199912131637.LAA17318@weyr.cnri.reston.va.us> Message-ID: <14421.15493.28263.387680@weyr.cnri.reston.va.us> Greg Stein writes: > I'm not very familiar with these APIs, but should you let go of the > interpreter lock when you call them? > (and for the other new funcs) None of these should be doing an I/O as far as I can determine. Whenever I get to getlogin() (which AMK & I decided should be included, based on the specs that /F pointed us to), I will release the interpreter lock for the getlogin_r() variant. I'm not sure I should release it for the non-reentrant getlogin(), however; the specification for getlogin*() pretty much requires that it read from utmp. ;( -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From gstein at lyra.org Mon Dec 13 21:31:22 1999 From: gstein at lyra.org (Greg Stein) Date: Mon, 13 Dec 1999 12:31:22 -0800 (PST) Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <385507A3.9F6AAF0F@interet.com> Message-ID: On Mon, 13 Dec 1999, James C. Ahlstrom wrote: >... > OK, I now have a new module "zipfile" which reads and > writes ZIP files. It is written in Python and has been tested > on Windows and Linux. I tested it with WinZip and found that > the files it creates are read OK with WinZip, and WinZip > files are read OK with zipfile. So I am withdrawing my > Python archive file format, and re-writing all my stuff > using zipfile. It should all be done in a week. Can you post zipfile.py so that people can starting reviewing that? >... > Python seems to lack a CRC-32 function, so I wrote one > in Python. It is slow. We need to add a CRC-32 function > to some Python built-in module that it always present, like > md5 or binascci. The zlib module is not necessarily present. 
See zlib.crc32()

This is interesting, of course, because we have previously stated that
zlib (and its compression) is optional. But if we need the CRC-32
function... hehe...

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From tim_one at email.msn.com Mon Dec 13 23:11:33 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Mon, 13 Dec 1999 17:11:33 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <385507A3.9F6AAF0F@interet.com>
Message-ID: <000401bf45b7$04edfaa0$96a2143f@tim>

[James C. Ahlstrom]
> ...
> Python seems to lack a CRC-32 function, so I wrote one
> in Python. It is slow. We need to add a CRC-32 function
> to some Python built-in module that is always present, like
> md5 or binascii. The zlib module is not necessarily present.

Unfortunately, there are many different CRC functions in common use. None
belong in md5; if the intent is to support just zip's version, adding a
(say) zipcrc32 function to binascii would be ok; if we expect to support
others as well, a new parameterized crc module would be in order.

> I can't seem to get WinZip to record a partial path. That is,
> I want the ./Lib/test package to have these ZIP paths:
>     test/__init__.pyc
>     test/testall.pyc
>     ...
> but WinZip creates files with either no path at all or the
> fully specified path. Am I missing something? Do all
> other ZIP tools do this too?

No, it's a clumsiness unique to WinZip (damn GUIs <0.9 wink>). In the Add
dialog box, you need to cd to the *Lib* directory, check the "Save extra
folder info" box, and then, e.g.,

1. Put
       test\*.pyc
   in the Add Files line, and click Add With Wildcards. Then all
   test\*.pyc files will be added, with paths test/__init__.pyc etc.

or

2. Put
       "test\__init__.pyc" "test\testall.pyc"
   (including the quotes!) in the Add Files line, and click Add.

Since #2 can be unbearable, other useful strategies include:

3. Use #1 (e.g. with dir\*.*) then delete the files you didn't really want.

4. Use #1 repeatedly, cleverly using a number of wildcard patterns that
   cover the files of interest.

5. Mixtures of #3 and #4.

6. Use a command-line zip tool instead (e.g., pkzip; I think WinZip has
   an "experimental" cmdline add-on too, but haven't tried it).

From jim at interet.com Tue Dec 14 14:13:03 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Tue, 14 Dec 1999 08:13:03 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References:
Message-ID: <3856425F.8C5E7A42@interet.com>

Greg Stein wrote:
>
> Can you post zipfile.py so that people can start reviewing that?

Yes, it will be available by next Monday. I just want to
get it really working and pretty, and with documentation.

JimA

From jim at interet.com Tue Dec 14 14:26:50 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Tue, 14 Dec 1999 08:26:50 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000401bf45b7$04edfaa0$96a2143f@tim>
Message-ID: <3856459A.BF5A798A@interet.com>

Tim Peters wrote:
>
> [James C. Ahlstrom]
> > ...
> > Python seems to lack a CRC-32 function, so I wrote one
>
> Unfortunately, there are many different CRC functions in common use. None
> belong in md5; if the intent is to support just zip's version, adding a
> (say) zipcrc32 function to binascii would be ok; if we expect to support
> others as well, a new parameterized crc module would be in order.

OK, a CRC-32 in binascii it is. The CRC-32 I
have comes with these comments which seem to indicate it is a
more "official standard" CRC-32 than average:

# * Crc - 32 BIT ANSI X3.66 CRC checksum files
#*********************************************************************\
#*                                                                   *|
#* Demonstration program to compute the 32-bit CRC used as the frame *|
#* check sequence in ADCCP (ANSI X3.66, also known as FIPS PUB 71    *|
#* and FED-STD-1003, the U.S. versions of CCITT's X.25 link-level    *|
#* protocol).
The 32-bit FCS was added via the Federal Register, *| #* 1 June 1982, p.23798. I presume but don't know for certain that *| #* this polynomial is or will be included in CCITT V.41, which *| #* defines the 16-bit CRC (often called CRC-CCITT) polynomial. FIPS *| #* PUB 78 says that the 32-bit FCS reduces otherwise undetected *| #* errors by a factor of 10^-5 over 16-bit FCS. *| #* *| #********************************************************************* #* Copyright (C) 1986 Gary S. Brown. You may use this program, or #* code or tables extracted from it, as desired without restriction. I can submit this as a patch to binascii, or if the Copyright bothers anyone, maybe it is better for Guido to use his CRC-32 from his ZIP code. Preference? > > I can't seem to get WinZip to record a partial path. That is, > > dialog box, you need to cd to the *Lib* directory, check the "Save extra > folder info" box, and then, e.g., Thanks. I knew there had to be some magic incantation to do it. > 6. Use a comand-line zip tool instead (e.g., pkzip; I think WinZip has > an "experimental" cmdline add-on too, but haven't tried it). Actually pkzip 2.04g doesn't work because it writes names in upper case and is limited to 8.3 names (I think). My zipfile.py can be used as a basis for a command line tool. Actually I use makefiles with imbedded Python programs and find this easier than command line tools. JimA From guido at CNRI.Reston.VA.US Tue Dec 14 15:53:04 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 09:53:04 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: Your message of "Tue, 14 Dec 1999 08:26:50 EST." <3856459A.BF5A798A@interet.com> References: <000401bf45b7$04edfaa0$96a2143f@tim> <3856459A.BF5A798A@interet.com> Message-ID: <199912141453.JAA23429@eric.cnri.reston.va.us> > OK, a CRC-32 in binascii it is. 
The CRC-32 I > have comes with these comments which seem to indicate it is a > more "official standard" CRC-32 than average: > > # * Crc - 32 BIT ANSI X3.66 CRC checksum files > #*********************************************************************\ > #* *| > #* Demonstration program to compute the 32-bit CRC used as the frame *| > #* check sequence in ADCCP (ANSI X3.66, also known as FIPS PUB 71 *| > #* and FED-STD-1003, the U.S. versions of CCITT's X.25 link-level *| > #* protocol). The 32-bit FCS was added via the Federal Register, *| > #* 1 June 1982, p.23798. I presume but don't know for certain that *| > #* this polynomial is or will be included in CCITT V.41, which *| > #* defines the 16-bit CRC (often called CRC-CCITT) polynomial. FIPS *| > #* PUB 78 says that the 32-bit FCS reduces otherwise undetected *| > #* errors by a factor of 10^-5 over 16-bit FCS. *| > #* *| > #********************************************************************* > #* Copyright (C) 1986 Gary S. Brown. You may use this program, or > #* code or tables extracted from it, as desired without restriction. > > I can submit this as a patch to binascii, or if the Copyright bothers > anyone, maybe it is better for Guido to use his CRC-32 from his ZIP > code. Preference? I looked, but "my" crc32 in the zlib module (which was actually contributed by Andrew Kuchling) is just a wrapper around the crc32 function in zlib, which is copyrighted by Mark Adler and follows the zlib rules. I propose to use Gary Brown's code. I'll defend this to CNRI's lawyers if need be. Jim, have you checked that this is the right CRC to use for zip's CRC? (This in the light of Tim's assertion that there are many CRCs around.) --Guido van Rossum (home page: http://www.python.org/~guido/) From jim at interet.com Tue Dec 14 16:22:56 1999 From: jim at interet.com (James C. 
Ahlstrom) Date: Tue, 14 Dec 1999 10:22:56 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000401bf45b7$04edfaa0$96a2143f@tim> <3856459A.BF5A798A@interet.com> <199912141453.JAA23429@eric.cnri.reston.va.us> Message-ID: <385660D0.C6C0C7B9@interet.com> Guido van Rossum wrote: > I propose to use Gary Brown's code. I'll defend this to CNRI's > lawyers if need be. > > Jim, have you checked that this is the right CRC to use for zip's CRC? > (This in the light of Tim's assertion that there are many CRCs around.) The CRC it calculates agrees with the CRC of WinZip for all files I have tried. The original Gary Brown code was much longer and included file reading. Here is the shortened version: JimA # * Crc - 32 BIT ANSI X3.66 CRC checksum files #*********************************************************************\ #* *| #* Demonstration program to compute the 32-bit CRC used as the frame *| #* check sequence in ADCCP (ANSI X3.66, also known as FIPS PUB 71 *| #* and FED-STD-1003, the U.S. versions of CCITT's X.25 link-level *| #* protocol). The 32-bit FCS was added via the Federal Register, *| #* 1 June 1982, p.23798. I presume but don't know for certain that *| #* this polynomial is or will be included in CCITT V.41, which *| #* defines the 16-bit CRC (often called CRC-CCITT) polynomial. FIPS *| #* PUB 78 says that the 32-bit FCS reduces otherwise undetected *| #* errors by a factor of 10^-5 over 16-bit FCS. *| #* *| #********************************************************************* # #* Copyright (C) 1986 Gary S. Brown. You may use this program, or #* code or tables extracted from it, as desired without restriction. # First, the polynomial itself and its table of feedback terms. The # polynomial is # X^32+X^26+X^23+X^22+X^16+X^12+X^11+X^10+X^8+X^7+X^5+X^4+X^2+X^1+X^0 # Note that we take it "backwards" and put the highest-order term in # the lowest-order bit. The X^32 term is "implied"; the LSB is the # X^31 term, etc. 
# The X^0 term (usually shown as "+1") results in
# the MSB being 1.

# Note that the usual hardware shift register implementation, which
# is what we're using (we're merely optimizing it by doing eight-bit
# chunks at a time) shifts bits into the lowest-order term. In our
# implementation, that means shifting towards the right. Why do we
# do it this way? Because the calculated CRC must be transmitted in
# order from highest-order term to lowest-order term. UARTs transmit
# characters in order from LSB to MSB. By storing the CRC this way,
# we hand it to the UART in the order low-byte to high-byte; the UART
# sends each low-bit to high-bit; and the result is transmission bit
# by bit from highest- to lowest-order term without requiring any bit
# shuffling on our part. Reception works similarly.

# The feedback terms table consists of 256, 32-bit entries. Notes:
#
# 1. The table can be generated at runtime if desired; code to do so
#    is shown later. It might not be obvious, but the feedback
#    terms simply represent the results of eight shift/xor
#    operations for all combinations of data and CRC register values.
#
# 2. The CRC accumulation logic is the same for all CRC polynomials,
#    be they sixteen or thirty-two bits wide. You simply choose the
#    appropriate table. Alternatively, because the table can be
#    generated at runtime, you can start by generating the table for
#    the polynomial in question and use exactly the same "updcrc",
#    if your application needn't simultaneously handle two CRC
#    polynomials. (Note, however, that XMODEM is strange.)
#
# 3. For 16-bit CRCs, the table entries need be only 16 bits wide;
#    of course, 32-bit entries work OK if the high 16 bits are zero.
#
# 4. The values must be right-shifted by eight bits by the "updcrc"
#    logic; the shift must be unsigned (bring in zeroes). On some
#    hardware you could probably optimize the shift in assembler by
#    using byte-swap instructions.

# Converted to Python by James C. Ahlstrom

crc_32_tab = [  # CRC polynomial 0xedb88320
    0x00000000, 0x77073096, 0xee0e612c, 0x990951ba,
    0x076dc419, 0x706af48f, 0xe963a535, 0x9e6495a3,
    0x0edb8832, 0x79dcb8a4, 0xe0d5e91e, 0x97d2d988,
    0x09b64c2b, 0x7eb17cbd, 0xe7b82d07, 0x90bf1d91,
    0x1db71064, 0x6ab020f2, 0xf3b97148, 0x84be41de,
    0x1adad47d, 0x6ddde4eb, 0xf4d4b551, 0x83d385c7,
    0x136c9856, 0x646ba8c0, 0xfd62f97a, 0x8a65c9ec,
    0x14015c4f, 0x63066cd9, 0xfa0f3d63, 0x8d080df5,
    0x3b6e20c8, 0x4c69105e, 0xd56041e4, 0xa2677172,
    0x3c03e4d1, 0x4b04d447, 0xd20d85fd, 0xa50ab56b,
    0x35b5a8fa, 0x42b2986c, 0xdbbbc9d6, 0xacbcf940,
    0x32d86ce3, 0x45df5c75, 0xdcd60dcf, 0xabd13d59,
    0x26d930ac, 0x51de003a, 0xc8d75180, 0xbfd06116,
    0x21b4f4b5, 0x56b3c423, 0xcfba9599, 0xb8bda50f,
    0x2802b89e, 0x5f058808, 0xc60cd9b2, 0xb10be924,
    0x2f6f7c87, 0x58684c11, 0xc1611dab, 0xb6662d3d,
    0x76dc4190, 0x01db7106, 0x98d220bc, 0xefd5102a,
    0x71b18589, 0x06b6b51f, 0x9fbfe4a5, 0xe8b8d433,
    0x7807c9a2, 0x0f00f934, 0x9609a88e, 0xe10e9818,
    0x7f6a0dbb, 0x086d3d2d, 0x91646c97, 0xe6635c01,
    0x6b6b51f4, 0x1c6c6162, 0x856530d8, 0xf262004e,
    0x6c0695ed, 0x1b01a57b, 0x8208f4c1, 0xf50fc457,
    0x65b0d9c6, 0x12b7e950, 0x8bbeb8ea, 0xfcb9887c,
    0x62dd1ddf, 0x15da2d49, 0x8cd37cf3, 0xfbd44c65,
    0x4db26158, 0x3ab551ce, 0xa3bc0074, 0xd4bb30e2,
    0x4adfa541, 0x3dd895d7, 0xa4d1c46d, 0xd3d6f4fb,
    0x4369e96a, 0x346ed9fc, 0xad678846, 0xda60b8d0,
    0x44042d73, 0x33031de5, 0xaa0a4c5f, 0xdd0d7cc9,
    0x5005713c, 0x270241aa, 0xbe0b1010, 0xc90c2086,
    0x5768b525, 0x206f85b3, 0xb966d409, 0xce61e49f,
    0x5edef90e, 0x29d9c998, 0xb0d09822, 0xc7d7a8b4,
    0x59b33d17, 0x2eb40d81, 0xb7bd5c3b, 0xc0ba6cad,
    0xedb88320, 0x9abfb3b6, 0x03b6e20c, 0x74b1d29a,
    0xead54739, 0x9dd277af, 0x04db2615, 0x73dc1683,
    0xe3630b12, 0x94643b84, 0x0d6d6a3e, 0x7a6a5aa8,
    0xe40ecf0b, 0x9309ff9d, 0x0a00ae27, 0x7d079eb1,
    0xf00f9344, 0x8708a3d2, 0x1e01f268, 0x6906c2fe,
    0xf762575d, 0x806567cb, 0x196c3671, 0x6e6b06e7,
    0xfed41b76, 0x89d32be0, 0x10da7a5a, 0x67dd4acc,
    0xf9b9df6f, 0x8ebeeff9, 0x17b7be43, 0x60b08ed5,
    0xd6d6a3e8, 0xa1d1937e, 0x38d8c2c4, 0x4fdff252,
    0xd1bb67f1, 0xa6bc5767, 0x3fb506dd, 0x48b2364b,
    0xd80d2bda, 0xaf0a1b4c, 0x36034af6, 0x41047a60,
    0xdf60efc3, 0xa867df55, 0x316e8eef, 0x4669be79,
    0xcb61b38c, 0xbc66831a, 0x256fd2a0, 0x5268e236,
    0xcc0c7795, 0xbb0b4703, 0x220216b9, 0x5505262f,
    0xc5ba3bbe, 0xb2bd0b28, 0x2bb45a92, 0x5cb36a04,
    0xc2d7ffa7, 0xb5d0cf31, 0x2cd99e8b, 0x5bdeae1d,
    0x9b64c2b0, 0xec63f226, 0x756aa39c, 0x026d930a,
    0x9c0906a9, 0xeb0e363f, 0x72076785, 0x05005713,
    0x95bf4a82, 0xe2b87a14, 0x7bb12bae, 0x0cb61b38,
    0x92d28e9b, 0xe5d5be0d, 0x7cdcefb7, 0x0bdbdf21,
    0x86d3d2d4, 0xf1d4e242, 0x68ddb3f8, 0x1fda836e,
    0x81be16cd, 0xf6b9265b, 0x6fb077e1, 0x18b74777,
    0x88085ae6, 0xff0f6a70, 0x66063bca, 0x11010b5c,
    0x8f659eff, 0xf862ae69, 0x616bffd3, 0x166ccf45,
    0xa00ae278, 0xd70dd2ee, 0x4e048354, 0x3903b3c2,
    0xa7672661, 0xd06016f7, 0x4969474d, 0x3e6e77db,
    0xaed16a4a, 0xd9d65adc, 0x40df0b66, 0x37d83bf0,
    0xa9bcae53, 0xdebb9ec5, 0x47b2cf7f, 0x30b5ffe9,
    0xbdbdf21c, 0xcabac28a, 0x53b39330, 0x24b4a3a6,
    0xbad03605, 0xcdd70693, 0x54de5729, 0x23d967bf,
    0xb3667a2e, 0xc4614ab8, 0x5d681b02, 0x2a6f2b94,
    0xb40bbe37, 0xc30c8ea1, 0x5a05df1b, 0x2d02ef8d
]

def crc32(string):
    crc = 0xFFFFFFFF
    for ch in string:
        crc = crc_32_tab[((crc) ^ ord(ch)) & 0xff] ^ (((crc) >> 8) & 0xFFFFFF)
    return ~crc

From tim_one at email.msn.com Tue Dec 14 18:06:36 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 14 Dec 1999 12:06:36 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <199912141453.JAA23429@eric.cnri.reston.va.us>
Message-ID: <000101bf4655$94e40840$3a2d153f@tim>

[Guido]
> I propose to use Gary Brown's code. I'll defend this to CNRI's
> lawyers if need be.

If there's a hassle, I can do a clean-room implementation easily enough -- although I'd rather not.

> Jim, have you checked that this is the right CRC to use for zip's CRC?

If WinZip unzips Jim's files without griping, the odds that he's got the wrong CRC are about 1 in 2**36 .
> (This in the light of Tim's assertion that there are many CRCs > around.) There are, and several others are hiding in assorted communications stds (e.g., Ethernet uses a different 32-bit CRC); but the zip CRC is the one you'll find most commonly described on the Web. All the same, once Jim releases his code, I'll do an anal verification that it's the right one. From jim at interet.com Tue Dec 14 18:54:35 1999 From: jim at interet.com (James C. Ahlstrom) Date: Tue, 14 Dec 1999 12:54:35 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy References: <000101bf4655$94e40840$3a2d153f@tim> Message-ID: <3856845B.6C3C7330@interet.com> Tim Peters wrote: > If WinZip unzips Jim's files without griping, the odds that he's got the > wrong CRC are about 1 in 2**36 . You mean 2**32, right? Oh, sorry, you must be using a DEC-10 . JimA From gstein at lyra.org Tue Dec 14 20:23:36 1999 From: gstein at lyra.org (Greg Stein) Date: Tue, 14 Dec 1999 11:23:36 -0800 (PST) Subject: [Python-Dev] zipfile.py In-Reply-To: <3856425F.8C5E7A42@interet.com> Message-ID: On Tue, 14 Dec 1999, James C. Ahlstrom wrote: > Greg Stein wrote: > > > > > Can you post zipfile.py so that people can starting reviewing that? > > Yes, it will be available by next Monday. I just want to > get it really working and pretty, and with documentation. My point was that people could possibly use it *before* then. Not everybody needs it to be pretty, needs doc, or needs it fully working. Maybe people would like to provide feedback on the API. Maybe they'd like to start their own modules that use your library. This goes back to my years-old statement: release it now rather than later -- people can always use it now, and there might not be a later. Release early. Release often. :-) People are too hesitant to release code. Why? Just send it out there. When you update it, send out another. It doesn't hurt anybody to have more than one release. 
Cheers, -g -- Greg Stein, http://www.lyra.org/ From tim_one at email.msn.com Wed Dec 15 05:20:25 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 14 Dec 1999 23:20:25 -0500 Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy In-Reply-To: <3856845B.6C3C7330@interet.com> Message-ID: <000501bf46b3$b6184f40$05a0143f@tim> [Tim] > If WinZip unzips Jim's files without griping, the odds that he's > got the wrong CRC are about 1 in 2**36 . [JimA] > You mean 2**32, right? Nope! For each of the 2**32 polynomials you may have pulled out of thin air, there are about a dozen common variations in the details of CRC algorithms. For example, a CRC used for hashing usually initializes "the register" to 0, but a CRC used to protect against transmission errors usually initializes to a block of 1 bits (since leading zeroes don't affect the result, and a common transmission error is dropping a prefix of the msg). Similarly, algorithms vary in the order they scan the data; in whether they use the raw data or its complement; and in whether they return the actual remainder, the complement of the remainder, or a checksum cleverly computed so that "the other end" always sees a fixed remainder other than 0 (or ~0). > Oh, sorry, you must be using a DEC-10 . I used a Univac 1108 in college, back when ASCII was in its infancy. They couldn't decide on the natural size for a character, so the 36-bit 1108 could be configured to treat each word as either 6 6-bit bytes or 4 9-bit ones. If they had been thinking ahead, they would have defined it as two Unicode characters plus a 4-bit tag field for the Python implementation to play with . 
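
The variations Tim lists (initial register value, final complement) can be made concrete with a small sketch. This is not code from the thread: it assumes a modern Python 3 with bytes input, and the names make_table, CRC_TABLE, init and xorout are made up for illustration. It builds the feedback table at runtime, which note 1 in Gary Brown's comments says is possible, and applies the final complement as an xor with 0xFFFFFFFF so the result stays within 32 bits.

```python
import zlib

POLY = 0xEDB88320  # the bit-reversed polynomial named in the posted table


def make_table(poly=POLY):
    """Build the 256-entry feedback table: eight shift/xor steps per byte value."""
    table = []
    for byte in range(256):
        reg = byte
        for _ in range(8):
            if reg & 1:
                reg = (reg >> 1) ^ poly  # low bit set: shift right, fold in the polynomial
            else:
                reg >>= 1                # low bit clear: plain right shift
        table.append(reg)
    return table


CRC_TABLE = make_table()


def crc32(data, init=0xFFFFFFFF, xorout=0xFFFFFFFF):
    """Table-driven CRC-32 over a bytes object.

    init and xorout are hypothetical parameters added here to show two of
    the variations discussed; the defaults give the zip flavour.
    """
    crc = init
    for b in data:
        crc = CRC_TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ xorout  # xor, not ~, keeps the value in 32 bits


# Cross-check against the C implementation that ships in zlib.
assert crc32(b"123456789") == zlib.crc32(b"123456789")
```

With the default parameters this should agree with zlib.crc32() and with the table Jim posted. With init=0 and xorout=0, leading zero bytes no longer change the result, which is the reason Tim gives for initializing the register to all 1 bits when guarding against transmission errors.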
now-they-make-their-living-suing-.gif-bandits-ly y'rs - tim

From tim_one at email.msn.com Wed Dec 15 08:40:11 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 15 Dec 1999 02:40:11 -0500
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
In-Reply-To: <385660D0.C6C0C7B9@interet.com>
Message-ID: <000b01bf46cf$9ebe27e0$05a0143f@tim>

[JimA posts his Python rendering of Gary Brown's code]

Yup! That's the zip algorithm, right down to the absurdly bit-reversed polynomial.

> def crc32(string):
>     crc = 0xFFFFFFFF
>     for ch in string:
>         crc = crc_32_tab[((crc) ^ ord(ch)) & 0xff] ^ (((crc) >> 8) & 0xFFFFFF)
>     return ~crc

Note that the last line is better (whether in Python or C!) as

    return crc ^ 0xffffffff

Else you'll get a surprising result in a 64-bit Python, and in some 64-bit C implementations.

it's-a-32-bit-algorithm-not-an-"int"-or-"long"-one-ly y'rs - tim

From fredrik at pythonware.com Wed Dec 15 10:31:29 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 15 Dec 1999 10:31:29 +0100
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000101bf4655$94e40840$3a2d153f@tim>
Message-ID: <002601bf46e0$06e25ca0$f29b12c2@secret.pythonware.com>

> [Guido]
> > I propose to use Gary Brown's code. I'll defend this to CNRI's
> > lawyers if need be.
>
> If there's a hassle, I can do a clean-room implementation easily enough --
> although I'd rather not.

or you can grab the code from PIL, which already comes with a Python compatible license... (it's based on ISO 3307, but judging from the table James posted, it's the same thing...)

From fredrik at pythonware.com Wed Dec 15 10:39:19 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 15 Dec 1999 10:39:19 +0100
Subject: [Python-Dev] Re: [Distutils] Questions about distutils strategy
References: <000b01bf46cf$9ebe27e0$05a0143f@tim>
Message-ID: <003001bf46e0$43860b20$f29b12c2@secret.pythonware.com>

Tim Peters wrote:
> Yup! That's the zip algorithm, right down to the absurdly bit-reversed
> polynomial.

also known as ISO 3307, according to some strange comments in PIL's sources...

From jim at interet.com Wed Dec 15 16:53:34 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Wed, 15 Dec 1999 10:53:34 -0500
Subject: [Python-Dev] zipfile.py
References:
Message-ID: <3857B97E.3684224F@interet.com>

Greg Stein wrote:
> Release early. Release often. :-)

You are right of course. OK, the zipfile.py code and docs are at:

ftp://ftp.interet.com/pub/pylib.html

Despite the ftp URL, clicking on it should display the html. Please don't panic if it seems to be slow. It uses a Python CRC-32 which is slow. You may want to hack it to use zlib.crc32() if you have it.

I am testing with WinZip. If you have another zip tool, it would be interesting to see how compatible it is.

JimA

From guido at CNRI.Reston.VA.US Wed Dec 15 17:38:47 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 15 Dec 1999 11:38:47 -0500
Subject: [Python-Dev] Writers wanted for Linux Journal Python special issue
Message-ID: <199912151638.LAA02522@eric.cnri.reston.va.us>

Linux Journal is preparing a special issue devoted to Python (actually more like a pullout section or whatever I think). They are looking for writers, e.g. to write a piece about Python's history and/or an introduction. And probably anything else Python related.

If you're interested, please write to Marjorie Richardson, who is coordinating. Also direct any questions to her.

This is for the June issue which will be on newsstands mid-May and mailed to subscribers even earlier, I believe. The deadline is February 1st (magazine production takes forever!).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From akuchlin at mems-exchange.org Wed Dec 15 19:17:53 1999
From: akuchlin at mems-exchange.org (Andrew M. Kuchling)
Date: Wed, 15 Dec 1999 13:17:53 -0500 (EST)
Subject: [Python-Dev] fwd.
from Paul Prescod Message-ID: <14423.56145.877163.395736@amarok.cnri.reston.va.us> This is a forwarded e-mail from the XML-SIG mailing list, in which Paul makes some good points. Some context: I've been arguing against adding more XML stuff to the base Python distribution, because 1) it's bloat for those people don't care about XML, and 2) the Distutils is supposed to fix this by making installing things easier. Paul's response, below, has shaken my conviction a bit (*only* a bit, though). If it's deemed valuable, perhaps the XML-SIG could concentrate on the minimal set of parser + SAX + DOM that could be included in 1.6. Please join the XML-SIG to follow the specifics of this thread further, as it relates only to XML. As a more general philosophical question for python-dev: do we want to add things to 1.6 following the "batteries included" philosophy? Or should we wave in the direction of the distutils and say they'll fix the problem? (In which case they should be given high priority, as in "1.6 doesn't ship until they're done".) -- A.M. Kuchling http://starship.python.net/crew/amk/ And after all, why should I go to bed every night? Sleep is only a habit. -- Cornelius Van Horne Paul Prescod writes: >"Andrew M. Kuchling" wrote: >> >> Huh? There's obviously a good deal of stuff in there, some of it >> perhaps too esoteric, but I don't see where there's overlap. > >Well, there are several parsers and parser wrappers. How is a user >supposed to choose? And there is PyDOM, Minidom and qp_dom. > >> Or are >> you talking about Python tools in general, where there are 3 DOM >> implementations? (PyDOM, 4DOM, and ZDOM hiding inside Zope.) > >That too. > >> I lean against shoveling more stuff into 1.6; better to get the >> Distutils widely used, which makes it easier to install *all* Python >> extensions. > >I don't think that XML is any more of an "add-on" to a modern scripting >language than URL support or regular expression support. 
I'm in the >"batteries included" camp for this and several other reasons: > > * standard Python libraries may soon need XML support. If WebDAV takes >off then there should be a libWebDAV right alongside libftp and libhttp. >And libWebDAV will require XML > > * there is a difference between theory and practice. In theory, >distutils will be done soon and everything will be easy. In practice, it >is the end of 1999 and at every conference I have to install the XML sig >package on the machines of several people who haven't been able to get >it going themselves. In practice, we can't wait for distutils because >people are choosing their XML tools now. > >> >Ideally we would have one (or at most two!) implementation of each of >> >the major specs: >> >XML >SAX >Unicode >XPath >XPointer >XSLT >DOM >> >> Do you mean "one implementation of each in a single package", or "one >> implementation existing for Python, distributed separately"? > >With the possible exception of XSLT, one implementation of each *in >Python 1.6*. > >> We need to come up with a position paper for developer's day, stating >> what needs to be discussed. Suggestions? I'd propose focusing on >> getting the XML-SIG package to 1.0, but that's just an idea. > >I don't see how the XML-SIG package can ever get to 1.0. Anybody can >contribute code at anytime and thus far we've been totally flexible >about putting it in. I think that's great. It just won't ever lead to a >stable, carefully maintained, tightly interoperable package. Some of the >maintainers of the individual pieces have probably lost interest and >there is probably nobody that understands it all enough to integrate it >nicely. > >-- > Paul Prescod - ISOGEN Consulting Engineer speaking for himself > From fdrake at acm.org Wed Dec 15 20:47:01 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Wed, 15 Dec 1999 14:47:01 -0500 (EST) Subject: [Python-Dev] posix module Message-ID: <14423.61493.90107.433664@weyr.cnri.reston.va.us> Ok, I think I'm done with the posix module updates, modulo bugs and additional symbols for the *conf*() tables. That leaves us with the following status for interfaces that Andrew brought up in the message that started this spate of additions: Worth adding? ============= opendir(), readdir(), closedir() -- not added The only thing these give us that os.listdir() doesn't is the inode numbers. Unless someone actually wants those, it's not worth having. Worth adding: ============= abort() -- added ctermid(), ctermid_r() -- added fpathconf(fd, name) -- added getlogin() -- added getgroups(gidsetsize, grouplist) -- added pathconf(path, name) -- added sysconf(int name) -- added; also added confstr(int name) Not worth adding: ================= clearerr() -- not added cuserid() -- not added difftime -- not added tmpfile(), tmpnam() -- added, also tempnam() mblen(), mbstowcs(), mbtowc(), wcstombs(), wctomb() -- not added -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From jeremy at cnri.reston.va.us Wed Dec 15 20:58:16 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Wed, 15 Dec 1999 14:58:16 -0500 (EST) Subject: [Python-Dev] zipfile.py In-Reply-To: References: <3856425F.8C5E7A42@interet.com> Message-ID: <14423.62168.576273.719577@goon.cnri.reston.va.us> >>>>> "GS" == Greg Stein writes: GS> On Tue, 14 Dec 1999, James C. Ahlstrom wrote: >> Greg Stein wrote: > >> >> > Can you post zipfile.py so that people can starting reviewing >> that? >> >> Yes, it will be available by next Monday. I just want to get it >> really working and pretty, and with documentation. GS> My point was that people could possibly use it *before* GS> then. Not everybody needs it to be pretty, needs doc, or needs GS> it fully working. Maybe people would like to provide feedback GS> on the API. 
Maybe they'd like to start their own modules that GS> use your library. GS> This goes back to my years-old statement: release it now rather GS> than later -- people can always use it now, and there might not GS> be a later. Ok. I think we need some kind of zip file support in the core so that it can be used as a standard distribution format. I'd be happy if Jim's zipfile module ended up being it. We've got some zip code that we developed at CNRI; it's a bit of a mess, but it might be helpful to see what we did. Our code is at ftp://www.python.org/pub/tmp/zip.zip Jeremy From jim at interet.com Thu Dec 16 16:41:56 1999 From: jim at interet.com (James C. Ahlstrom) Date: Thu, 16 Dec 1999 10:41:56 -0500 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> Message-ID: <38590844.769C3025@interet.com> Did anyone look at this yet? ftp://ftp.interet.com/pub/pylib.html ftp://ftp.interet.com/pub/zipfile.py JimA From skip at mojam.com Thu Dec 16 16:46:28 1999 From: skip at mojam.com (Skip Montanaro) Date: Thu, 16 Dec 1999 09:46:28 -0600 (CST) Subject: [Python-Dev] zipfile.py In-Reply-To: <38590844.769C3025@interet.com> References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> Message-ID: <14425.2388.529932.61119@dolphin.mojam.com> JA> Did anyone look at this yet? JA> ftp://ftp.interet.com/pub/pylib.html JA> ftp://ftp.interet.com/pub/zipfile.py I thought it wasn't supposed to be out until Monday? You're looking for, perhaps, a time machine? ;-) (More seriously, it won't have any effect on my "gotta have this done yesterday" list, so I will let others comment...) Skip From jim at interet.com Thu Dec 16 18:16:21 1999 From: jim at interet.com (James C. Ahlstrom) Date: Thu, 16 Dec 1999 12:16:21 -0500 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> Message-ID: <38591E65.4885A39D@interet.com> "James C. 
Ahlstrom" wrote: > ftp://ftp.interet.com/pub/pylib.html I just changed zipfile.py so that regular zip compression works. And if zlib is available, its crc32() is used instead of the Python version. I should mention that the current code rejects zip files which have an archive comment added to the end. Accepting them would require a search, and I am not sure it is worth it. JimA From fdrake at acm.org Thu Dec 16 18:19:23 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu, 16 Dec 1999 12:19:23 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121 In-Reply-To: References: <199912151831.NAA02685@weyr.cnri.reston.va.us> Message-ID: <14425.7963.347400.763562@weyr.cnri.reston.va.us> [Note that Greg's message went to python-checkins since he responded to a checkin message, but I suspect he meant to change the header to point to python-dev. ;) If not, too bad!] Greg Stein writes: > But this means that your tables no long reside in "const" space. Yet More > Per-Process Memory... > > It would be nice to have those tables marked as "const". Perhaps; as Guido points out, there haven't been a lot of complaints about this issue. I will note that only the tables aren't constant; the strings that are pointed to are still constant. I'm inclined to let the compiler/ linker care about this, and not change the code without a really clear need to do so. Here are the sizes of those tables and the strings they point to (including terminating null bytes for the strings): pathconf_names: 14 entries, 112 bytes, 176 string bytes confstr_names: 25 entries, 200 bytes, 576 string bytes sysconf_names: 108 entries, 864 bytes, 1774 string bytes Figures are for Solaris7. -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives From gstein at lyra.org Thu Dec 16 19:10:14 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 10:10:14 -0800 (PST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121 In-Reply-To: <14425.7963.347400.763562@weyr.cnri.reston.va.us> Message-ID: On Thu, 16 Dec 1999, Fred L. Drake, Jr. wrote: > [Note that Greg's message went to python-checkins since he responded > to a checkin message, but I suspect he meant to change the header to > point to python-dev. ;) If not, too bad!] I didn't really care too much where it went. I would actually suggest that the Reply-To: on the checkin list is set to python-dev if that is where replies are Supposed To Go. [ I do this with mod_dav checkins; replies to dav-checkins mail goes to dav-dev. ] > Greg Stein writes: > > But this means that your tables no long reside in "const" space. Yet More > > Per-Process Memory... > > > > It would be nice to have those tables marked as "const". > > Perhaps; as Guido points out, there haven't been a lot of complaints > about this issue. > I will note that only the tables aren't constant; the strings that > are pointed to are still constant. I'm inclined to let the compiler/ > linker care about this, and not change the code without a really clear > need to do so. > Here are the sizes of those tables and the strings they point to > (including terminating null bytes for the strings): > > pathconf_names: 14 entries, 112 bytes, 176 string bytes > confstr_names: 25 entries, 200 bytes, 576 string bytes > sysconf_names: 108 entries, 864 bytes, 1774 string bytes > > Figures are for Solaris7. Ah. I just replied to that. Guess that one went to python-checkins :-) True, this is a small amount of memory. But they start to add up. non-const globals also pain me when I start to work on free-threading stuff (each must be examined to see if synchronization is needed), so reducing the number there is important. 
Regarding the memory itself: as I mentioned in the other note, I just want to ensure that Python's working set remains low (reasons given in that email). Cheers, -g -- Greg Stein, http://www.lyra.org/ From skip at mojam.com Thu Dec 16 19:09:11 1999 From: skip at mojam.com (Skip Montanaro) Date: Thu, 16 Dec 1999 12:09:11 -0600 (CST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121 In-Reply-To: References: <199912161553.KAA08428@eric.cnri.reston.va.us> Message-ID: <14425.10951.169751.843764@dolphin.mojam.com> >>>>> "Greg" == Greg Stein writes: Greg> On Thu, 16 Dec 1999, Guido van Rossum wrote: >> I don't think there's much of a need to worry about this. Why are >> you always bringing up this subject? No-one else that I know has >> ever had this concern... Greg> Somebody has to :-) Greg> Keeping the working set low is more efficient from a system Greg> standpoint. Not to mention the not-all-that-occasional-anymore requests to have Python on various itty-bitty things like Palm Pilots and WinCE devices. It's one thing to add size to modules people can live without for many applications, but I think the posix module and its other platform-specific relations are fairly heavily used. (I realize this specific example isn't likely to apply to PP/WinCE.) Skip Montanaro | http://www.mojam.com/ skip at mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From gstein at lyra.org Thu Dec 16 19:21:54 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 10:21:54 -0800 (PST) Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released In-Reply-To: <199912161527.KAA08308@eric.cnri.reston.va.us> Message-ID: On Thu, 16 Dec 1999, Guido van Rossum wrote: >... > I realize it's just a rant. In this case (distutils) your advice is > correct. (I usually paraphrase it as "release early, release often".) True. 
I prefer that phrase, too, but I used it on JimA earlier in the day or the previous day. I didn't want to sound like a broken record :-). But that is why I moved into mode... it seems like the mindset was spreading :-) I've railed at AMK for it, too :-), when he was talking about 0.5.1pre1 or whatever, rather than just releasing 0.5.1 and doing an 0.5.2 if there was a problem. > However there are other situations, like core Python itself, where > it's really useful to have stable releases -- if only for those users > who won't touch anything with "beta" in its name. I still hear from > people who haven't upgraded to 1.5.2. But this doesn't explain why there isn't a 1.5.3b1, 1.5.3b2, etc. Or 1.6.0a1 or whatever (maybe "d" or "r" for dev release, as opposed to alpha). There are some people would like the releases rather than using CVS. Some people can't even use CVS because of firewall issues. Of course, an alternative is snapshot-tarballs of the CVS repository. But a snapshot could *really* be broken; something like 1.6.0d1 says "well, it's a development release, but I've hit a good point between some changes." > I wonder if perhaps for those cases (where there's a demand for stable > releases) some other strategy could be used? Such as labeling > releases "stable" after the fact? Or what Linus seems to do with the > Linux kernel (even = stable, odd = development; or was it the other > way around?). Yes: even are stable (e.g. 1.0, 1.2, 2.0, 2.2). The odd numbers are for development. Linus is currently working 2.3.x, but declared in the past couple days that things will be wrapping up to move towards 2.4. Once he thinks it is ready, he'll start off with 2.4.0pre1, pre2, pre3... At some point the "pre" suffix will drop and 2.4.0 will be released. You might have a bit of problem using that mechanism since the current stable release is 1.5 :-). Once 1.6 hits the street, then you could start doing 1.9 releases (dev) and shift to 2.0 once it is "stable". 
Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul at prescod.net Thu Dec 16 19:02:55 1999 From: paul at prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 10:02:55 -0800 Subject: [Python-Dev] Re: [XML-SIG] Developer's Day References: <199912132354.SAA10101@amarok.cnri.reston.va.us> <3856A77C.3A4D9F00@prescod.net> <14423.49044.143333.790752@amarok.cnri.reston.va.us> <3857CEB0.C29C5F24@prescod.net> <14423.57778.131798.776845@amarok.cnri.reston.va.us> Message-ID: <3859294F.138FF398@prescod.net> "Andrew M. Kuchling" wrote: > > * Python revisions come out slowly, once every year or two. XML > standards have been revolving faster , and we don't want to wait > until 1.7 for SAX2, or DOM Level2, or other new revisions. > Keeping the modules out of the core lets them be updated at their > own pace. A counterargument is that the XML specs are slowing > down -- add namespace support to SAX, and finalize DOM > Level 2, and I don't think any other standards are very important > to basic XML programming. I agree with your counterargument. :) Anyhow, isn't there a logical fallacy in your original argument? Why can't we offer a DOM 3 module or extension after Python ships with DOM 2? > * We really want a C-based parser to be commonly available. > sgmlop is the only reasonable choice for this, because I'd be > against including Expat. To replay some arguments I made against > including the zlib library in 1.6, what if a C extension requires > a newer version of the library? Symbol conflicts if you're lucky, > hard-to-debug problems if you're not. I don't understand this issue. Why would a C extension build on sgmlop which is designed to make XML information available to *Python* programmers? > * We can drop various marginal bits of the CVS tree; the xmlarch > support is probably not of very wide interest, for example. How about "expat", "mac", "pyexpat", "utils", "windows". There is just too much stuff there! 
And I daresay that a lot of it has not been "quality controlled" to the level that we would expect if it were a part of the real Python library. In other words, there is no single place to go to get only XML-processing software that works well and works together. > I think I'm on the record as saying that Python's major problems now > aren't language-related, but are with the development environment. > Language changes (from minor, like 'for i in 1..9', to major, like > fixing the type/class dichotomy or adding static types) aren't going > to bring in piles of new users, useful though they might be to > experienced Pythoneers, large projects, or some other specific > application. (irrelevant aside: I agree 100% that making things easier to install will actually improve newbies' experience more than (e.g.) static type checking but I do not agree that it is a better "sales tool". Most people are sold based on the language and its libraries before they start trying to install extensions.) > If installing things is a problem, then we need to > buckle down and finish the distutils. So, overall, I'd still vote > against inclusion in 1.6. So are you saying that Python 2 might have only five packages and everything else must be downloaded? No httplib, no pickle, no random or math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec? When people download Python and go to the library documentation that impressive array of BUILT-IN-FEATURES is part of what sells them on Python. Hell, I can download all of that stuff for Scheme but what makes Python beautiful is that I don't have to download it for Python. It's just there. But if an XML person comes to Python after hearing us rant about how great it is for processing XML and all they find is xmllib...they will be underwhelmed. > No, it's *got* to reach 1.0.
The point of the package is that it's > exactly *one* thing to install that gives basic XML tools; you don't > need to chase down the SAX modules from Lars' page, PyExpat from > ftp.cwi.nl, sgmlop from pythonware.com, and so forth. If the > Distutils made it as easy as: > > python fetchpackage.py SAX PyExpat DOM sgmlop > > > > etc... > > then much of the need for a single package goes away, but, as you > point out, that isn't currently the case. I'm a little lost here. We need xmllib to continue because distutils doesn't do what we need yet but we don't need to put the stuff in the Python library because distutils will work well enough soon. But there is an important issue that distutils will not solve. One of the beautiful things about the Python library is that everything is at the same version level. When you install it you know that everything works together or else it WILL in the next patch level if you report the incompatibility. When the xml package gets versioned incompatibly with the Python library you don't have that safe feeling. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From akuchlin at mems-exchange.org Thu Dec 16 19:50:48 1999 From: akuchlin at mems-exchange.org (Andrew M. Kuchling) Date: Thu, 16 Dec 1999 13:50:48 -0500 (EST) Subject: [Python-Dev] Re: [XML-SIG] Developer's Day In-Reply-To: <3859294F.138FF398@prescod.net> References: <199912132354.SAA10101@amarok.cnri.reston.va.us> <3856A77C.3A4D9F00@prescod.net> <14423.49044.143333.790752@amarok.cnri.reston.va.us> <3857CEB0.C29C5F24@prescod.net> <14423.57778.131798.776845@amarok.cnri.reston.va.us> <3859294F.138FF398@prescod.net> Message-ID: <14425.13448.737831.460241@amarok.cnri.reston.va.us> (Responding to the python-dev related portion of this...) Paul Prescod writes: >I don't understand this issue.
Why would a C extension build on sgmlop >which is designed to make XML information available to *Python* >programmers? No, no; I'm arguing against shipping with Expat; sgmlop good! Consider this scenario: * Python includes Expat 1.0 * Some C library (for DAV or whatever) uses Expat 1.1 * Someone writes a Python interface to this C library and attempts to compile it statically. * Two versions of Expat in the same binary; symbol conflicts and core dumps, oh my! >So are you saying that Python 2 might have only five packages and >everything else must be downloaded? No httplib, no pickle, no random or >math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec? I'm not arguing for dropping existing packages; I'm against adding many more of them. Existing library modules can stay where they are. But I wouldn't mind a minimalist Python too much, if it came with a script fetch-basic-packages: python fetch-packages.py httplib python fetch-packages.py imaplib ... 200 more lines ... >I'm a little lost here. We need xmllib to continue because distutils >doesn't do what we need yet but we don't need to put the stuff in the >Python library because disutils will work well enough soon. Basically, yes. -- A.M. Kuchling http://starship.python.net/crew/amk/ And now let us hasten to the station. I have commanded the rain to fall at exactly one-fifteen and I would hate to get my shoes wet. -- Lord Lavender, in SEBASTIAN O #2 From bwarsaw at cnri.reston.va.us Thu Dec 16 19:50:49 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 16 Dec 1999 13:50:49 -0500 (EST) Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released References: <199912161527.KAA08308@eric.cnri.reston.va.us> Message-ID: <14425.13449.954026.960703@anthem.cnri.reston.va.us> >> I wonder if perhaps for those cases (where there's a demand for >> stable releases) some other strategy could be used? Such as >> labeling releases "stable" after the fact? 
Or what Linus seems >> to do with the Linux kernel (even = stable, odd = development; >> or was it the other way around?). I really dislike the odd/even distinction for exactly this reason. -Barry From guido at CNRI.Reston.VA.US Thu Dec 16 20:02:16 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 16 Dec 1999 14:02:16 -0500 Subject: [Python-Dev] Batteries Included? Message-ID: <199912161902.OAA11345@eric.cnri.reston.va.us> I like the batteries included approach, but I also feel resistance against including stuff I cannot maintain. The XML code base is a case in point; I don't understand enough about XML. (I just read that xmllib.py is "illegal". Jeez! What happened? Did Congress pass a law against it?) I think it may be time for separate Python distributions, like Linux -- I can concentrate on the core, and keep it really small; others can make all-encompassing distributions. There are currently some drawbacks to this approach: non-core modules have less status; and the documentation process is fundamentally different for core and non-core modules. There's also the version dependency stuff, but I think resolving that is the responsibility of the distribution makers. I think the status problem will be gone once there is a respected distribution -- then you derive status from being in that distribution, rather than from being in the core distribution. (Well, you would still derive status from being in the core, but it would be much harder to obtain, since I can set a much higher standard.) The documentation problem is the one that's left. I think the doc-sig may be on its way as we speak to solve this, though. Fred? This isn't rocket science. Red Hat Python? I'm all for it!
:-) --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at mojam.com Thu Dec 16 20:05:05 1999 From: skip at mojam.com (Skip Montanaro) Date: Thu, 16 Dec 1999 13:05:05 -0600 (CST) Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released In-Reply-To: <14425.13449.954026.960703@anthem.cnri.reston.va.us> References: <199912161527.KAA08308@eric.cnri.reston.va.us> <14425.13449.954026.960703@anthem.cnri.reston.va.us> Message-ID: <14425.14305.907618.978628@dolphin.mojam.com> >>> Or what Linus seems to do with the Linux kernel (even = stable, odd >>> = development; or was it the other way around?). BAW> I really dislike the odd/even distinction for exactly this reason. Its one saving grace is that it is a uniform format. There are no "optional" tokens like "pre", "alpha", "beta", etc. for the most part. To remember which way it is, I find it useful to execute "uname -r", check the second digit, then look down at my shirt for a pocket protector. The two pieces of information together work for me. I currently get "2.2.13-4mdk" from uname. I don't even have a pocket, let alone a pocket protector, so even numbers must be stable releases... ;-) Skip Montanaro | http://www.mojam.com/ skip at mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From fdrake at acm.org Thu Dec 16 20:05:22 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu, 16 Dec 1999 14:05:22 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules posixmodule.c,2.120,2.121 In-Reply-To: <14425.10951.169751.843764@dolphin.mojam.com> References: <199912161553.KAA08428@eric.cnri.reston.va.us> <14425.10951.169751.843764@dolphin.mojam.com> Message-ID: <14425.14322.355507.500813@weyr.cnri.reston.va.us> Skip Montanaro writes: > fairly heavily used. (I realize this specific example isn't likely to apply > to PP/WinCE.) Or any version of Windows, I suspect; perhaps Mark Hammond can elaborate.
Apparently none of the pathconf() constants are defined on that platform, at least not as #define constants. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From jcw at equi4.com Thu Dec 16 20:09:42 1999 From: jcw at equi4.com (Jean-Claude Wippler) Date: Thu, 16 Dec 1999 20:09:42 +0100 Subject: [Python-Dev] Re: [XML-SIG] Developer's Day References: <199912132354.SAA10101@amarok.cnri.reston.va.us> <3856A77C.3A4D9F00@prescod.net> <14423.49044.143333.790752@amarok.cnri.reston.va.us> <3857CEB0.C29C5F24@prescod.net> <14423.57778.131798.776845@amarok.cnri.reston.va.us> <3859294F.138FF398@prescod.net> Message-ID: <385938F6.C4164756@equi4.com> Paul Prescod wrote: [...] > (irrelevant aside: [...] Most people are sold based on the language > and its libraries before they start trying to install extensions.) > > [AMK] > > If installing things is a problem, then we need to > > buckle down and finish the distutils. So, overall, I'd still vote > > against inclusion in 1.6. > > So are you saying that Python 2 might have only five packages and > everything else must be downloaded? No httplib, no pickle, no random > or math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec? > > When people download Python and go to the library documentation that > impressive array of BUILT-IN-FEATURES is part of what sells them on > Python. Hell, I can download all of that stuff for Scheme but what > makes Python beautiful is that I don't have to download it for Python. > It's just there. But if an XML person comes to Python after hearing us > rant about how great it is for processing XML and all they find is > xmllib...they will be underwhelmed. (Nodding in agreement) Could this perhaps be solved with a large batteries-included standard distribution, plus a real easy/effective way to strip Python down and wrap things up for deployment?
In other words, aim for two very distinct goals: everything within easy reach for development + fully signed-sealed-delivered products. The first goal can evolve to do fancy net-borne distribution, even if it is a brittle process, because this is for Python developers. They want it all, so open the floodgate to give it all to them. The second becomes a matter of pruning down and wrapping up. All the way down to a single installation-less executable, if possible. I may well be wrong (and I'm not tracking distutils), but might it not be simpler to focus on 1) power users + 2) production-grade deployment, instead of trying to streamline a tangled-web-of-module-dependencies into a distribution system which tries to meet a wide range of needs? > [...] One of the beautiful things about the Python library is that > everything is at the same version level. When you install it you know > that everything works together or else it WILL in the next patch level > if you report the incompatibility. [...] More nods. So why not allow the Python distribution to become very large - with every release moving to a better-tuned combination of all the different parts (occasional mishaps can quickly be fixed)? Plus some tools to dist(ut)il(l) a turnkey solution from this big soup. Sort-of-from-violin-to-quartet-all-the-way-to-symphony-orchestra... -- Jean-Claude From gstein at lyra.org Thu Dec 16 21:02:46 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 12:02:46 -0800 (PST) Subject: [Python-Dev] zipfile.py In-Reply-To: <38590844.769C3025@interet.com> Message-ID: On Thu, 16 Dec 1999, James C. Ahlstrom wrote: > Did anyone look at this yet? > > ftp://ftp.interet.com/pub/pylib.html > > ftp://ftp.interet.com/pub/zipfile.py I went to look for it, but I think that was before you put zipfile up. Looking at it now... The writepy() as a method is questionable, I think. I think it should open the file at instantiation time. I don't see a reason to allow that to be deferred.
Especially given that some of the methods fail if open() hasn't been called. It would be good to have symbolic names for the 0 and 8 compression constants, and to fail if 8 is passed and zlib is not available (otherwise, it doesn't fail until read/write time, and with a NameError). There should probably be a __del__ that calls close(). Oh, and a "closed" attribute that can be checked and an error raised if an operation is done after the file has been closed. I think dir() should return the contents, rather than print them. read() and write() ought to fail if the mode is incorrect. Oh, some symbolic constants for things like "PK\005\006" would be nice. Do you have a ZipImporter written? Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein at lyra.org Thu Dec 16 21:12:30 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 12:12:30 -0800 (PST) Subject: [Python-Dev] Re: [XML-SIG] Developer's Day In-Reply-To: <14425.13448.737831.460241@amarok.cnri.reston.va.us> Message-ID: On Thu, 16 Dec 1999, Andrew M. Kuchling wrote: > Paul Prescod writes: > >I don't understand this issue. Why would a C extension build on sgmlop > >which is designed to make XML information available to *Python* > >programmers? > > No, no; I'm arguing against shipping with Expat; sgmlop good! > Consider this scenario: > > * Python includes Expat 1.0 > * Some C library (for DAV or whatever) uses Expat 1.1 > * Someone writes a Python interface to this C library and > attempts to compile it statically. > * Two versions of Expat in the same binary; symbol conflicts > and core dumps, oh my! We should ship pyexpat, not Expat. (IMO) > >So are you saying that Python 2 might have only five packages and > >everything else must be downloaded? No httplib, no pickle, no random or > >math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec? > > I'm not arguing for dropping existing packages; I'm against adding > many more of them. Existing library modules can stay where they are. 
> But I wouldn't mind a minimalist Python too much, if it came with a > script fetch-basic-packages: > > python fetch-packages.py httplib > python fetch-packages.py imaplib > ... 200 more lines ... Considering that it would probably use HTTP to fetch the packages, I think you wouldn't be fetching httplib :-) But yes: I agree with the basic sentiment. Cheers, -g -- Greg Stein, http://www.lyra.org/ From petrilli at amber.org Thu Dec 16 21:55:16 1999 From: petrilli at amber.org (Christopher Petrilli) Date: Thu, 16 Dec 1999 15:55:16 -0500 Subject: [Python-Dev] Batteries Included? In-Reply-To: <199912161902.OAA11345@eric.cnri.reston.va.us>; from guido@CNRI.Reston.VA.US on Thu, Dec 16, 1999 at 02:02:16PM -0500 References: <199912161902.OAA11345@eric.cnri.reston.va.us> Message-ID: <19991216155516.A28037@trump.amber.org> Guido van Rossum [guido at CNRI.Reston.VA.US] wrote: > I think it may be time for separate Python distributions, like Linux > -- I can concentrate on the core, and keep it really small; others can > make all-encompassing distributions. My fear is what we face in the Zope world---different distributions break in totally different ways, and sometimes we have to ask 30 questions to figure out what might be going wrong :/ The nice thing is that if someone installs Python from the source, we know what's going to happen. I don't know if this is solvable, honestly. > This isn't rocket science. Red Hat Python? I'm all for it! :-) I think Guido just wants to IPO and retire :-) Chris -- | Christopher Petrilli | petrilli at amber.org From gward at cnri.reston.va.us Thu Dec 16 22:03:26 1999 From: gward at cnri.reston.va.us (Greg Ward) Date: Thu, 16 Dec 1999 16:03:26 -0500 Subject: [Python-Dev] distutils-sig/python-dev crosstalk Message-ID: <19991216160325.H4289@cnri.reston.va.us> Most recent threads on distutils-sig seem to have migrated to python-dev pretty quickly.
This means that a) there are python-dev people on distutils-sig (duh), b) they think what goes on there is important enough to interest the other core developers (good!), and c) they assume there are people on python-dev who are not also on distutils-sig. Is this last assumption true? If you read python-dev, are interested in distutils issues, but do *not* read distutils-sig, please drop me a note. If no one says anything, I will (politely, tentatively) propose that we keep the distutils threads on distutils-sig and leave python-dev for, well, core Python development. If you think that the two are inextricably linked and I might as well just cross-post everything on distutils-sig to python-dev, let me know about that too. ;-) Greg -- Greg Ward - software developer gward at cnri.reston.va.us Corporation for National Research Initiatives 1895 Preston White Drive voice: +1-703-620-8990 Reston, Virginia, USA 20191-5434 fax: +1-703-620-0913 From gstein at lyra.org Thu Dec 16 22:18:50 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 13:18:50 -0800 (PST) Subject: [Python-Dev] distutils-sig/python-dev crosstalk In-Reply-To: <19991216160325.H4289@cnri.reston.va.us> Message-ID: On Thu, 16 Dec 1999, Greg Ward wrote: >... > If you think that the two are inextricably linked and I might as well > just cross-post everything on distutils-sig to python-dev, let me know > about that too. ;-) :-) I think distutils is about the mechanics. And it is a large and sophisticated problem (which is why it has a SIG :-). You could almost view it as a spinoff of the python-dev grand problem set. When we get into the question of "what does Python ship with?", then I think it belongs in python-dev, as that is a discussion of what constitutes Python itself.
Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein at lyra.org Thu Dec 16 22:21:12 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 13:21:12 -0800 (PST) Subject: [Python-Dev] distutils-sig/python-dev crosstalk In-Reply-To: <19991216160325.H4289@cnri.reston.va.us> Message-ID: On Thu, 16 Dec 1999, Greg Ward wrote: > Most recent threads on distutils-sig seem to have migrated to python-dev > pretty quickly. This means that a) there are python-dev people on > distutils-sig (duh), b) they think what goes on there is important > enough to interest the other core developers (good!), and c) they assume > there are people on python-dev who are not also on distutils-sig. Oh. One more thing. Actually, what I am somewhat worried about is whether there was relevant discussion on python-dev that should have been visible to the distutils people. Not sure if there was, but that is always a potential problem. Same with the recent xml-sig / python-dev crosstalk. Specifically, Paul Prescod is not on python-dev, so he may have missed a response or two. Cheers, -g -- Greg Stein, http://www.lyra.org/ From mal at lemburg.com Thu Dec 16 22:23:30 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 16 Dec 1999 22:23:30 +0100 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> Message-ID: <38595852.E8054741@lemburg.com> "James C. Ahlstrom" wrote: > > "James C. Ahlstrom" wrote: > > > ftp://ftp.interet.com/pub/pylib.html > > I just changed zipfile.py so that regular zip compression > works. And if zlib is available, > its crc32() is used instead of the Python version. > > I should mention that the current code rejects zip files which have > an archive comment added to the end. Accepting them would require > a search, and I am not sure it is worth it. I don't think it is needed for our purposes, but maybe a subclass could provide it ? 
FYI, I've tested the module against mxStack-0.3.0.zip which you can find on my Python Pages. It was created using Info-ZIP's zip 2.2 on Linux. Unfortunately, I always get the following traceback when trying to print the directory: >>> z.open('../projects/distribution/mxStack-0.3.0.zip','rb') >>> z.dir() File Name Modified Size Stack/mxStack/mxStack.h 1999-04-16 10:50:06 4368 Stack/mxStack/mxstdlib.h 1999-04-13 15:37:52 5433 Traceback (innermost last): File "", line 1, in ? File "/home/lemburg/lib/zipfile.py", line 120, in dir bytes = self.read(name) # Just to check CRC-32 File "/home/lemburg/lib/zipfile.py", line 133, in read bytes = zlib.decompress(bytes, -15) zlib.error: Error -5 while decompressing data Some notes on the API: ---------------------- * I would find it more convenient if the filename and mode would be constructor parameters, e.g. zfile = zipfile('myfile.zip','rb') with compression defaulting to 8 rather than 0 (most zip files will be deflated since this is the ZIP default). * Also, I would like a method much like the os.listdir() which returns a list of filenames rather than print it to stdout. * .is_zipfile() should probably be a separate function: it doesn't use any of the class' features. More wishes to come ;-) So far: Great Work ! Aside: I found that you are using undocumented arguments to zlib.compressobj() ... are these extra arguments left out of the documentation on purpose or by simple oversight ? I couldn't find them in the HTML docs and neither in the docstrings. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 15 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From gstein at lyra.org Thu Dec 16 22:32:09 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 13:32:09 -0800 (PST) Subject: [Python-Dev] zipfile.py In-Reply-To: <38595852.E8054741@lemburg.com> Message-ID: On Thu, 16 Dec 1999, M.-A. Lemburg wrote: >... 
> Some notes on the API: > ---------------------- > * I would find it more convenient if the filename and mode > would be constructor parameters, e.g. > > zfile = zipfile('myfile.zip','rb') > > with compression defaulting to 8 rather than 0 (most zip files > will be deflated since this is the ZIP default). > > * Also, I would like a method much like the os.listdir() > which returns a list of filenames rather than print it > to stdout. The above two items were in my ramble, just not as clear as MAL :-) > * .is_zipfile() should probably be a separate function: it > doesn't use any of the class' features. Ah! Good call. It is even more important to shift it out if the constructor now opens a file. Cheers, -g -- Greg Stein, http://www.lyra.org/ From fdrake at acm.org Thu Dec 16 22:33:36 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu, 16 Dec 1999 16:33:36 -0500 (EST) Subject: [Python-Dev] zipfile.py In-Reply-To: <38595852.E8054741@lemburg.com> References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> Message-ID: <14425.23216.636687.704436@weyr.cnri.reston.va.us> M.-A. Lemburg writes: > Aside: I found that you are using undocumented arguments to > zlib.compressobj() ... are these extra arguments left out of > the documentation on purpose or by simple oversight ? I couldn't > find them in the HTML docs and neither in the docstrings. The documentation is way out of date and Jeremy Hylton and Andrew Kuchling haven't updated it. I'm not sure which of them changed the signatures for that module, but I've pestered Jeremy about it a few times. If anyone would like to update the documentation, I'd certainly appreciate it. I don't know the details of those interfaces, and this is somewhere where the details are pretty critical. -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives From bwarsaw at cnri.reston.va.us Fri Dec 17 00:10:11 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 16 Dec 1999 18:10:11 -0500 (EST) Subject: [Python-Dev] Re: [Distutils] ANNOUNCE: Distutils 0.1.2 released References: <199912161527.KAA08308@eric.cnri.reston.va.us> <14425.13449.954026.960703@anthem.cnri.reston.va.us> <14425.14305.907618.978628@dolphin.mojam.com> Message-ID: <14425.29011.429867.485070@anthem.cnri.reston.va.us> >>>>> "SM" == Skip Montanaro writes: SM> To remember which way it is, I find it useful to execute SM> "uname -r", check the second digit, then look down at my shirt SM> for a pocket protector. The two pieces of information SM> together work for me. I currently get "2.2.13-4mdk" from SM> uname. I don't even have a pocket, let alone a pocket SM> protector, so even numbers must be stable releases... What do you do if it's the second Thursday after the full moon, and the local hockey team has just skated to a 3-3 tie? -Barry From mal at lemburg.com Thu Dec 16 22:53:36 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 16 Dec 1999 22:53:36 +0100 Subject: [Python-Dev] Batteries Included? References: <199912161902.OAA11345@eric.cnri.reston.va.us> Message-ID: <38595F60.7C1B34FF@lemburg.com> Guido van Rossum wrote: > > I like the batteries included approach, but I also feel resistence > against including stuff I cannot maintain. > ... > This isn't rocket science. Red Hat Python? I'm all for it! :-) I think we should wait for distutils to get up and running perfectly for everyone before taking such a step. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 15 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From gstein at lyra.org Fri Dec 17 09:31:38 1999 From: gstein at lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 00:31:38 -0800 (PST) Subject: [Python-Dev] Batteries Included? 
In-Reply-To: <38595F60.7C1B34FF@lemburg.com> Message-ID: On Thu, 16 Dec 1999, M.-A. Lemburg wrote: > Guido van Rossum wrote: > > I like the batteries included approach, but I also feel resistence > > against including stuff I cannot maintain. This is an interesting comment, and is similar to the Apache sentiment. Nothing gets added to the standard distribution unless somebody in the Group is willing to maintain it. It provides a good mechanism for keeping the module set to a reasonable size and a set that can/will actually be maintained. > > ... > > This isn't rocket science. Red Hat Python? I'm all for it! :-) > > I think we should wait for distutils to get up and running > perfectly for everyone before taking such a step. You can also operate on the assumption that it will be done by the time 1.6 is ready to be released. In other words: do the work (distutils and minimizing the release) in parallel, rather than in sequence. I would also think that a large distro isn't going to be assembled with distutils. Somebody will sit down, pull all the components together, and make a big release. However, I do see the distutils as being needed for the people who grab the minimal distro. They need it to grab add'l packages. Cheers, -g -- Greg Stein, http://www.lyra.org/ From fredrik at pythonware.com Fri Dec 17 10:06:20 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 17 Dec 1999 10:06:20 +0100 Subject: [Python-Dev] zipfile.py References: Message-ID: <022e01bf486d$fcc901d0$f29b12c2@secret.pythonware.com> James C. Ahlstrom wrote: > > Did anyone look at this yet? > > > > ftp://ftp.interet.com/pub/pylib.html > > > > ftp://ftp.interet.com/pub/zipfile.py > > I went to look for it, but I think that was before you put zipfile up. just a few comments (from reading the docs): -- it would be great if "open" could take an open file object as well as a file name. 
(in this case, you also need to document what you expect from the underlying file object: read, write, seek, tell should be enough, right? haven't looked at the code -- assuming it works, I'm only interested in the interface) -- or you could nuke "open" and pass those arguments to the constructor instead. -- I assume "open" adds "b" to the given mode argument. -- "dir" looks a bit strange. and hey, there's no "listdir" in there. I'd prefer a recursive "listdir" method, which takes an optional "depth" argument (e.g. 0=this dir, 1=this dir and first subdir, None=infinity, i.e. the full tree). that's all for now. From fredrik at pythonware.com Fri Dec 17 13:21:03 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 17 Dec 1999 13:21:03 +0100 Subject: [Python-Dev] posix module References: <14423.61493.90107.433664@weyr.cnri.reston.va.us> Message-ID: <036d01bf4889$30b63470$f29b12c2@secret.pythonware.com> > Ok, I think I'm done with the posix module updates, modulo bugs and > additional symbols for the *conf*() tables. gcc -g -O2 -I./../Include -I.. -DHAVE_CONFIG_H -c ./posixmodule.c ./posixmodule.c:3789: `_SC_AIO_LIST_MAX' undeclared here (not in a function) ./posixmodule.c:3789: initializer element for `posix_constants_sysconf[10].value' is not constant make[1]: *** [posixmodule.o] Error 1 make[1]: Leaving directory `/data/repository/BleedingEdge/python/dist/src/Modules' (current CVS stuff, on Red Hat 5.2) From jim at interet.com Fri Dec 17 15:33:31 1999 From: jim at interet.com (James C. Ahlstrom) Date: Fri, 17 Dec 1999 09:33:31 -0500 Subject: [Python-Dev] zipfile.py References: Message-ID: <385A49BB.4D064240@interet.com> Greg Stein wrote: > > On Thu, 16 Dec 1999, James C. Ahlstrom wrote: > > Did anyone look at this yet? > > > > ftp://ftp.interet.com/pub/pylib.html > > > > ftp://ftp.interet.com/pub/zipfile.py > > Looking at it now... The writepy() as a method is questionable, I think. > I think it should open the file at instantiation time. 
I don't see a > reason to allow that to be deferred. Especially given that some of the > methods fail if open() hasn't been called. I eliminated open and added its args to the constructor. > It would be good to have > symbolic names for the 0 and 8 compression constants, and to fail if 8 is > passed and zlib is not available (otherwise, it doesn't fail until > read/write time, and with a NameError). There should probably be a > __del__ that calls close(). Oh, and a "closed" attribute that can be > checked and an error raised if an operation is done after the file has > been closed. All done. > I think dir() should return the contents, rather than print > them. I added listdir() and documented self.TOC. I kept printdir() as example code. > read() and write() ought to fail if the mode is incorrect. Oh, some > symbolic constants for things like "PK\005\006" would be nice. All done. JimA From guido at CNRI.Reston.VA.US Fri Dec 17 15:43:23 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 17 Dec 1999 09:43:23 -0500 Subject: [Python-Dev] Batteries Included? In-Reply-To: Your message of "Thu, 16 Dec 1999 22:53:36 +0100." <38595F60.7C1B34FF@lemburg.com> References: <199912161902.OAA11345@eric.cnri.reston.va.us> <38595F60.7C1B34FF@lemburg.com> Message-ID: <199912171443.JAA12414@eric.cnri.reston.va.us> > Guido van Rossum wrote: > > > > I like the batteries included approach, but I also feel resistence > > against including stuff I cannot maintain. > > ... > > This isn't rocket science. Red Hat Python? I'm all for it! :-) MAL: > I think we should wait for distutils to get up and running > perfectly for everyone before taking such a step. Fair enough -- but in the mean time, no more pushing for new modules in the core distribution (distutils excluded). 
--Guido van Rossum (home page: http://www.python.org/~guido/) From gward at cnri.reston.va.us Fri Dec 17 15:59:09 1999 From: gward at cnri.reston.va.us (Greg Ward) Date: Fri, 17 Dec 1999 09:59:09 -0500 Subject: [Python-Dev] Batteries Included? In-Reply-To: <199912171443.JAA12414@eric.cnri.reston.va.us>; from guido@cnri.reston.va.us on Fri, Dec 17, 1999 at 09:43:23AM -0500 References: <199912161902.OAA11345@eric.cnri.reston.va.us> <38595F60.7C1B34FF@lemburg.com> <199912171443.JAA12414@eric.cnri.reston.va.us> Message-ID: <19991217095908.B8799@cnri.reston.va.us> On 17 December 1999, Guido van Rossum said: > Fair enough -- but in the mean time, no more pushing for new modules > in the core distribution (distutils excluded). So anyone who wants a new module snuck into the core just has to convince me to add it to the distutils package, right? >snicker< Greg From jeremy at cnri.reston.va.us Fri Dec 17 19:30:37 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Fri, 17 Dec 1999 13:30:37 -0500 (EST) Subject: [Python-Dev] Batteries Included? In-Reply-To: <199912171443.JAA12414@eric.cnri.reston.va.us> References: <199912161902.OAA11345@eric.cnri.reston.va.us> <38595F60.7C1B34FF@lemburg.com> <199912171443.JAA12414@eric.cnri.reston.va.us> Message-ID: <14426.33101.757523.853781@goon.cnri.reston.va.us> >>>>> "GvR" == Guido van Rossum writes: >> Guido van Rossum wrote: I like the batteries included >> approach, but I also feel resistance against including stuff I >> cannot maintain. ... This isn't rocket science. Red Hat >> Python? I'm all for it! :-) >> MAL wrote: >> I think we should wait for distutils to get up and running >> perfectly for everyone before taking such a step. GvR> Fair enough -- but in the mean time, no more pushing for new GvR> modules in the core distribution (distutils excluded).
Perhaps the right long-term solution (post-distutils) is to split Python into a core architected by Guido and a bazaar-style standard library maintained in a more apache-style. Jeremy From jim at interet.com Fri Dec 17 16:25:10 1999 From: jim at interet.com (James C. Ahlstrom) Date: Fri, 17 Dec 1999 10:25:10 -0500 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> Message-ID: <385A55D6.A8A05EB9@interet.com> "M.-A. Lemburg" wrote: > Unfortunately, I always get the following traceback when trying > to print the directory: OK, I changed the decompress code (10:23 AM), please re-try. > with compression defaulting to 8 rather than 0 (most zip files > will be deflated since this is the ZIP default). The compress mode only applies to writing. On read, the method recorded in the file controls. JimA From jim at interet.com Fri Dec 17 15:49:20 1999 From: jim at interet.com (James C. Ahlstrom) Date: Fri, 17 Dec 1999 09:49:20 -0500 Subject: [Python-Dev] zipfile.py References: <022e01bf486d$fcc901d0$f29b12c2@secret.pythonware.com> Message-ID: <385A4D70.A162C584@interet.com> Fredrik Lundh wrote: > > James C. Ahlstrom wrote: > > > > > > ftp://ftp.interet.com/pub/pylib.html > -- it would be great if "open" could take an open file > object as well as a file name. I put these arguments into the constructor now. > (in this case, you also need to document what you > expect from the underlying file object: read, write, > seek, tell should be enough, right? haven't looked > at the code -- assuming it works, I'm only interested > in the interface) OK, docs updated. > -- I assume "open" adds "b" to the given mode argument. Correct. The mode can be either "w" or "wb" etc., and it works. > -- "dir" looks a bit strange. and hey, there's no "listdir" > in there. I'd prefer a recursive "listdir" method, which > takes an optional "depth" argument (e.g. 
0=this dir, > 1=this dir and first subdir, None=infinity, i.e. the full > tree). I added a plain listdir() and changed dir() to printdir(). I also documented self.TOC which gets you the values too. JimA From jim at interet.com Fri Dec 17 15:39:51 1999 From: jim at interet.com (James C. Ahlstrom) Date: Fri, 17 Dec 1999 09:39:51 -0500 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> Message-ID: <385A4B37.333B9443@interet.com> "M.-A. Lemburg" wrote: > > "James C. Ahlstrom" wrote: > > > ftp://ftp.interet.com/pub/pylib.html > > > Unfortunately, I always get the following traceback when trying > to print the directory: Yes, compression isn't there yet. I am looking into it. > Some notes on the API: > ---------------------- > * I would find it more convenient if the filename and mode > would be constructor parameters, e.g. > > zfile = zipfile('myfile.zip','rb') OK, done. > with compression defaulting to 8 rather than 0 (most zip files > will be deflated since this is the ZIP default). Until compression works, and zlib ships with Python I would rather default to no compression (method 0). Otherwise this is not useful as a Python import archive. > * Also, I would like a method much like the os.listdir() > which returns a list of filenames rather than print it > to stdout. OK, done. > * .is_zipfile() should probably be a separate function: it > doesn't use any of the class' features. OK, done. > Aside: I found that you are using undocumented arguments to > zlib.compressobj() ... are these extra arguments left out of > the documentation on purpose or by simple oversight ? I couldn't > find them in the HTML docs and neither in the docstrings. I am following the CNRI code blindly here. I don't have docs either. 
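A minimal sketch of the API shape agreed on above: file name and mode as constructor arguments, an is_zipfile() function at module level, and a call that returns the member names instead of printing them. It is written against the zipfile module as it later shipped in the Python standard library, which is an assumption; the 1999 draft at the FTP URL may have differed. In particular, the shipped module names the listing method namelist(), where this thread calls it listdir().

```python
import io
import zipfile

# Build a small archive in memory. The first constructor argument may be
# a file name or an already-open file object (here a BytesIO buffer).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as archive:
    archive.writestr("pkg/__init__.py", "")
    archive.writestr("pkg/mod.py", "VALUE = 42\n")

# is_zipfile() is a module-level function, not a method of the class.
buffer.seek(0)
looks_like_zip = zipfile.is_zipfile(buffer)

# namelist() returns the member names as a list; printdir() is still
# there for a human-readable listing on stdout.
buffer.seek(0)
with zipfile.ZipFile(buffer, "r") as archive:
    names = archive.namelist()
    source = archive.read("pkg/mod.py")
```

The underlying file object only needs read, write, seek and tell, which is why an in-memory buffer works here.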
JimA From jack at oratrix.nl Fri Dec 17 23:54:03 1999 From: jack at oratrix.nl (Jack Jansen) Date: Fri, 17 Dec 1999 23:54:03 +0100 Subject: [Python-Dev] Batteries Included? In-Reply-To: Message by Jeremy Hylton , Fri, 17 Dec 1999 13:30:37 -0500 (EST) , <14426.33101.757523.853781@goon.cnri.reston.va.us> Message-ID: <19991217225408.D9AF3EDD20@oratrix.oratrix.nl> Recently, Jeremy Hylton said: > Perhaps the right long-term solution (post-distutils) is to split > Python into a core architected by Guido and a bazaar-style standard > library maintained in a more apache-style. I can't help feeling uncomfortable with this. I've had quite some work to get an Apache with SSL up and running, even though someone gave me quite precise instructions. With Perl I fared even worse, despite their distutils-like package, when I wanted to try a PalmPilot package for Unix that needed Perl. I finally had to give up after quite some effort because the addon installers kept finding the older version of Perl that the system mgr had installed instead of my newer version. I think distutils will be wonderful for us, the Python community, but something more RedHattish is needed for the general world who just want Python plus a certain set of extensions because some application needs it, so they can just download a fresh copy of ParrotPython 3.4.4 and know the application will work, without interfering with another application that happens to use Inquisition 1a5 and lives elsewhere on the disk. And maybe the answer is a much simpler freezing process, like MacPython BuildApplication where any Python user can drop a script on it and end up with a fully self-contained app guaranteed (well.... No reports to the contrary have been heard so far, at least:-) to contain everything needed and not interfere with an existing MacPython installation (or be interfered with by it).
Then a popular app will have prebuilt binaries available for all platforms quickly, made by the Python community, and the enduser interested in the app but not in Python can simply download that. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From mal at lemburg.com Sat Dec 18 14:17:52 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Sat, 18 Dec 1999 14:17:52 +0100 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A4B37.333B9443@interet.com> Message-ID: <385B8980.11CDE9AC@lemburg.com> "James C. Ahlstrom" wrote: > > "M.-A. Lemburg" wrote: > > > > "James C. Ahlstrom" wrote: > > > > ftp://ftp.interet.com/pub/pylib.html > > > > > > Unfortunately, I always get the following traceback when trying > > to print the directory: > > Yes, compression isn't there yet. I am looking into it. Great :-) > > Some notes on the API: > > ---------------------- > > * I would find it more convenient if the filename and mode > > would be constructor parameters, e.g. > > > > zfile = zipfile('myfile.zip','rb') > > OK, done. > > > with compression defaulting to 8 rather than 0 (most zip files > > will be deflated since this is the ZIP default). > > Until compression works, and zlib ships with Python I > would rather default to no compression (method 0). Otherwise > this is not useful as a Python import archive. Point taken. Perhaps it would be even better to not have a default at all: that way people will have to think about the issue *before* implementing it, rather than debug code that produces tracebacks. > > * Also, I would like a method much like the os.listdir() > > which returns a list of filenames rather than print it > > to stdout. > > OK, done. 
> > > * .is_zipfile() should probably be a separate function: it > > doesn't use any of the class' features. > > OK, done. Thanks, -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 13 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal at lemburg.com Sat Dec 18 16:16:44 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Sat, 18 Dec 1999 16:16:44 +0100 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> Message-ID: <385BA55C.9DFCA88D@lemburg.com> "James C. Ahlstrom" wrote: > > "M.-A. Lemburg" wrote: > > > Unfortunately, I always get the following traceback when trying > > to print the directory: > > OK, I changed the decompress code (10:23 AM), please re-try. Everything is fine now... it's really impressive how easy you can manipulate ZIP files with it. One thing I'd suggest is to include some way to delete and update contents, e.g. the write() method should overwrite any existing entry in the archive (if it not already does -- I haven't tested it, just read the code and it seems to raise an exception), plus maybe a .remove() method which deletes an entry. > > with compression defaulting to 8 rather than 0 (most zip files > > will be deflated since this is the ZIP default). > > The compress mode only applies to writing. On read, the > method recorded in the file controls. True. How about making the compression argument mandatory for file opened in 'wb' mode only ?
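A short sketch of the point about compression applying only to writing, assuming the zipfile module as it later shipped in the standard library (where the constants ZIP_STORED and ZIP_DEFLATED carry the values 0 and 8 discussed here). The helper names write_archive and recorded_method are hypothetical, and the deflated case needs zlib to be available.

```python
import io
import zipfile

def write_archive(compression):
    """Return the bytes of a one-member archive written with `compression`."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=compression) as archive:
        archive.writestr("log.txt", "spam " * 200)
    return buffer.getvalue()

def recorded_method(data):
    """Open an archive without naming a method and report what it recorded."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.getinfo("log.txt").compress_type

stored = write_archive(zipfile.ZIP_STORED)      # method 0, no zlib needed
deflated = write_archive(zipfile.ZIP_DEFLATED)  # method 8, requires zlib
```

The reader never passes a compression argument; the method stored in each member's header decides how it is decoded, so only the writing side has a choice to make.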
-- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 13 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From da at ski.org Sat Dec 18 18:35:00 1999 From: da at ski.org (David Ascher) Date: Sat, 18 Dec 1999 09:35:00 -0800 Subject: [Python-Dev] Year 2000 O'Reilly Python Conference Message-ID: <003501bf497e$368f6f60$e655cfc0@ski.org> I just got off the phone with someone at O'Reilly, who is starting to plan the next O'Reilly Open Source Convention. I've agreed to be the chair of the Python conference, just so that there are no delays in getting the conference organized. If someone feels that I should not be chair, speak now and we can figure out who takes the 'job'. There are short-term and long-term issues to discuss: Short term: - We need a program committee -- If you're interested in being on said committee or know someone who should be, let me know. I'd like to get representatives from various subconstituencies on there (web types, zope types, business types, scientist types, linux types, hackers, etc.) - The call for papers is going on the O'Reilly website soon. I will try and get them to pass things by me first, but if we want to emphasize specific kinds of paper submissions, we need to decide that soon. - Greg or Barry, is it possible for one of you to setup a mailman mailing list which will be used by the program committee? eGroups is easy for me to setup, but lots of people hated it last year. I don't want to pollute python-dev with conference discussions. Longer term: - The schedule for the conference is (supposedly) going to be the same as last year. conference-wide keynotes at the beginning of both days, and 4x90minute segments. - We have two parallel tracks - We have 4 half-day tutorial slots - All of the paper materials have to be 'in' by March 1. We need to decide how much time we need to go through the review/revision process ourselves. 
In other words, the deadline for submissions is up to us, but we don't have that much time. --david ascher From jeremy at cnri.reston.va.us Sat Dec 18 23:39:58 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Sat, 18 Dec 1999 17:39:58 -0500 (EST) Subject: [Python-Dev] zipfile.py In-Reply-To: <385A4B37.333B9443@interet.com> References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A4B37.333B9443@interet.com> Message-ID: <14428.3390.671438.663889@bitdiddle.cnri.reston.va.us> >>>>> "JCA" == James C Ahlstrom writes: >> Aside: I found that you are using undocumented arguments to >> zlib.compressobj() ... are these extra arguments left out of the >> documentation on purpose or by simple oversight ? I couldn't find >> them in the HTML docs and neither in the docstrings. JCA> I am following the CNRI code blindly here. I don't have docs JCA> either. The docs for the zlib module are quite out of date, although I think the docstrings may be better (not necessarily completely up-to-date though :-). The specific parameters to pass to zlib don't seem to be documented anywhere either; IIRC I dug them out of some example C code somewhere that used zlib to read Zip files. Jeremy From gstein at lyra.org Sun Dec 19 00:14:02 1999 From: gstein at lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 15:14:02 -0800 (PST) Subject: [Python-Dev] Year 2000 O'Reilly Python Conference In-Reply-To: <003501bf497e$368f6f60$e655cfc0@ski.org> Message-ID: On Sat, 18 Dec 1999, David Ascher wrote: >... > - Greg or Barry, is it possible for one of you to setup a mailman mailing > list which will be used by the program committee? eGroups is easy for me to > setup, but lots of people hated it last year. I don't want to pollute > python-dev with conference discussions. Done. ora-pc at pythonpros.com.
http://mailman.pythonpros.com/mailman/listinfo/ora-pc I also removed the old monterey-speakers mailing list :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From da at ski.org Sun Dec 19 08:24:51 1999 From: da at ski.org (David Ascher) Date: Sat, 18 Dec 1999 23:24:51 -0800 Subject: [Python-Dev] Year 2000 O'Reilly Python Conference References: Message-ID: <013301bf49f2$243946f0$df55cfc0@ski.org> From: Greg Stein > On Sat, 18 Dec 1999, David Ascher wrote: > >... > > - Greg or Barry, is it possible for one of you to setup a mailman mailing > > list which will be used by the program committee? > Done. ora-pc at pythonpros.com. > http://mailman.pythonpros.com/mailman/listinfo/ora-pc Thanks, Greg. Now, folks, please consider joining the program committee. We need a few volunteers - not too many, but somewhere between 5 and 10 would be good. You don't even have to commit to making it to the conference, if that's a concern. -- david From jim at interet.com Mon Dec 20 15:18:17 1999 From: jim at interet.com (James C. Ahlstrom) Date: Mon, 20 Dec 1999 09:18:17 -0500 Subject: [Python-Dev] zipfile.py References: Message-ID: <385E3AA9.162BE568@interet.com> Greg Stein wrote: > Do you have a ZipImporter written? Yes, it is ftp://ftp.interet.com/pub/importer.py JimA From jim at interet.com Mon Dec 20 15:35:58 1999 From: jim at interet.com (James C. Ahlstrom) Date: Mon, 20 Dec 1999 09:35:58 -0500 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> Message-ID: <385E3ECE.F8DCDE28@interet.com> "M.-A. Lemburg" wrote: > One thing I'd suggest is to include some way to delete and > update contents, e.g.
the write() method should overwrite > any existing entry in the archive (if it not already does -- > I haven't tested it, just read the code and it seems to raise > an exception), plus maybe a .remove() method which deletes > an entry. Currently, adding a file requires the "a" append mode, while the "w" mode re-writes the file. Adding a duplicate file name produces an error message. I can change this, but removing a file would either waste space, or else the file contents must be copied over the old file and all the offsets updated. I don't like this because it is complicated, and I think it is fast enough to just re-write the archive. But it could be added if people want. > True. How about making the compression argument mandatory > for file opened in 'wb' mode only ? The default of zero provides a little guidance that you should use zero. I added a warning message if 8 is used which should discourage people from using 8. Or I could disallow 8. Is that OK? JimA From jim at interet.com Mon Dec 20 16:34:02 1999 From: jim at interet.com (James C. Ahlstrom) Date: Mon, 20 Dec 1999 10:34:02 -0500 Subject: [Python-Dev] Batteries Included? References: <19991217225408.D9AF3EDD20@oratrix.oratrix.nl> Message-ID: <385E4C6A.BEC0F728@interet.com> Jack Jansen wrote: > And maybe the answer is a much simpler freezing process, like > MacPython BuildApplication where any Python user can drop a script on > it and end up with a fully self-contained app guaranteed (well.... No > reports to the contrary have been heard so far, at least:-) to contain > everything needed and not interfere with an existing MacPython > installation (or be interfered with by it). Then a popular app will > have prebuilt binaries available for all platforms quickly, made by > the Python community, and the enduser interested in the app but not in > Python can simply download that. IMHO the "much simpler freezing process" is archive files. 
A simple script can build them, imputil can import them, and the only remaining problem is to find them. Please see: ftp://ftp.interet.com/pub/bootmodule.html ftp://ftp.interet.com/pub/pylib.html JimA From jack at oratrix.nl Mon Dec 20 17:50:32 1999 From: jack at oratrix.nl (Jack Jansen) Date: Mon, 20 Dec 1999 17:50:32 +0100 Subject: [Python-Dev] Batteries Included? In-Reply-To: Message by "James C. Ahlstrom" , Mon, 20 Dec 1999 10:34:02 -0500 , <385E4C6A.BEC0F728@interet.com> Message-ID: <19991220165032.B9E8B370CF2@snelboot.oratrix.nl> > IMHO the "much simpler freezing process" is archive files. A simple > script can build them, imputil can import them, and the only > remaining problem is to find them. Please see: Archive files solves the problem for Python modules. But that leaves the problem of dynamically loaded modules. And resources for dialogs and such, if you use native GUI stuff on Mac or Windows. And most serious applications that I've seen (GRiNS and Zope, to name two, Mailman is the only exception I can think of) depend on non-standard plugin modules. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From mal at lemburg.com Mon Dec 20 15:44:42 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Mon, 20 Dec 1999 15:44:42 +0100 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com> Message-ID: <385E40DA.37AD704F@lemburg.com> "James C. Ahlstrom" wrote: > > "M.-A. Lemburg" wrote: > > > One thing I'd suggest is to include some way to delete and > > update contents, e.g.
the write() method should overwrite > > any existing entry in the archive (if it not already does -- > > I haven't tested it, just read the code and it seems to raise > > an exception), plus maybe a .remove() method which deletes > > an entry. > > Currently, adding a file requires the "a" append mode, while > the "w" mode re-writes the file. Adding a duplicate file name > produces an error message. I can change this, > but removing a file would either waste space, or else the file > contents must be copied over the old file and all the offsets > updated. I don't like this because it is complicated, and I think > it is fast enough to just re-write the archive. But it > could be added if people want. I guess it would be ok to waste space. You could provide a .cleanup() or .rewrite() method that takes care of reorganizing the file to fill up the gaps. > > True. How about making the compression argument mandatory > > for file opened in 'wb' mode only ? > > The default of zero provides a little guidance that you should > use zero. I added a warning message if 8 is used which should > discourage people from using 8. Or I could disallow 8. > Is that OK? Well the module seems to work just fine with compression on, so disallowing it or issuing a warning would reduce its value, IMHO. How about making compression a boolean value and then converting any true value to 8 ? -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 11 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From fdrake at acm.org Mon Dec 20 19:52:41 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Mon, 20 Dec 1999 13:52:41 -0500 (EST) Subject: [Python-Dev] posix module In-Reply-To: <036d01bf4889$30b63470$f29b12c2@secret.pythonware.com> References: <14423.61493.90107.433664@weyr.cnri.reston.va.us> <036d01bf4889$30b63470$f29b12c2@secret.pythonware.com> Message-ID: <14430.31481.402469.896400@weyr.cnri.reston.va.us> Fredrik Lundh writes: > (current CVS stuff, on Red Hat 5.2) Ok, Guido figured it out; this is a typo in the header /usr/include/confname.h; the enum and the #define don't have the same name. Do you know a way to detect the Linux kernel version using pre-preprocessor macros? (Seems very fragile.) Would it be reasonable to only add that table entry for kernel versions >= 2.2? -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From jim at interet.com Mon Dec 20 20:25:27 1999 From: jim at interet.com (James C. Ahlstrom) Date: Mon, 20 Dec 1999 14:25:27 -0500 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com> <385E40DA.37AD704F@lemburg.com> Message-ID: <385E82A7.72345807@interet.com> "M.-A. Lemburg" wrote: > I guess it would be ok to waste space. You could provide > a .cleanup() or .rewrite() method that takes care of > reorganizing the file to fill up the gaps. OK, adding a duplicate name replaces the old file. > Well the module seems to work just fine with compression > on, so disallowing it or issuing a warning would reduce its value, > IMHO. Yes compression works, but 90% of Python installations don't have zlib, so it is an ERROR to create archives with compression when these archives are distributed to other sites. > How about making compression a boolean value and then > converting any true value to 8 ? It would close the door to future or other compression methods. 
Currently the method must be 0 or 8 or a traceback will result. JimA From jim at interet.com Mon Dec 20 20:33:11 1999 From: jim at interet.com (James C. Ahlstrom) Date: Mon, 20 Dec 1999 14:33:11 -0500 Subject: [Python-Dev] Batteries Included? References: <19991220165032.B9E8B370CF2@snelboot.oratrix.nl> Message-ID: <385E8477.F727E0F8@interet.com> Jack Jansen wrote: > Archive files solves the problem for Python modules. But that leaves the > problem of dynamically loaded modules. And resources for dialogs and such, if > you use native GUI stuff on Mac or Windows. Point taken. For dynamically loaded modules, I believe in following the native system's DLL path, and not adding eccentric Python logic. But many disagreed a couple of weeks ago when I raised this. For resources, I think the archive file can accommodate this, although it seems highly system dependent. Anyway, any file at all can live in the archive and the import mechanism for *.pyc will not be damaged nor unduly slowed down by its presence. JimA From gstein at lyra.org Mon Dec 20 21:11:50 1999 From: gstein at lyra.org (Greg Stein) Date: Mon, 20 Dec 1999 12:11:50 -0800 (PST) Subject: [Python-Dev] zipfile.py In-Reply-To: <385E82A7.72345807@interet.com> Message-ID: On Mon, 20 Dec 1999, James C. Ahlstrom wrote: > "M.-A. Lemburg" wrote: > > I guess it would be ok to waste space. You could provide > > a .cleanup() or .rewrite() method that takes care of > > reorganizing the file to fill up the gaps. > > OK, adding a duplicate name replaces the old file. But it shouldn't print a warning(!). If an application wants to replace a file, then stuff shouldn't appear on stdout as a result. > > Well the module seems to work just fine with compression > > on, so disallowing it or issuing a warning would reduce its value, > > IMHO. > > Yes compression works, but 90% of Python installations don't have > zlib, so it is an ERROR to create archives with compression when > these archives are distributed to other sites.
While it may be problem to distribute them to other sites, that is not up to the library. If I want compression, then I should get compression. A library module should not determine application-level policy. The warning that __init__ prints shouldn't be there. Really: there should not be a single "print" in the library (well, printdir() is fine... that's what it is supposed to do; printing in the test code would be fine). In normal, or even exceptional(!), operation there should never be a print. > > How about making compression a boolean value and then > > converting any true value to 8 ? > > It would close the door to future or other compression methods. > Currently the method must be 0 or 8 or a traceback will result. I definitely agree with JimA here. For example, maybe we want bzip compression in there. Sure, non-portable, but that's my problem :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From jim at interet.com Mon Dec 20 21:50:46 1999 From: jim at interet.com (James C. Ahlstrom) Date: Mon, 20 Dec 1999 15:50:46 -0500 Subject: [Python-Dev] zipfile.py References: Message-ID: <385E96A6.40CCF285@interet.com> Greg Stein wrote: > > On Mon, 20 Dec 1999, James C. Ahlstrom wrote: > > "M.-A. Lemburg" wrote: > But it shouldn't print a warning(!). If an application wants to replace a > file, then stuff shouldn't appear on stdout as a result. OK, no warning. > The warning that __init__ prints shouldn't be there. OK, it is gone. > Really: there should not be a single "print" in the library (well, No print unless _debug > 0 JimA From mal at lemburg.com Mon Dec 20 22:16:39 1999 From: mal at lemburg.com (M.-A. 
Lemburg) Date: Mon, 20 Dec 1999 22:16:39 +0100 Subject: [Python-Dev] zipfile.py References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com> <385E40DA.37AD704F@lemburg.com> <385E82A7.72345807@interet.com> Message-ID: <385E9CB7.5DE4848A@lemburg.com> "James C. Ahlstrom" wrote: > > "M.-A. Lemburg" wrote: > > > I guess it would be ok to waste space. You could provide > > a .cleanup() or .rewrite() method that takes care of > > reorganizing the file to fill up the gaps. > > OK, adding a duplicate name replaces the old file. Cool. > > Well the module seems to work just fine with compression > > on, so disallowing it or issuing a warning would reduce its value, > > IMHO. > > Yes compression works, but 90% of Python installations don't have > zlib, so it is an ERROR to create archives with compression when > these archives are distributed to other sites. Sure, for the sake of creating Python code archives, but your module is much more versatile: e.g. I could automatically create ZIP archives of log files or sets of other files and then have Python email them to someone who uses these archives through standard tools such as WinZip -- the target doesn't always have to be a Python process :-) > > How about making compression a boolean value and then > > converting any true value to 8 ? > > It would close the door to future or other compression methods. > Currently the method must be 0 or 8 or a traceback will result. Ok. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 11 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From jim at interet.com Mon Dec 20 22:37:20 1999 From: jim at interet.com (James C. 
Ahlstrom)
Date: Mon, 20 Dec 1999 16:37:20 -0500
Subject: [Python-Dev] zipfile.py
References: <3857B97E.3684224F@interet.com> <38590844.769C3025@interet.com> <38591E65.4885A39D@interet.com> <38595852.E8054741@lemburg.com> <385A55D6.A8A05EB9@interet.com> <385BA55C.9DFCA88D@lemburg.com> <385E3ECE.F8DCDE28@interet.com> <385E40DA.37AD704F@lemburg.com> <385E82A7.72345807@interet.com> <385E9CB7.5DE4848A@lemburg.com>
Message-ID: <385EA190.6AF511BD@interet.com>

"M.-A. Lemburg" wrote:
>
> Sure, for the sake of creating Python code archives, but
> your module is much more versatile: e.g. I could automatically
> create ZIP archives of log files or sets of other files and

OK, zipfile.py no longer complains about compression != 0

JimA

From fdrake at acm.org Tue Dec 21 23:42:26 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue, 21 Dec 1999 17:42:26 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: <199912212238.RAA13660@eric.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us>
Message-ID: <14432.594.33416.600794@weyr.cnri.reston.va.us>

Guido van Rossum writes:
> +
> + class GetoptError(Exception):
> +     opt = ''
> +     msg = ''
> +     def __init__(self, *args):
> +         self.args = args
> +         if len(args) == 1:
> +             self.msg = args[0]
> +         elif len(args) == 2:
> +             self.msg = args[0]
> +             self.opt = args[1]
> +
> +     def __str__(self):
> +         return self.msg
>
> ! error = GetoptError # backward compatibility

This breaks as soon as the standard exceptions are strings; does this mean -X will be removed in the next release? (Please????)

-Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From bwarsaw at cnri.reston.va.us Tue Dec 21 23:44:46 1999
From: bwarsaw at cnri.reston.va.us (Barry A.
Warsaw)
Date: Tue, 21 Dec 1999 17:44:46 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us>
Message-ID: <14432.734.155183.508785@anthem.cnri.reston.va.us>

>>>>> "Fred" == Fred L Drake, Jr writes:

Fred> This breaks as soon as the standard exceptions are
Fred> strings; does this mean -X will be removed in the next
Fred> release? (Please????)

Pretty please? :)

From guido at CNRI.Reston.VA.US Wed Dec 22 00:05:28 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 21 Dec 1999 18:05:28 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Tue, 21 Dec 1999 17:42:26 EST." <14432.594.33416.600794@weyr.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us>
Message-ID: <199912212305.SAA13722@eric.cnri.reston.va.us>

> Guido van Rossum writes:
> > +
> > + class GetoptError(Exception):
> > +     opt = ''
> > +     msg = ''
> > +     def __init__(self, *args):
> > +         self.args = args
> > +         if len(args) == 1:
> > +             self.msg = args[0]
> > +         elif len(args) == 2:
> > +             self.msg = args[0]
> > +             self.opt = args[1]
> > +
> > +     def __str__(self):
> > +         return self.msg
> >
> > ! error = GetoptError # backward compatibility

[Fred Drake]
> This breaks as soon as the standard exceptions are strings; does
> this mean -X will be removed in the next release? (Please????)

Not a bad idea. Anybody got a reason why -X should stay?

(The next step would be to outlaw raise with a string argument; I think I can't make that for 1.6. But it would be a good idea to scan the standard library for string exceptions and convert all of them.)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From bwarsaw at cnri.reston.va.us Wed Dec 22 00:21:38 1999
From: bwarsaw at cnri.reston.va.us (Barry A.
Warsaw) Date: Tue, 21 Dec 1999 18:21:38 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> Message-ID: <14432.2946.857539.898577@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> Anybody got a reason why -X should stay? Kill it. Guido> (The next step would be to outlaw raise with a string Guido> argument; I think I can't make that for 1.6. But it would Guido> be a good idea to scan the standard library for string Guido> exceptions and convert all of them.) Or require that exception classes be derived from exceptions.Exception :) -Barry From guido at CNRI.Reston.VA.US Wed Dec 22 00:23:29 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 21 Dec 1999 18:23:29 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Your message of "Tue, 21 Dec 1999 18:21:38 EST." <14432.2946.857539.898577@anthem.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> Message-ID: <199912212323.SAA13803@eric.cnri.reston.va.us> [Barry] > Guido> Anybody got a reason why -X should stay? > > Kill it. You already said that. Anybody else? > Guido> (The next step would be to outlaw raise with a string > Guido> argument; I think I can't make that for 1.6. But it would > Guido> be a good idea to scan the standard library for string > Guido> exceptions and convert all of them.) > > Or require that exception classes be derived from exceptions.Exception > :) That's hard to require. But it could easily be a requirement checked by one of the hypothetical typecheckers that are being discussed in the types-sig. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw at cnri.reston.va.us Wed Dec 22 00:27:31 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Tue, 21 Dec 1999 18:27:31 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> Message-ID: <14432.3299.404561.698836@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: BAW> Or require that exception classes be derived from BAW> exceptions.Exception :) Guido> That's hard to require. But it could easily be a Guido> requirement checked by one of the hypothetical typecheckers Guido> that are being discussed in the types-sig. Hmm, the raise could probably enforce this, but it might not be that useful. -Barry From guido at CNRI.Reston.VA.US Wed Dec 22 00:40:22 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 21 Dec 1999 18:40:22 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Your message of "Tue, 21 Dec 1999 18:27:31 EST." <14432.3299.404561.698836@anthem.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> Message-ID: <199912212340.SAA13851@eric.cnri.reston.va.us> > >>>>> "Guido" == Guido van Rossum writes: > > BAW> Or require that exception classes be derived from > BAW> exceptions.Exception :) > > Guido> That's hard to require. 
But it could easily be a > Guido> requirement checked by one of the hypothetical typecheckers > Guido> that are being discussed in the types-sig. > > Hmm, the raise could probably enforce this, but it might not be that > useful. > > -Barry The raise could easily enforce this, but it would break lots of existing code. I wish I had done it right from the start -- then exceptions would have been classes from the start and would have required inheritance from the Exception base class. Like in Java. (And in C++?) --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw at CNRI.Reston.VA.US Wed Dec 22 00:43:59 1999 From: bwarsaw at CNRI.Reston.VA.US (Barry A. Warsaw) Date: Tue, 21 Dec 1999 18:43:59 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> <199912212340.SAA13851@eric.cnri.reston.va.us> Message-ID: <14432.4287.543786.308468@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> The raise could easily enforce this, but it would break Guido> lots of existing code. Maybe not (I'm not sure). All the standard exceptions inherit from Exception, and of course there'd be nothing to enforce for existing user-defined string based exceptions. How pervasive are user-defined class based exceptions that don't inherit from Exception? (I don't know, and I haven't grepped, but I think we've been making that recommendation from day 1 of class-based standard exceptions, and I try to follow this recommendation in my own code). 
Guido> I wish I had done it right from the start -- then Guido> exceptions would have been classes from the start and would Guido> have required inheritance from the Exception base class. Guido> Like in Java. (And in C++?) All Hail, Python 2.0, our Savior and Redeemer! :) -Barry From guido at CNRI.Reston.VA.US Wed Dec 22 00:49:09 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 21 Dec 1999 18:49:09 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Your message of "Tue, 21 Dec 1999 18:43:59 EST." <14432.4287.543786.308468@anthem.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> <199912212340.SAA13851@eric.cnri.reston.va.us> <14432.4287.543786.308468@anthem.cnri.reston.va.us> Message-ID: <199912212349.SAA13892@eric.cnri.reston.va.us> > From: "Barry A. Warsaw" > >>>>> "Guido" == Guido van Rossum writes: > > Guido> The raise could easily enforce this, but it would break > Guido> lots of existing code. > > Maybe not (I'm not sure). All the standard exceptions inherit from > Exception, and of course there'd be nothing to enforce for existing > user-defined string based exceptions. How pervasive are user-defined > class based exceptions that don't inherit from Exception? (I don't > know, and I haven't grepped, but I think we've been making that > recommendation from day 1 of class-based standard exceptions, and I > try to follow this recommendation in my own code). Yes, but class-based user exceptions existed many Python versions before class-based standard exceptions! Two examples in the standard library: ConfigParser.py and xdrlib.py. > All Hail, Python 2.0, our Savior and Redeemer! 
:)

Or, the perfect excuse for procrastination :)

(But yes, 2.0 will enforce this.)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From gstein at lyra.org Wed Dec 22 00:53:50 1999
From: gstein at lyra.org (Greg Stein)
Date: Tue, 21 Dec 1999 15:53:50 -0800 (PST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: <199912212305.SAA13722@eric.cnri.reston.va.us>
Message-ID:

On Tue, 21 Dec 1999, Guido van Rossum wrote:
>...
> [Fred Drake]
> > This breaks as soon as the standard exceptions are strings; does
> > this mean -X will be removed in the next release? (Please????)
>
> Not a bad idea.
>
> Anybody got a reason why -X should stay?

Kill it.

> (The next step would be to outlaw raise with a string argument; I
> think I can't make that for 1.6. But it would be a good idea to scan
> the standard library for string exceptions and convert all of them.)

Keep string exceptions. I think there is probably a lot of code that still uses them. I know I do :-)

We can issue warnings about string exceptions via the type-checking tool.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From bwarsaw at CNRI.Reston.VA.US Wed Dec 22 00:54:04 1999
From: bwarsaw at CNRI.Reston.VA.US (Barry A.
Warsaw) Date: Tue, 21 Dec 1999 18:54:04 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> <199912212340.SAA13851@eric.cnri.reston.va.us> <14432.4287.543786.308468@anthem.cnri.reston.va.us> <199912212349.SAA13892@eric.cnri.reston.va.us> Message-ID: <14432.4892.908107.421149@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> Yes, but class-based user exceptions existed many Python Guido> versions before class-based standard exceptions! True, but I suspect that legacy class-based user exceptions are rare. I might be wrong, but you're absolutely right that these would all be broken. Guido> Two examples in the standard library: ConfigParser.py and Guido> xdrlib.py. Fortunately these are fixed with two 11 character patches :) I'm not necessarily arguing for or against tightening this. -Barry From gmcm at hypernet.com Wed Dec 22 00:55:07 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Tue, 21 Dec 1999 18:55:07 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <199912212340.SAA13851@eric.cnri.reston.va.us> References: Your message of "Tue, 21 Dec 1999 18:27:31 EST." <14432.3299.404561.698836@anthem.cnri.reston.va.us> Message-ID: <1266302877-22249299@hypernet.com> [Guido] > I wish I had done it right from the start -- then exceptions > would have been classes from the start and would have required > inheritance from the Exception base class. Like in Java. (And > in C++?) In C++ you can throw anything at all. Strings, ints, that Warsaw blockhead... 
off-topic-ly y'rs - Gordon From tismer at appliedbiometrics.com Wed Dec 22 01:57:27 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Wed, 22 Dec 1999 01:57:27 +0100 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> Message-ID: <386021F7.4F94C458@appliedbiometrics.com> Guido van Rossum wrote: > > [Barry] > > Guido> Anybody got a reason why -X should stay? > > > > Kill it. > > You already said that. > > Anybody else? I'd say kill -X, but keep allowing string exceptions if it doesn't cost too much. I think of C++, like Gordon said. Also I'd take the chance and move the exceptions Python module back into the core, as a frozen mdule or whatever. Reason: At the moment, the CVS version of the Python library is incompatible to 1.5.2, which makes testing against the standard dist quite inconvenient. A compiled CVS Python does not run under PythonWin when I put it into my standard installation. Or is there an easy way to switch all settings to a completely different path? Anyway, I'm most probably off until Y2K. See ya all then, provided we survive - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! 
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From guido at CNRI.Reston.VA.US Wed Dec 22 02:01:16 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 21 Dec 1999 20:01:16 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Your message of "Wed, 22 Dec 1999 01:57:27 +0100." <386021F7.4F94C458@appliedbiometrics.com> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <386021F7.4F94C458@appliedbiometrics.com> Message-ID: <199912220101.UAA14109@eric.cnri.reston.va.us> > I'd say kill -X, but keep allowing string exceptions if > it doesn't cost too much. I think of C++, like Gordon said. Agreed. > Also I'd take the chance and move the exceptions Python > module back into the core, as a frozen mdule or whatever. > > Reason: At the moment, the CVS version of the Python library > is incompatible to 1.5.2, which makes testing against the > standard dist quite inconvenient. A compiled CVS Python > does not run under PythonWin when I put it into my standard > installation. Or is there an easy way to switch all settings > to a completely different path? Point the PYTHONHOME variable to the top of your install directory. (On Windows you may have to kill the registry settings -- this is a bug.) > Anyway, I'm most probably off until Y2K. Ditto. 
> See ya all then, provided we survive - chris

Best wishes to all,

--Guido van Rossum (home page: http://www.python.org/~guido/)

From jim at digicool.com Wed Dec 22 14:54:41 1999
From: jim at digicool.com (Jim Fulton)
Date: Wed, 22 Dec 1999 08:54:41 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us>
Message-ID: <3860D821.576B3146@digicool.com>

Guido van Rossum wrote:
>
> (The next step would be to outlaw raise with a string argument; I
> think I can't make that for 1.6. But it would be a good idea to scan
> the standard library for string exceptions and convert all of them.)

This would be waaaaay too big a change for Python 1.x. There are a lot of Python modules outside the standard distribution that use string exceptions. This would be a huge backward incompatibility.

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!
Technical Director   (888) 344-4332            http://www.python.org
Digital Creations    http://www.digicool.com   http://www.zope.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list without my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.

From fdrake at acm.org Wed Dec 22 15:23:29 1999
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 22 Dec 1999 09:23:29 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <199912212305.SAA13722@eric.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> Message-ID: <14432.57057.535205.558@weyr.cnri.reston.va.us> Guido van Rossum writes: > (The next step would be to outlaw raise with a string argument; I > think I can't make that for 1.6. But it would be a good idea to scan > the standard library for string exceptions and convert all of them.) I don't know if requiring class-based exceptions will make the runtime any simpler, but that seems the only reason to do it. The only reason to remove -X, and possibly the string exception fallback code, is to ensure that we *can* subclass Exception and friends without having to catch TypeError and do something different. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From fdrake at acm.org Wed Dec 22 15:25:33 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed, 22 Dec 1999 09:25:33 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <14432.2946.857539.898577@anthem.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> Message-ID: <14432.57181.944364.427093@weyr.cnri.reston.va.us> Barry A. Warsaw writes: > Or require that exception classes be derived from exceptions.Exception > :) Ok, it's early, and maybe I haven't had enough coffee(!). But is this serious? Does JPython gain some benefit from this, is it your preference, or are you just yanking on my leg? ("Pulling my arm" as my 5-year-old says!) -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives From guido at CNRI.Reston.VA.US Wed Dec 22 15:40:39 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 22 Dec 1999 09:40:39 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Your message of "Wed, 22 Dec 1999 09:23:29 EST." <14432.57057.535205.558@weyr.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.57057.535205.558@weyr.cnri.reston.va.us> Message-ID: <199912221440.JAA16198@eric.cnri.reston.va.us> > From: "Fred L. Drake, Jr." > > Guido van Rossum writes: > > (The next step would be to outlaw raise with a string argument; I > > think I can't make that for 1.6. But it would be a good idea to scan > > the standard library for string exceptions and convert all of them.) > > I don't know if requiring class-based exceptions will make the > runtime any simpler, but that seems the only reason to do it. Do what? *Require* class exceptions? You're probably right, and I think the gain is minimal. There's another reason to scan the std library though -- not to set a bad example. I want to eventually (in 2.0) move to a class-derived-from-Exception-only scheme. > The only reason to remove -X, and possibly the string exception > fallback code, is to ensure that we *can* subclass Exception and > friends without having to catch TypeError and do something different. And that's a very good reason indeed. Let me repeat my plans for 1.6. - Remove -X; the standard exceptions are always class-based. - Change all standard library and other example code to use class-based exceptions with a standard exception as base class, to set an example. - Still allow string exceptions in user code. - Still allow class exceptions that don't use a standard exception base class in user code. 
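The second item in the plan above (give a module a class-based exception with a standard exception as its base, while keeping the old name alive) might look roughly like the sketch below. It is written for present-day Python rather than the 1.6 under discussion, and it only mirrors the pattern of the getopt.py patch quoted earlier in this thread. The names ConfigError, error, parse_option and describe are hypothetical and do not come from any real module.

```python
# Hypothetical module-level exception, modelled on the getopt.py patch
# quoted earlier in this thread; none of these names come from a real module.
class ConfigError(Exception):
    """Class-based exception with a standard exception as its base."""

    def __init__(self, msg, opt=''):
        Exception.__init__(self, msg, opt)
        self.msg = msg
        self.opt = opt

    def __str__(self):
        return self.msg


# Old code that wrote "except error:" keeps working through this alias.
error = ConfigError


def parse_option(opt):
    """Toy parser: accept only '-v', reject everything else."""
    if opt != '-v':
        raise ConfigError('option %s not recognized' % opt, opt)
    return opt


def describe(opt):
    """Return 'ok', or the error text caught through the legacy alias."""
    try:
        parse_option(opt)
        return 'ok'
    except error as exc:
        return str(exc)
```

Because the alias and the class are the same object, callers can migrate from the old name to the new one at their own pace.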
--Guido van Rossum (home page: http://www.python.org/~guido/) From marangoz at python.inrialpes.fr Wed Dec 22 19:09:47 1999 From: marangoz at python.inrialpes.fr (Vladimir Marangozov) Date: Wed, 22 Dec 1999 19:09:47 +0100 (CET) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <199912221440.JAA16198@eric.cnri.reston.va.us> from "Guido van Rossum" at Dec 22, 1999 09:40:39 AM Message-ID: <199912221809.TAA25322@python.inrialpes.fr> Guido van Rossum wrote: > > [Fred Drake] > > I don't know if requiring class-based exceptions will make the > > runtime any simpler, but that seems the only reason to do it. > > Do what? *Require* class exceptions? You're probably right, and I > think the gain is minimal. Yes. Besides, I still think that string-based exceptions are just convenient for quick & dirty, throw-away test scripts. > > Let me repeat my plans for 1.6. > > - Remove -X; the standard exceptions are always class-based. > > - Change all standard library and other example code to use > class-based exceptions with a standard exception as base class, to set > an example. > > - Still allow string exceptions in user code. > > - Still allow class exceptions that don't use a standard exception > base class in user code. Sounds okay. --- PS: I'm particularly happy today :-) because I've finally published the new version of our Web site http://www.inrialpes.fr. Two things I'd like to mention: (1) it shouldn't have been possible without quick Python scripts ;) (2) I'll find the time to reinvoke some of the topics discussed here instead of being mute as a fish. That said, Merry Christmas and a Happy New Year to all of you! 
-- Vladimir MARANGOZOV | Vladimir.Marangozov at inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From guido at CNRI.Reston.VA.US Wed Dec 22 19:23:45 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 22 Dec 1999 13:23:45 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Your message of "Wed, 22 Dec 1999 19:09:47 +0100." <199912221809.TAA25322@python.inrialpes.fr> References: <199912221809.TAA25322@python.inrialpes.fr> Message-ID: <199912221823.NAA16517@eric.cnri.reston.va.us> Vladimir.Marangozov at inrialpes.fr: > Yes. Besides, I still think that string-based exceptions are just > convenient for quick & dirty, throw-away test scripts. They have a hard-to-understand quirk though: the id() of the string is used to check rather than its value, so that except "foo" doesn't necessarily catch raise "foo"; but due to various optimization, this usually works, and people get bent out of shape when it doesn't. Since you have to give your exception a name, how hard is it to say class MyError(Exception): pass rathern than MyError = "MyError" ? --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein at lyra.org Wed Dec 22 19:33:19 1999 From: gstein at lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 10:33:19 -0800 (PST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <199912221823.NAA16517@eric.cnri.reston.va.us> Message-ID: On Wed, 22 Dec 1999, Guido van Rossum wrote: > Vladimir.Marangozov at inrialpes.fr: > > Yes. Besides, I still think that string-based exceptions are just > > convenient for quick & dirty, throw-away test scripts. 
> > They have a hard-to-understand quirk though: the id() of the string is > used to check rather than its value, so that except "foo" doesn't > necessarily catch raise "foo"; but due to various optimization, this > usually works, and people get bent out of shape when it doesn't. > Since you have to give your exception a name, how hard is it to say > > class MyError(Exception): pass > > rathern than > > MyError = "MyError" > > ? It is very hard. My fingers do the typing for me, and they fill in strings. I'm trying to teach them otherwise, but they insist. You're also assuming that MyError gets defined. Sometimes, my little fingers like typing: try: foo except: raise "foo broke for some reason" Quick and dirty, indeed! :-) Happy Holidays, -g -- Greg Stein, http://www.lyra.org/ From fdrake at acm.org Wed Dec 22 20:59:55 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed, 22 Dec 1999 14:59:55 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <199912212340.SAA13851@eric.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> <199912212340.SAA13851@eric.cnri.reston.va.us> Message-ID: <14433.11707.607533.698901@weyr.cnri.reston.va.us> Guido van Rossum writes: > I wish I had done it right from the start -- then exceptions would > have been classes from the start and would have required inheritance > from the Exception base class. Like in Java. (And in C++?) I've seen this said or hinted at in a couple of places (the specific requirement that exception derive from Exception), but I've seen nothing that indicates any reason or derived value for this. Could someone please clarify? -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives

From guido at CNRI.Reston.VA.US Wed Dec 22 21:05:52 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 22 Dec 1999 15:05:52 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8
In-Reply-To: Your message of "Wed, 22 Dec 1999 14:59:55 EST." <14433.11707.607533.698901@weyr.cnri.reston.va.us>
References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> <14432.2946.857539.898577@anthem.cnri.reston.va.us> <199912212323.SAA13803@eric.cnri.reston.va.us> <14432.3299.404561.698836@anthem.cnri.reston.va.us> <199912212340.SAA13851@eric.cnri.reston.va.us> <14433.11707.607533.698901@weyr.cnri.reston.va.us>
Message-ID: <199912222005.PAA17291@eric.cnri.reston.va.us>

> From: "Fred L. Drake, Jr."

> Guido van Rossum writes:
> > I wish I had done it right from the start -- then exceptions would
> > have been classes from the start and would have required inheritance
> > from the Exception base class. Like in Java. (And in C++?)
>
> I've seen this said or hinted at in a couple of places (the specific
> requirement that exception derive from Exception), but I've seen
> nothing that indicates any reason or derived value for this. Could
> someone please clarify?

It's simply an extra bit of checking that your program is reasonable -- if you accidentally raise a non-exception class, there's probably something wrong with your program, and it gives the reader a hint about the intended use of the class.

Other languages (e.g. Modula-3) have a specific exception type that can be used only for that one purpose. However it's useful to allow methods and subclassing of exceptions, so they might as well be classes.

So, all exceptions are classes. But not all classes are exceptions.
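The distinction can be sketched in present-day Python, where raise does refuse anything that is not an exception class (the base it checks for is BaseException, of which Exception is a subclass). This is not the 1.5.2 behaviour discussed in this thread, and MyError, NotAnError and try_raise are hypothetical names used only for illustration.

```python
class MyError(Exception):
    """An exception class: it derives from the standard Exception base."""


class NotAnError:
    """An ordinary class that was never meant to be raised."""

    def __init__(self, msg):
        self.msg = msg


def try_raise(cls):
    """Raise an instance of cls and report the type of what was caught."""
    try:
        raise cls("boom")
    except Exception as exc:
        # For a non-exception class the interpreter substitutes a TypeError.
        return type(exc).__name__
```

Calling try_raise(MyError) reports the class itself, while try_raise(NotAnError) reports a TypeError, which is the kind of "your program is probably wrong" signal the message describes.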
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gstein at lyra.org Wed Dec 22 21:11:43 1999
From: gstein at lyra.org (Greg Stein)
Date: Wed, 22 Dec 1999 12:11:43 -0800 (PST)
Subject: [Python-Dev] Please test new dynamic load behavior
Message-ID:

Hi all,

I reorganized Python's dynamic load/import code over the past few days. Guido provided some feedback, I did some more mods, and now it is checked into CVS. The new loading behavior has been tested on Linux, IRIX, and Solaris (and probably Windows by now).

For people with CVS access, I'd like to ask that you grab an updated copy and shake out the new code. There have been updates to the "configure" process, so you'll need to run configure again. Make sure that you alter your Modules/Setup to build some shared modules, and then try it out.

Here are some of the platforms that I believe need specific testing:

- NetBSD, FreeBSD, OpenBSD, ...
- AIX
- HP/UX
- BeOS
- NeXT
- Mac
- OS/2
- Win16

I believe it should work for most people, but we may be looking for the wrong "init" symbol on some platforms. We might even be selecting the wrong import mechanism (or missing it altogether!) on some platforms.

If you get a chance to test this, then please drop me a note with your platform and whether it succeeded or failed (and how it failed).

Thanx!
-g

p.s. you can tell if dynamic loading is missing by watching for DYNLOADFILE in the configure process and seeing if it used dynload_stub. alternatively, you can import the "imp" module and see if "load_dynamic" is missing.

--
Greg Stein, http://www.lyra.org/

From gvwilson at nevex.com Thu Dec 23 04:43:40 1999
From: gvwilson at nevex.com (gvwilson at nevex.com)
Date: Wed, 22 Dec 1999 22:43:40 -0500 (EST)
Subject: [Python-Dev] re: Open Source design competition / Python / software tools
Message-ID:

Hi, folks.
I hope you don't mind another mail out of the blue, but I got notice on Saturday that the Department of Energy is giving me $860K over two years to support development of easier-to-use software engineering tools. All of the work will be Open Source, and will be done in Python, with a strong emphasis on design, testing, and documentation. The project's long-term objective is to encourage scientists and engineers to treat programs in the same way as they do other experiments, i.e. to calibrate, test, peer review, and so on.

To kick-start things, we're going to be holding a two-round design competition. Anyone (individual or team, professional or student) can submit a short entry for the first round; the judges will pick four candidates to go forward in each of four categories, and those individuals or teams will be asked to submit full entries. The four categories are:

* an issue tracking system to replace Gnats and Bugzilla;
* a build system to replace make;
* a platform inspection and configuration system to replace autoconf; and
* a testing framework to replace XUnit, Expect, and DejaGnu.

Would you be interested in participating in any way---judging, entering a design, critiquing things from the point of view of end users, or anything else? I realize that you're probably up past your eyeballs with work, and that the money on offer is nothing special, but I think this could be a lot of fun, and could help to shift the emphasis of the Open Source community from hacking to design (both by drawing attention to, and rewarding, design, and by creating a corpus of examples and commentary for programmers to refer to). It could also make life a lot easier for computational scientists and engineers...

Please let me know if you'd like to be involved, or if you'd like more information than is contained in the FAQ (attached).
Timescales are a bit tight---I'd like to be able to make an announcement on January 14---but I'll be reading email at this address several times a day during the holiday.

I look forward to hearing from you,
Greg Wilson

p.s. please note that the attached FAQ is a first draft; I'd be grateful if you could show it to anyone you think might be interested, but I'd also be grateful if you wouldn't broadcast it until it's gone through one more editing pass.

-------------- next part --------------

Software Carpentry FAQ

General information

  1. What is the Software Carpentry project?
    The aim of the Software Carpentry project is to make it easier for programmers in general, and scientific programmers in particular, to adopt better software development practices. The project will achieve this by creating tools that are easier to learn and use, and by documenting those tools and the practices they embody.
  2. Where does the name come from?
    The name is a play on "software engineering", and is meant to indicate that this project is initially concerned with medium-sized teams (up to a dozen or two programmers) and medium-term timescales (a year or two).
  3. How did the project get started?
    The project has its origins in a series of articles that Greg Wilson organized for the Fall 1996 and Winter 1996 issues of IEEE Computational Science and Engineering. These articles outlined what their authors thought computer scientists should teach to physical scientists and engineers. Most authors recommended numerical methods or the standard Unix toolset, but Steve McConnell argued that better programming practices would have the greatest impact on productivity.
    As a result of that observation, Greg Wilson, Brent Gorda, and Steve McConnell put together a 3-day course on software engineering for scientists and engineers, which they taught several times at the Los Alamos National Laboratory. Feedback on the course was very positive, but many participants felt that the tools being taught---Perl, Make, CVS, and so on---were unnecessarily difficult to install, learn, and use. They were also frustrated by the scarcity of examples of design documents, testing plans, and all of the other things the course was trying to teach them.
  4. Why Open Source?
    There are three reasons why the Software Carpentry project is following the Open Source model:
    1. Leveraging existing knowledge.
      A closed project can only take advantage of a few minds. As Linux and other projects have shown, a well-run Open Source project can harness the experience and insight of thousands of people.
    2. Lowering barriers to adoption.
      Freely-available tools are more likely to be picked up than their commercial equivalents. This is particularly true when the tool in question does something novel (at least from the point of view of the person adopting it), and in academia (where budgets are limited).
    3. Encouraging peer review.
      Dan Gezelter's talk at the first Open Source/Open Science conference discussed how the scientific tradition of peer review fits with the philosophy of the Open Source movement. By designing and building these tools in the open, the Software Carpentry project will both encourage peer review of the tools themselves, and demonstrate how this ought to be done for scientific and commercial software.
  5. Where does the funding come from?
    The funding comes from the U.S. Department of Energy, through the Advanced Computing Laboratory at Los Alamos National Laboratory. The project is being administered by Code Sourcery. US$480,000 has been provided for 2000, and US$380,000 for 2001.
  6. Why would the Department of Energy fund something like this?
    The funding has been provided partly because the DoE would like scientists and engineers to be more productive, and partly because it would like to find out whether the Open Source model and community can meet the special needs of high-performance computational science. The last few years have seen most manufacturers of special-purpose supercomputers disappear or be bought out, and the rise of clusters based on commercial off-the-shelf (COTS) hardware, Linux, MPI, the GNU compiler toolset, and so on. There is a growing feeling that these machines could bring scalable supercomputing into the mainstream, but this will only happen if good tools and practices are accessible enough.
  7. I'm not a scientist or engineer---what's in it for me?
    The things that make many existing Open Source software development tools difficult to learn and use---obscure syntax, arbitrary or hard-to-follow behavior, and poor documentation---affect professional programmers and computer science students just as much as they do computational scientists and engineers. If the Open Source movement can build tools that are simple enough to be learned by people who have problems of their own to solve, and yet powerful enough to support distributed development of hundreds of thousands of lines of complex numerical and visualization code, then those tools will probably also help people who want to build Internet chat rooms and order-tracking systems.
    This project should also be interesting to the general programming community because it is going to place more emphasis on design and early feedback than most Open Source projects have to date. Instead of growing someone's pet project, Software Carpentry is going to organize---and pay for---a design competition. If this works, it could be an interesting model for other Open Source projects to adopt.
  8. I think [tool] is good enough already---why are you re-inventing the wheel?
    The short answer to this is Alan Cooper's:
    The phrase "computer literate user" really means the person has been hurt so many times that the scar tissue is thick enough so he no longer feels the pain.
    -- Alan Cooper, The Inmates are Running the Asylum
    The longer answer is that the "accidental complexity" of the standard Unix command-line toolset is a major barrier to its adoption by people who are not full-time programmers, or for whom programming is just something that has to be done in order to do something else. Many professional programmers---particularly those who enjoy programming enough to be involved in the Open Source movement---have been using these tools for so long that they simply don't remember how hard it is to configure Gnats, or pass variable bindings between recursive calls to Make.
    And let's face it: if Make or Autoconf were built from scratch today, they would be written as extensible, embeddable modules in a high-level scripting language. This would not only make them easier to use, it would also make them easier to learn, since they would employ one syntax for all purposes. Microsoft Visual Basic has shown just how useful it can be to have a single general-purpose "glue" language capable of binding disparate tools together; the aim of the first half of this project is to bring those benefits to the Open Source community.

Development

  1. What projects are currently under way?
    Software Carpentry will start by producing:
    1. a platform inspection tool similar to Autoconf;
    2. a build management tool similar to Make;
    3. an issue tracking system similar to Gnats or Bugzilla; and
    4. a unit and regression testing harness with the functionality of XUnit, Expect, and DejaGnu.
  2. Why were those tools chosen?
    These four tools were chosen as initial targets for several reasons. First, the working practices they support are essential to medium-scale software engineering. Second, the tools they are intended to replace are generally recognized as being outdated or flawed. This creates demand, and increases the odds that rational reimplementations will be adopted. Third, enough people have enough experience with the tools that are to be replaced to participate in the design competition described later.
  3. Why isn't [tool] on this list?
    There are several other tools that could have been on this list, and will be added if the first round of work goes well. A cross-platform version control system that corrects the many deficiencies in CVS, for example, is an obvious candidate, but is probably too large to be tackled initially, and any work done by Software Carpentry could well be superseded by BitKeeper. Similarly, the world needs a good Open Source project management tool with the functionality of Microsoft Project, but probably needs the four tools listed above more urgently.
  4. What languages and tools will be used?
    All development work will be done in Python.
  5. Why Python?
    This is actually three questions:
    1. Why mandate a language?
      Building everything in a single language will encourage projects to share code, which will both keep the total volume of code manageable and raise the quality of the implementations (since the shared code will be exercised, and tested, in many different ways). Using a single language will also improve the comprehensibility, and hence the maintainability and extensibility, of the tools. The varying syntax of Make, Autoconf, and other tools is a large practical barrier to their adoption by people who have better (or at least more pressing) things to do than learn yet another syntax. Microsoft's Visual Basic has shown how powerful it is to use a single, flexible language everywhere.
    2. Why use a scripting language?
      A lot of anecdotal evidence shows that "relaxed" high-level languages (like Python, Perl, and Visual Basic) are more productive vehicles for process management, text processing, and similar tasks than their "strict" equivalents (like C++ and Java).
    3. Why use Python?
      The four candidates considered were Visual Basic, Perl, Tcl, and Python.
      1. Visual Basic
        Visual Basic is proprietary, and there is no indication that a credible Open Source implementation will appear any time soon.
      2. Perl
        Perl was a strong contender, primarily because of the many libraries that have been developed for it, and because of the number of books that document it. However, our experience teaching at Los Alamos was that Perl's syntax is hard to learn, its behavior often arbitrary, and its size intimidating. While full-time professional programmers with several other languages under their belts might (and often do) say that it all makes sense once you know it, we want to make the learning curve as gentle as possible.
      3. Tcl
        Tcl is easier to learn and read than Perl, but is not as well documented, and doesn't come with as many libraries. Had Python not existed, Tcl would probably have been chosen for this project.
      4. Python
        Python provides the same functionality as Perl or Tcl, but has proved to be easier to learn, read, and remember. (For example, words like "except" and "unless" appear much less often in Python reference material than they do in Perl reference material.) Python is not yet as extensively documented as Perl, but the number of books is growing, as is the number of modules and libraries. Finally, the Python community is still small enough for a project like this one to attract the attention of a significant proportion of it.
  6. How will development be organized and coordinated?
    Everything the project produces---designs, critiques of those designs, test suites, and examples, as well as actual source code---will be available through the project's Web site at software-carpentry.codesourcery.com. Each project will have a coordinator, whose job it will be to moderate discussion, synchronize releases, track work items, and report on progress. The coordinator will also be responsible for collating and editing feedback from judges during the design competition.

Design competition

  1. Why a design competition?
    Most Open Source packages have their roots in someone's pet hobby project, which others have picked up, extended, and modified. This kind of organic growth has a lot of good features, but a well-documented design is not one of them. As a result, programmers often have to rely on folklore and reverse engineering if they want to add to, or fix, these tools. In addition, there is a dearth of examples of good design for new programmers to learn from.
    The Software Carpentry project hopes to address both problems by running a two-stage design competition. The best entries in both rounds will be published, along with commentary from the competition's judges. This material will serve both to inform and guide further development, and to show novices what experienced programmers think about before they start coding.
  2. Who can enter?
    Everyone: individuals and teams, students and professionals, from anywhere in the world.
  3. What are the rules?
    The full rules are available at:
    software-carpentry.codesourcery.com/design-competition/rules.html
    Basically, initial submissions must be written in English, and can be up to 10 pages long. Examples count against this limit, but diagrams and a Unix-style man page do not. Any person or team may submit only one entry in any given category, but can submit in as many of the four categories as desired.
    The best four entries in each category will be awarded US$2500, and asked to submit full designs. Participants will be strongly encouraged to pool their efforts for the second round. The best second-round submission will be awarded an additional US$7500, while the others will receive another US$2500 each. The real reward will be seeing the design implemented, and being in a good position to bid on the implementation work.
  4. What should first-round submissions contain?
    An example of what a submission should contain, and how it should be formatted, is available at:
    software-carpentry.codesourcery.com/design-competition/example.html
    First-round entries should focus primarily on what the tool will do, and how it will be used: command-line options, input and output file formats, sketches of Web and GUI interfaces (where appropriate), and so on. Second-round submissions will then be expected to describe how it's all going to be implemented.
  5. Who will the judges be?
    Need to firm up the list of judges ASAP.
  6. When are the deadlines?
    The deadline for first-round submissions is March 31, 2000. The four best proposals in each category will be announced on April 30, 2000. Full submissions are due on June 1, 2000, and winners will be announced on June 30, 2000.
  7. Won't prizes discourage co-operation?
    We don't know. On the one hand, people might want to hoard their best ideas; on the other hand, the best designs in both rounds are going to be published, along with the judges' commentary, and we will be encouraging participants to pool their efforts. Most of the money that will be paid out will go to fund implementation, testing, and documentation; we hope that people will collaborate in the early stages, and treat the prizes as recognition for their effort, rather than treating US$10,000 as their retirement fund.

Documentation

  1. What documentation will be produced?
    The Software Carpentry project will produce several different kinds of documentation:
    1. Design documentation.
      As stated above, the best designs in each category will be published, along with the judges' commentary. This material ought to play the role that music criticism has played in the development of music, by giving newcomers (and experienced programmers) better insight into how good designers think.
    2. User guides.
      The project will pay for the development of man pages, user guides, online help, and all the other documentation needed to turn a program into a product.
    3. Test suites.
      The project will also pay for the development of industrial-strength test suites for all four tools. These suites will be published, both to serve as a starting point for other projects and to demonstrate good practice.
    4. Case studies.
      It is often easier to show someone how to do something than to explain it to them. The Software Carpentry project will pay for case studies that describe how these tools, and (more importantly) the working practices they support, have been deployed in practice. Checklists, templates for forms, and other supporting material can also be submitted.
  2. What format(s) will be used?
    The primary format for all documentation will be HTML. The project will migrate to XML when and as feasible.
  3. What restrictions are there on using the documentation?
    Only those that also apply to the software, under the terms of its Open Source license. You can copy and distribute the documentation in any form, but only if its author(s) and origin are clearly shown, and if you include a description of how readers can access the originals. In particular, the documentation can be reproduced in books, but only if the authors, origin, and location of the originals are printed clearly on each page.
From jack at oratrix.nl Thu Dec 23 11:24:26 1999 From: jack at oratrix.nl (Jack Jansen) Date: Thu, 23 Dec 1999 11:24:26 +0100 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: Message by Guido van Rossum , Wed, 22 Dec 1999 13:23:45 -0500 , <199912221823.NAA16517@eric.cnri.reston.va.us> Message-ID: <19991223102426.CCB75370CF2@snelboot.oratrix.nl> > Vladimir.Marangozov at inrialpes.fr: > > > Yes. Besides, I still think that string-based exceptions are just > > convenient for quick & dirty, throw-away test scripts. > > They have a hard-to-understand quirk though: the id() of the string is > used to check rather than its value, so that except "foo" doesn't > necessarily catch raise "foo"; but due to various optimization, this > usually works, and people get bent out of shape when it doesn't. I sort-of use this feature when I'm debugging: if I want to know what happens in an exception that is usually caught somewhere higher up in the call stack I simply put quotes around the exception name and the exception will happen uncaught. The same trick works for except: clauses. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From harri.pasanen at trema.com Thu Dec 23 12:44:04 1999 From: harri.pasanen at trema.com (Harri Pasanen) Date: Thu, 23 Dec 1999 13:44:04 +0200 Subject: [Python-Dev] Re: [PSA MEMBERS] Please test new dynamic load behavior References: Message-ID: <38620B04.7CC64485@trema.com> Greg Stein wrote: > > Hi all, > > I reorganized Python's dynamic load/import code over the past few days. > Gudio provided some feedback, I did some more mods, and now it is checked > into CVS. The new loading behavior has been tested on Linux, IRIX, and > Solaris (and probably Windows by now). > ... What was the motivation behind this modification? 
Just curious, -Harri From marangoz at python.inrialpes.fr Thu Dec 23 13:12:40 1999 From: marangoz at python.inrialpes.fr (Vladimir Marangozov) Date: Thu, 23 Dec 1999 13:12:40 +0100 (CET) Subject: [Python-Dev] Please test new dynamic load behavior In-Reply-To: from "Greg Stein" at Dec 22, 1999 12:11:43 PM Message-ID: <199912231212.NAA26572@python.inrialpes.fr> Greg Stein wrote: > > Hi all, > > I reorganized Python's dynamic load/import code over the past few days. > Gudio provided some feedback, I did some more mods, and now it is checked > into CVS. The new loading behavior has been tested on Linux, IRIX, and > Solaris (and probably Windows by now). > Great work Greg! > Here are some of the platforms that I believe need specific testing: > > - NetBSD, FreeBSD, OpenBSD, ... > - AIX > - HP/UX > - BeOS > - NeXT > - Mac > - OS/2 > - Win16 AFAICT, the AIX version works perfectly okay. -- Vladimir MARANGOZOV | Vladimir.Marangozov at inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From jim at digicool.com Thu Dec 23 15:41:23 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 23 Dec 1999 09:41:23 -0500 Subject: [Python-Dev] str(1L) -> '1' ? Message-ID: <38623493.E6BA6D6F@digicool.com> In November there was an interesting discussion on comp.lang.python about the meaning of __str__ and __repr__. One tidbit that came out of this discussion was that __str__ for longs should drop the trailing 'L'. Was there a decision on this? I'd really like this to happen. We do a lot of work with RDBMS systems and long integers seem to come up a lot with these systems (as do other fixed-decimal numbers, but that's another topic ;). For example, our latest Sybase and Oracle support in Zope returns long integers for RDBMS types like NUMBER(10,0). The trailing 'L' in the string representation is causing us some headaches. This seems also to be an issue when using the current standard ODBC interface with Oracle, as indicated in a DB-SIG post today.
Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From guido at CNRI.Reston.VA.US Thu Dec 23 15:46:58 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 23 Dec 1999 09:46:58 -0500 Subject: [Python-Dev] str(1L) -> '1' ? In-Reply-To: Your message of "Thu, 23 Dec 1999 09:41:23 EST." <38623493.E6BA6D6F@digicool.com> References: <38623493.E6BA6D6F@digicool.com> Message-ID: <199912231446.JAA22086@eric.cnri.reston.va.us> [Jim F] > In November there was an interesting discussion on comp.lang.python > about the meaning of __str__ and __repr__. One tidbit that came out > of this discussion was that __str__ for longs should drop the trailing > 'L'. Was there a decision on this? I'd really like this to happen. Yes, I'd like it to happen. I'd also like repr() of a float to return the full precision (using the "%.17g" sprintf format). I haven't done it for lack of time -- feel free to send a patch (don't forget the disclaimer from http://www.python.org/1.5/bugrelease.html). We haven't decided yet what to do with the greater topic of that discussion (or was it a different one?) -- whether the values printed by typing a bare expression in interactive mode should use str(), repr(), or str-special-casing-the-snot-out-of-strings(). 
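A rough illustration of the "%.17g" suggestion above, written in present-day Python rather than as the patch being asked for: seventeen significant digits are enough for a double to survive a trip through text and back, while a shorter format can lose information. The helper name full_precision and the sample values are made up for this sketch.

```python
def full_precision(x):
    """Format a float with 17 significant digits (hypothetical helper)."""
    return "%.17g" % x


samples = [0.1, 1.0 / 3.0, 2.0 ** 0.5, 1e-7, 123456789.123456789]

for x in samples:
    text = full_precision(x)
    # Parsing the text again gives back exactly the same double.
    assert float(text) == x

# A shorter format is easier on the eye but does not always round-trip.
short = "%.12g" % (1.0 / 3.0)
round_trips = float(short) == 1.0 / 3.0

print(full_precision(0.1))
print(short, round_trips)
```

The price of the full-precision form is visible in the first print: 0.1 comes out with its binary rounding error showing, which is part of why the str() versus repr() question for the interactive prompt is left open in the message above.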
--Guido van Rossum (home page: http://www.python.org/~guido/) From jim at digicool.com Thu Dec 23 15:51:14 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 23 Dec 1999 09:51:14 -0500 Subject: [Python-Dev] Fixed-decimal types Message-ID: <386236E2.F97109D3@digicool.com> While on the subject of RDBMS systems, a common need is to be able to work with fixed-decimal data. I think a standard Python fixed-decimal type would help to make Python database interfaces a lot more robust. I even wonder if the Python long type might be hijacked for this purpose by adding a "scale" that indicates the number of digits to the right of the decimal point. For example, an expression like: 1000000000.2500L would create a fixed decimal number with a scale of 4. People have built Python classes for fixed-decimal types, but when working with RDBMS data, one often deals with lots of data and efficiency matters. I also suspect that adding scale to longs wouldn't be that hard and would be a fairly natural extension. In any case, a "standard" (being in the standard library would be sufficient) fixed-decimal type would probably lead to better database interfaces that (at least more) properly handled fixed-decimal data. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From guido at CNRI.Reston.VA.US Thu Dec 23 15:56:33 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 23 Dec 1999 09:56:33 -0500 Subject: [Python-Dev] Fixed-decimal types In-Reply-To: Your message of "Thu, 23 Dec 1999 09:51:14 EST."
<386236E2.F97109D3@digicool.com> References: <386236E2.F97109D3@digicool.com> Message-ID: <199912231456.JAA22134@eric.cnri.reston.va.us> What would be scale of the product of two fixed-decimal numbers? E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to 4.00L? There are arguments for either. Same question for division (harder, I think). I like the idea of using the dd.ddL notation for this. I have no time to implement it but would not be unwilling to accept patches. They would have to be accompanied with a wet signature, see http://www.python.org/1.5/wetsign.html. --Guido van Rossum (home page: http://www.python.org/~guido/) From jim at digicool.com Thu Dec 23 16:00:25 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 23 Dec 1999 10:00:25 -0500 Subject: [Python-Dev] re: Open Source design competition / Python / software tools References: Message-ID: <38623909.CDF41014@digicool.com> gvwilson at nevex.com wrote: > > Hi, folks. I hope you don't mind another mail out of the blue, but I got > notice on Saturday that the Department of Energy is giving me $860K over > two years to support development of easier-to-use software engineering > tools. All of the work will be Open Source, and will be done in Python, > with a strong emphasis on design, testing, and documentation. The > project's long-term objective is to encourage scientists and engineers to > treat programs in the same way as they do other experiments, i.e. to > calibrate, test, peer review, and so on. > > To kick-start things, we're going to be holding a two-round design > competition. Anyone (individual or team, professional or student) can > submit a short entry for the first round; the judges will pick four > candidates to go forward in each of four categories, and those > individuals or teams will be asked to submit full entries. 
The four > categories are: > > * an issue tracking system to replace Gnats and Bugzilla; > > * a build system to replace make; > > * a platform inspection and configuration system to replace autoconf; > and > > * a testing framework to replace XUnit, Expect, and DejaGnu. > > Would you be interested in participating in any way Are these categories fixed? I see a very strong need for an open-source UML modeling tool. UML is extremely powerful, but current UML tools largely suck and are very expensive. We are contemplating launching an open-source development effort to build UML modeling tools using Zope or the Zope object database as a repository. A contest like this could help to kick-start this effort, but tools to automate requirements and design seem to be missing. This is odd, considering that up-front activities like requirements and design have the largest impact on software-engineering project success. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From captainrobbo at yahoo.com Thu Dec 23 16:13:22 1999 From: captainrobbo at yahoo.com (=?iso-8859-1?q?Andy=20Robinson?=) Date: Thu, 23 Dec 1999 07:13:22 -0800 (PST) Subject: [Python-Dev] Fixed-decimal types Message-ID: <19991223151322.5698.qmail@web604.mail.yahoo.com> --- Guido van Rossum wrote: > What would be scale of the product of two > fixed-decimal numbers? > E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to > 4.00L? There are > arguments for either. Same question for division > (harder, I think). 
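To make the quoted multiplication question concrete, here is a small sketch of the two candidate answers. It uses the decimal module from later Python releases purely as a stand-in; the dd.ddL literal discussed in this thread has no defined semantics, so nothing below should be read as what that notation would do. In this module the scales of the operands add on multiplication, and getting back to two places is a separate, explicit rounding step.

```python
from decimal import Decimal, ROUND_HALF_UP

a = Decimal("2.00")
b = Decimal("2.00")

# The exponents add, so two 2-place operands give a 4-place product.
product = a * b
print(product)

# Rounding back to two places is explicit; ROUND_HALF_UP is one choice.
cents = Decimal("0.01")
rounded = product.quantize(cents, rounding=ROUND_HALF_UP)
print(rounded)

# Both results compare equal; only the carried scale differs.
assert product == rounded

# Division has no natural scale: the raw quotient runs to the context
# precision, so a target scale has to be chosen by the caller.
third = (Decimal("1.00") / Decimal("3.00")).quantize(
    cents, rounding=ROUND_HALF_UP
)
print(third)
```

Keeping the full scale in the intermediate result and rounding only on request is just one possible design; a type that fixed the result scale to that of its operands would answer the same question the other way.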
Most commonly one is trying to avoid rounding errors when dealing with money - a few cents rounding error tends to result in a few billable hours with the accountants at the end of the year! SQL dialects and type-safe languages would make you specify the precision of the variable to be assigned, so the issue does not arise for other languages. For the work I do, simply taking the precision of the most precise input (4.00L) would do the trick, but your answer (4.0000L) is purer. We should provide a rounding function, and in practice anyone using such a function would round (or floor, or ceiling) to get to the desired precision immediately. I'm not sure on division either but I'm sure there are precedents to look at. On the subject of adding new types to the standard library, what are the plans on dates and times? Would a cut-down mxDateTime ever be considered? It is fully Open Source (unlike mxODBC) and was designed for the DBAPI. Regards, Andy ===== Andy Robinson Robinson Analytics Ltd. ------------------ My opinions are the official policy of Robinson Analytics Ltd. They just vary from day to day. From guido at CNRI.Reston.VA.US Thu Dec 23 16:23:43 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 23 Dec 1999 10:23:43 -0500 Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types) In-Reply-To: Your message of "Thu, 23 Dec 1999 07:13:22 PST." <19991223151322.5698.qmail@web604.mail.yahoo.com> References: <19991223151322.5698.qmail@web604.mail.yahoo.com> Message-ID: <199912231523.KAA22232@eric.cnri.reston.va.us> > On the subject of adding new types to the standard > library, what are the plans on dates and times? Would > a cut-down mxDateTime ever be considered? It is fully > Open Source (unlike mxODBC) and was designed for the > DBAPI.
I don't know much about date/time types, or about mxDateTime. My intuition is that there are too many ways to do it, and that being compatible with commercial databases may not be the right way to do it for core Python. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake at acm.org Thu Dec 23 16:27:59 1999 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu, 23 Dec 1999 10:27:59 -0500 (EST) Subject: [Python-Dev] str(1L) -> '1' ? In-Reply-To: <38623493.E6BA6D6F@digicool.com> References: <38623493.E6BA6D6F@digicool.com> Message-ID: <14434.16255.58344.646524@weyr.cnri.reston.va.us> Jim Fulton writes: > In November there was an interesting discussion on comp.lang.python > about the meaning of __str__ and __repr__. One tidbit that came out > of this discussion was that __str__ for longs should drop the trailing > 'L'. Was there a decision on this? I'd really like this to happen. I liked that result as well, and thought about it just the other day. Luckily, you sent a note this morning and made me think about again. I'll have something checked into CVS shortly. ;) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From Mike.Da.Silva at uk.fid-intl.com Thu Dec 23 17:30:07 1999 From: Mike.Da.Silva at uk.fid-intl.com (Da Silva, Mike) Date: Thu, 23 Dec 1999 16:30:07 -0000 Subject: [Python-Dev] Fixed Decimal types Message-ID: Andy Robinson wrote: For the work I do, simply taking the precision of the most precise input (4.00L)would do the trick, but your answer (4.0000L) is purer. We should provide a rounding function, and in practice anyone using such a function would round (or floor, or ceiling) to get to the desired precision immediately. I'm not sure on division either but I'm sure there are precedents to look at. The AS400 provides a useful example of the right way to do scaled decimals. In the RPG programming language, all internal calculations (i.e. 
multiplication, division) are performed to the maximum precision of the intermediate result (in the multiplication example above, the intermediate result would be 4.0000L). When the intermediate result is assigned to the target scaled decimal number, the decimal precision is automatically extended or truncated to fit the target precision. One extra wrinkle in all of this is the option to "half-adjust" the intermediate value on assignment; that is, to apply automatic 5/4 rounding to the precision of the target. So, if the target field is defined as numeric(4,2), the result will be 4.00L. These are probably the kind of semantics that a scaled decimal type would require in Python also; i.e. allow unlimited precision in intermediate calculations, with a sensible set of rules for assignment to a variable of different scale and precision. However, unlike RPG, we should probably ensure that attempts to overflow or underflow the scale result in NaN or Overflow conditions, rather than assuming the user is right and losing the significant digits. Regards, Mike da Silva From jim at digicool.com Thu Dec 23 17:37:10 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 23 Dec 1999 11:37:10 -0500 Subject: [Python-Dev] Fixed-decimal types References: <386236E2.F97109D3@digicool.com> <199912231456.JAA22134@eric.cnri.reston.va.us> Message-ID: <38624FB6.ED903F@digicool.com> Guido van Rossum wrote: > > What would be scale of the product of two fixed-decimal numbers? > E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to 4.00L? There are > arguments for either. Same question for division (harder, I think). I'd be inclined to start by doing some research to see if some standard (SQL?) defines this somewhere. It would be nice if someone has already done the requirements work for us. :) > I like the idea of using the dd.ddL notation for this. > > I have no time to implement Me neither. > it but would not be unwilling to accept patches. Cool.
If no one else volunteers, then I'll try to find a way to get this done (not necessarily by me). I think it is pretty important. > They would have to be accompanied with a wet signature, see > http://www.python.org/1.5/wetsign.html. Yup. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From captainrobbo at yahoo.com Thu Dec 23 17:38:50 1999 From: captainrobbo at yahoo.com (=?iso-8859-1?q?Andy=20Robinson?=) Date: Thu, 23 Dec 1999 08:38:50 -0800 (PST) Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types) Message-ID: <19991223163850.15619.qmail@web604.mail.yahoo.com> Sorry, should have replied to the list... --- Andy Robinson wrote: > Date: Thu, 23 Dec 1999 08:37:18 -0800 (PST) > From: Andy Robinson > Reply-to: andy at robanal.demon.co.uk > Subject: Re: [Python-Dev] Date and timetypes (was: > Fixed-decimal types) > To: Guido van Rossum > > --- Guido van Rossum > wrote: > > I don't know much about date/time types, or about > > mxDateTime. > > My intuition is that there are too many ways to do > > it, and that being > > compatible with commercial databases may not be > the > > right way to do it > > for core Python. > > > > OK. Let me rephrase it. Say we form a consensus on > 'the right way'. Are you amenable to some solution > which goes back before 1970 and after 2038 going > into > the standard library? > > And does your answer change if it involves some > compiled code as well? > > I mention mxDateTime because it was agreed by a > Python > SIG, is mature and stable, and I find it very > useful. 
> And the core type is pretty small - much of the > helper > stuff in the package now could be kept separate from > the main Python distribution. > > - Andy ===== Andy Robinson Robinson Analytics Ltd. ------------------ My opinions are the official policy of Robinson Analytics Ltd. They just vary from day to day. From guido at CNRI.Reston.VA.US Thu Dec 23 17:42:33 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 23 Dec 1999 11:42:33 -0500 Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types) In-Reply-To: Your message of "Thu, 23 Dec 1999 08:38:50 PST." <19991223163850.15619.qmail@web604.mail.yahoo.com> References: <19991223163850.15619.qmail@web604.mail.yahoo.com> Message-ID: <199912231642.LAA22598@eric.cnri.reston.va.us> > > OK. Let me rephrase it. Say we form a consensus on 'the right > > way'. Are you amenable to some solution which goes back before > > 1970 and after 2038 going into the standard library? No problem. > > And does your answer change if it involves some > > compiled code as well? I'd rather not.
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip at mojam.com Thu Dec 23 18:05:52 1999 From: skip at mojam.com (Skip Montanaro) Date: Thu, 23 Dec 1999 11:05:52 -0600 (CST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib getopt.py,1.7,1.8 In-Reply-To: <199912212305.SAA13722@eric.cnri.reston.va.us> References: <199912212238.RAA13660@eric.cnri.reston.va.us> <14432.594.33416.600794@weyr.cnri.reston.va.us> <199912212305.SAA13722@eric.cnri.reston.va.us> Message-ID: <14434.22128.639699.738932@dolphin.mojam.com> Guido> (The next step would be to outlaw raise with a string argument; I Guido> think I can't make that for 1.6. But it would be a good idea to Guido> scan the standard library for string exceptions and convert all Guido> of them.) Agreed. I know Zope uses (at least, my Zope-using code uses) stuff like raise 'Redirect', url to map names onto HTTP response codes. Makes it easier on people to remember names instead of numeric codes. I suspect it will take the Zopers awhile to convert to using class-based exceptions if they haven't already. (For all I know I may be using a deprecated feature.) Skip From gvwilson at nevex.com Thu Dec 23 18:24:05 1999 From: gvwilson at nevex.com (gvwilson at nevex.com) Date: Thu, 23 Dec 1999 12:24:05 -0500 (EST) Subject: [Python-Dev] re: Open Source design competition / Python / software tools In-Reply-To: <38623909.CDF41014@digicool.com> Message-ID: Hi, everyone. I'm sending my reply to Jim's message to the whole python-dev list; I'll send follow-ups to individuals if people would prefer. > > * an issue tracking system to replace Gnats and Bugzilla; > > > > * a build system to replace make; > > > > * a platform inspection and configuration system to replace autoconf; > > and > > > > * a testing framework to replace XUnit, Expect, and DejaGnu. > Jim Fulton asked: > Are these categories fixed? 
For the first round, yes --- I have to prove that this model can solve small problems before I'll be given the funding to tackle larger ones, and I think that a UML modeling tool is definitely "large" :-). I also have to demonstrate uptake, and I think more people will adopt a sane replacement for Autoconf in the next 18 months than would adopt a UML modeler. However, decent Open Source CASE tools are very (very) high on my personal list --- if this works, I'd like to tackle them (along with providing support for DDD, and a few other things like that). Greg From gstein at lyra.org Thu Dec 23 19:26:44 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 23 Dec 1999 10:26:44 -0800 (PST) Subject: [Python-Dev] Re: Please test new dynamic load behavior In-Reply-To: <38620B04.7CC64485@trema.com> Message-ID: On Thu, 23 Dec 1999, Harri Pasanen wrote: > Greg Stein wrote: > > Hi all, > > > > I reorganized Python's dynamic load/import code over the past few days. > > Guido provided some feedback, I did some more mods, and now it is checked > > into CVS. The new loading behavior has been tested on Linux, IRIX, and > > Solaris (and probably Windows by now). > > ... > > What was the motivation behind this modification? Harri - With the new code structure, it is much easier to maintain Python's loading code. Each platform has its own file (e.g. dynload_aix.c) rather than being all jammed together into importdl.c. This isn't a huge win by itself, but does increase readability/maintainability. The big improvement, however, is when you are adding support for new platforms or loading mechanisms. A new dynload_*.c can be written and one line added to configure.in, and you're done. No need to make importdl.c even uglier.
(actually, importdl.c no longer contains *any* platform specific code; it has all been moved to the dynload_*.c files) Cheers, -g -- Greg Stein, http://www.lyra.org/ From jim at digicool.com Thu Dec 23 20:39:37 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 23 Dec 1999 14:39:37 -0500 Subject: [Python-Dev] Fixed-decimal types References: <386236E2.F97109D3@digicool.com> <199912231456.JAA22134@eric.cnri.reston.va.us> <38624FB6.ED903F@digicool.com> Message-ID: <38627A79.BF379672@digicool.com> Jim Fulton wrote: > > Guido van Rossum wrote: > > > > What would be scale of the product of two fixed-decimal numbers? > > E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to 4.00L? There are > > arguments for either. Same question for division (harder, I think). > > I'd be inclined to start by doing some research to see if some standard > (SQL?) defines this somewhere. It would be nice if someone has already > done the requirements work for us. :) Here is what the book "SQL-99 Complete, Really" says that the SQL standard says: - for addition and subtraction of two "exact" (fixed-decimal) numbers, the result has the maximum of the scales. - for multiplication of two "exact" (fixed-decimal) numbers, the result has the sum of the scales. - punts on division - for addition, subtraction, multiplication or division between "exact" (fixed point) and "approximate" (floating point) yields an approximate result. This means that fixed-decimal coerces to float. I'm curious to see who else chips in with examples from other systems. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. 
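[Editorial illustration, not part of the original thread.] The SQL scale rules Jim summarizes above (addition and subtraction take the maximum of the two scales, multiplication takes the sum) can be sketched in a few lines of present-day Python. The class name `Fixed` and its methods are hypothetical, invented here for illustration only; they are not the proposed `dd.ddL` type or any existing API. Division is left out, since the summary above says the standard punts on it.

```python
class Fixed:
    """Hypothetical fixed-decimal value: an integer count of 10**-scale units."""

    def __init__(self, unscaled, scale):
        self.unscaled = unscaled  # e.g. 200 for "2.00"
        self.scale = scale        # digits after the decimal point, e.g. 2

    @classmethod
    def parse(cls, text):
        # "2.00" -> unscaled=200, scale=2; "3" -> unscaled=3, scale=0
        whole, _, frac = text.partition(".")
        return cls(int(whole + frac), len(frac))

    def _rescaled(self, scale):
        # Express this value in units of 10**-scale (scale >= self.scale).
        return self.unscaled * 10 ** (scale - self.scale)

    def __add__(self, other):
        # Addition: the result has the maximum of the two scales.
        scale = max(self.scale, other.scale)
        return Fixed(self._rescaled(scale) + other._rescaled(scale), scale)

    def __sub__(self, other):
        # Subtraction: same scale rule as addition.
        scale = max(self.scale, other.scale)
        return Fixed(self._rescaled(scale) - other._rescaled(scale), scale)

    def __mul__(self, other):
        # Multiplication: the result has the sum of the two scales.
        return Fixed(self.unscaled * other.unscaled, self.scale + other.scale)

    def __str__(self):
        if self.scale == 0:
            return str(self.unscaled)
        sign = "-" if self.unscaled < 0 else ""
        digits = str(abs(self.unscaled)).rjust(self.scale + 1, "0")
        return f"{sign}{digits[:-self.scale]}.{digits[-self.scale:]}"


print(Fixed.parse("2.00") * Fixed.parse("2.00"))  # 4.0000
print(Fixed.parse("3.1") + Fixed.parse("2.01"))   # 5.11
```

Under these rules Guido's example 2.00L * 2.00L comes out as 4.0000, not 4.00; rounding back to a target scale on assignment, as in the RPG description, would be a separate step that this sketch does not attempt.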
From jim at digicool.com Thu Dec 23 20:43:41 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 23 Dec 1999 14:43:41 -0500 Subject: [Python-Dev] Fixed Decimal types References: Message-ID: <38627B6D.447A9553@digicool.com> "Da Silva, Mike" wrote: > > Andy Robinson wrote: > For the work I do, simply taking the precision of the > most precise input (4.00L)would do the trick, but your > answer (4.0000L) is purer. We should provide a > rounding function, and in practice anyone using such a > function would round (or floor, or ceiling) to get to > the desired precision immediately. > > I'm not sure on division either but I'm sure there are > precedents to look at. > > The AS400 provides a useful example of the right way to do scaled > decimals. > > In the RPG programming language, all internal calculations (i.e. > multiplication, division) are performed to the maximum precision of the > intermediate result (in the multiplication example below), the intermediate > result would be 4.0000L. When the intermediate result is assigned to the > target scaled decimal number, the decimal precision is automatically > extended or truncated to fit the target precision. One extra wrinkle in all > of this is the option to "half-adjust" the intermediate value on assignment; > that is to apply automatic 5/4 rounding to the precision of the target. Yee ha! This is great input. Anyone have any other examples of what any other systems do? Anyone got a PL/I manual handy. ;) > So, if the target field is defined as numeric(4,2), the result will > be 4.00L. Since Python doesn't have types values, this is not an issue internally, but would be an issue when binding to external databases. > These are probably the kind of semantics that a scaled decimal type > would require in Python also; i.e. allow unlimited precision in intermediate > calculations, with a sensible set of rules for assignment to a variable of > different scale and precision. 
> > However, unlike RPG, we should probably ensure that attempts to > overflow or underflow the scale result in NaN or Overflow conditions, rather > than assuming the user is right and losing the significant digits. Since this would be based on infinite-precision numbers, I don't think that this would be an issue. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From guido at CNRI.Reston.VA.US Thu Dec 23 20:44:36 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 23 Dec 1999 14:44:36 -0500 Subject: [Python-Dev] Fixed-decimal types In-Reply-To: Your message of "Thu, 23 Dec 1999 14:39:37 EST." <38627A79.BF379672@digicool.com> References: <386236E2.F97109D3@digicool.com> <199912231456.JAA22134@eric.cnri.reston.va.us> <38624FB6.ED903F@digicool.com> <38627A79.BF379672@digicool.com> Message-ID: <199912231944.OAA23337@eric.cnri.reston.va.us> Jim Fulton wrote: > - for addition and subtraction of two "exact" (fixed-decimal) > numbers, the result has the maximum of the scales. One could argue that this is incorrect: if "3.1" means that I know the value to one decimal of precision, and "2.01" means that I know that value to two decimals of precision, stating the result of their sum as "5.11" suggests that I know the result to two decimals of precision, which is of course false: because I only knew one decimal of precision for one of the operands, I only know (at most!) one decimal of precision for the result. Not arguing for this interpretation, just indicating that doing fixed precision arithmetic right is hard. 
I'm waiting for Tim Peters' contribution, but he's on vacation so it may be a while. --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm at hypernet.com Thu Dec 23 21:48:56 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Thu, 23 Dec 1999 15:48:56 -0500 Subject: [Python-Dev] Fixed Decimal types In-Reply-To: <38627B6D.447A9553@digicool.com> Message-ID: <1266141247-31971518@hypernet.com> Jim Fulton wrote: > "Da Silva, Mike" wrote: [AS400 RPG rules...] > Yee ha! This is great input. Anyone have any other examples of > what any other systems do? Anyone got a PL/I manual handy. ;) From jim at digicool.com Thu Dec 23 23:18:37 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 23 Dec 1999 17:18:37 -0500 Subject: [Python-Dev] re: Open Source design competition / Python /software tools References: Message-ID: <38629FBD.3B8F47D4@digicool.com> gvwilson at nevex.com wrote: > > Hi, everyone. I'm sending my reply to Jim's message to the whole > python-dev list; I'll send follow-ups to individuals if people would > prefer. > > > > * an issue tracking system to replace Gnats and Bugzilla; > > > > > > * a build system to replace make; > > > > > > * a platform inspection and configuration system to replace autoconf; > > > and > > > > > > * a testing framework to replace XUnit, Expect, and DejaGnu. > > > Jim Fulton asked: > > Are these categories fixed? > > For the first round, yes OK. >--- I have to prove that this model can solve > small problems before I'll be given the funding to tackle larger ones, and > I think that a UML modeling tool is definitely "large" :-). Well, since you gave rational ..... :) Isn't the Open Source community especially good at large problems? Note that I'm thinking more in terms of an open source UML community of tools, based around an existing repository rather than on a single monolithic tool. I envision a community of diagramming and other small tools orbiting Zope or ZODB. 
The hardest part of a UML tool is the repository, and I think we've mostly got that. I think that what the Open Source community desperately needs are tools for managing and sharing the most important artifacts in the development process. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From gstein at lyra.org Fri Dec 24 01:09:29 1999 From: gstein at lyra.org (Greg Stein) Date: Thu, 23 Dec 1999 16:09:29 -0800 (PST) Subject: [Python-Dev] re: Open Source design competition / Python /software tools In-Reply-To: <38629FBD.3B8F47D4@digicool.com> Message-ID: On Thu, 23 Dec 1999, Jim Fulton wrote: > gvwilson at nevex.com wrote: >... > >--- I have to prove that this model can solve > > small problems before I'll be given the funding to tackle larger ones, and > > I think that a UML modeling tool is definitely "large" :-). > > Well, since you gave rational ..... :) > > > Isn't the Open Source community especially good at large problems? Very true, I agree, but part of Greg's problem is "proving" that to the DoE. Somebody has said those four problems are sufficient to do so, and (probably) because they are reasonably constrained to allow completion within a specified timeframe. > Note that I'm thinking more in terms of an open source UML community > of tools, based around an existing repository rather than on a single > monolithic tool. I envision a community of diagramming and other small > tools orbiting Zope or ZODB. The hardest part of a UML tool is the > repository, and I think we've mostly got that. Greg's proposal is quite specific. 
"A community" isn't, so it might not help to create a proof to the DoE (otherwise, they could look at the Zope community, or other communities!). Jim: there isn't anything stopping or impeding the creation of an Open Source community for UML modeling. This DoE competition won't affect that... Happy Holidays, -g -- Greg Stein, http://www.lyra.org/ From jim at digicool.com Fri Dec 24 01:27:53 1999 From: jim at digicool.com (Jim Fulton) Date: Thu, 23 Dec 1999 19:27:53 -0500 Subject: [Python-Dev] re: Open Source design competition / Python /softwaretools References: Message-ID: <3862BE09.9AF62090@digicool.com> Greg Stein wrote: > (snip) > Jim: there isn't anything stopping or impeding the creation of an Open > Source community for UML modeling. Of course not. > This DoE competition won't affect that... Perhaps it could help it. > Happy Holidays, You too. Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From ping at lfw.org Fri Dec 24 09:55:28 1999 From: ping at lfw.org (Ka-Ping Yee) Date: Fri, 24 Dec 1999 00:55:28 -0800 (PST) Subject: [Python-Dev] re: Open Source design competition / Python / software tools In-Reply-To: Message-ID: On Wed, 22 Dec 1999 gvwilson at nevex.com wrote: > To kick-start things, we're going to be holding a two-round design > competition. Anyone (individual or team, professional or student) can > submit a short entry for the first round; the judges will pick four > candidates to go forward in each of four categories, and those > individuals or teams will be asked to submit full entries. 
The four > categories are: > > * an issue tracking system to replace Gnats and Bugzilla; Hi there. At ILM we've been using a system that i hacked up quickly in Python called "Roundup". It has a number of interesting properties that have made it really useful to us, and arguably better than any of the existing open-source bug-tracking things out there that i know of. It is not just a Web app; it lives between the Web and e-mail, because we do so much of our communication that way. For example, each request item gets its own virtual mailing list, updated on the fly without the need for explicit subscription (if you cc: somebody while discussing the bug, they get subscribed). Empirically i've discovered that unsubscription is actually unnecessary (!) because conversation will stop on a topic when it gets resolved or when it ceases to be interesting. These are fine-grained discussion lists on a per-topic level. This is just to let you know i'm interested. I'm currently asking for permission to open-source Roundup; if it can't be done, or doesn't happen quickly enough, i'll just have to take a weekend and rewrite the thing. There were a few things i wanted to fix anyway. -- ?!ng "You should either succeed gloriously or fail miserably. Just getting by is the worst thing you can do." -- Larry Smith From marangoz at python.inrialpes.fr Fri Dec 24 13:07:05 1999 From: marangoz at python.inrialpes.fr (Vladimir Marangozov) Date: Fri, 24 Dec 1999 13:07:05 +0100 (CET) Subject: [Python-Dev] Exceptions In-Reply-To: <199912221823.NAA16517@eric.cnri.reston.va.us> from "Guido van Rossum" at Dec 22, 1999 01:23:45 PM Message-ID: <199912241207.NAA18783@python.inrialpes.fr> Guido van Rossum wrote: > > Vladimir.Marangozov at inrialpes.fr: > > > Yes. Besides, I still think that string-based exceptions are just > > convenient for quick & dirty, throw-away test scripts. 
> > They have a hard-to-understand quirk though: the id() of the string is > used to check rather than its value, so that except "foo" doesn't > necessarily catch raise "foo"; but due to various optimizations, this > usually works, and people get bent out of shape when it doesn't. Which brings 2 important questions: 1. In the long run, which one is better -- compare and check exceptions by reference (by name) or by value? (currently, this is done by reference on predefined object types: strings, classes or instances) I'd say, exceptions have to be compared (caught) by value, i.e. use "e1 == e2" instead of "e1 is e2". 2. Should we limit the exception "types"? I'd say, no. My Pythonic view of things says that we raise "objects", be they classes, instances, strings or, why not, ints. However, if one wants to put some order in the "unordered set" of exceptions s/he uses, then classes is the way to do it, because classes were given some nice properties, like inheritance, that allow to group and to organize logically the objects we throw and catch as exceptions (+ other bonus properties coming from classes). Note that conceptually, when we say "strings and ints", we have in mind "string instances and int instances", whose "classes" are written in C. When there will be String and Int classes of some sort as first class objects, then we'll fall back to the terminology: Exceptions can be classes or instances. If point 1 and (optionally) point 2 is implemented, the hard-to-understand quirk wouldn't be an issue and string-based exceptions would have a legal reason to stay and live. > Since you have to give your exception a name, how hard is it to say > > class MyError(Exception): pass > > rather than > > MyError = "MyError" > > ? You know what I think about "names"...
I may have defined my exception conventions and be interested in catching an exception named 404, implying that "a 404 bobo" occurred deeply in my code ("deeply in my code" meaning for example: database 4, service 0, customer group 4, or just a standard HTTP "Code 404 - Not Found".) Pushing this to the extreme to catapult your thoughts into the next millennium. :) and to emphasize the importance of discussing and answering objectively the above questions 1) and 2). -- Vladimir MARANGOZOV | Vladimir.Marangozov at inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From mal at lemburg.com Fri Dec 24 12:03:37 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 24 Dec 1999 12:03:37 +0100 Subject: [Python-Dev] str(1L) -> '1' ? References: <38623493.E6BA6D6F@digicool.com> <199912231446.JAA22086@eric.cnri.reston.va.us> Message-ID: <38635309.2AEFF18D@lemburg.com> Guido van Rossum wrote: > > [Jim F] > > In November there was an interesting discussion on comp.lang.python > > about the meaning of __str__ and __repr__. One tidbit that came out > > of this discussion was that __str__ for longs should drop the trailing > > 'L'. Was there a decision on this? I'd really like this to happen. > > Yes, I'd like it to happen. I'd also like repr() of a float to return > the full precision (using the "%.17g" sprintf format). While we're at it: how about adding a PyLong_AsString() API to the C interface ? I currently use PyObject_Str() in mxODBC and then slice off the 'L' -- not very elegant. A PyLong_AsString() API would much better suit the task. Merry Christmas, -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 7 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal at lemburg.com Fri Dec 24 12:11:29 1999 From: mal at lemburg.com (M.-A.
Lemburg) Date: Fri, 24 Dec 1999 12:11:29 +0100 Subject: [Python-Dev] Date and timetypes (was: Fixed-decimal types) References: <19991223163850.15619.qmail@web604.mail.yahoo.com> <199912231642.LAA22598@eric.cnri.reston.va.us> Message-ID: <386354E1.DA560F42@lemburg.com> Guido van Rossum wrote: > > > > OK. Let me rephrase it. Say we form a consensus on 'the right > > > way'. Are you amenable to some solution which goes back before > > > 1970 and after 2038 going into the standard library? > > No problem. > > > > And does your answer change if it involves some > > > compiled code as well? > > I'd rather not. As far as mxDateTime goes, I'd rather not see it in the core distribution. Including the mx stuff in a separate PythonPowerTools distribution would be cool though. For a start in this direction see e.g.: http://startship.skyport.net/~lemburg/PPowerTools-0.2.zip Note that I'll wrap all my mx extensions into a new mx package which will come in several flavours next year. There will no longer be separate packages due to the various naming collisions and to enable intra-mx-package dependencies. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 7 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From captainrobbo at yahoo.com Fri Dec 24 13:22:29 1999 From: captainrobbo at yahoo.com (=?iso-8859-1?q?Andy=20Robinson?=) Date: Fri, 24 Dec 1999 04:22:29 -0800 (PST) Subject: [Python-Dev] Fixed Decimal types Message-ID: <19991224122229.23506.qmail@web606.mail.yahoo.com> > >> However, unlike RPG, we should probably ensure > >> that attempts to overflow or underflow the scale > >> result in NaN or Overflow conditions, rather > >> than assuming the user is right and losing > >> the significant digits. > > > Since this would be based on infinite-precision > numbers, I don't > > think that this would be an issue. 
Three very general observations before I disappear for Christmas: (1) I think there is great mileage in combining the fixed-decimal concept with Martin Fowler's Quantity pattern, so that a variable could be defined as not just two decimal places but also (say) "GBP" or "USD", and it would be an error to add the two. Same applies for adding metres, kilograms and other quantities. There has also been discussion that the 'type' of a quantity should determine what math should apply. (2) If Python is going to be used increasingly in eCommerce, it should be good at dealing with money - maybe not in the core language, but we should aim for one standard package. (3) We have a python-finance list (python-finance at egroups.com), recently generalized to cover business systems, which is a good place to discuss this if anyone wants to. There are people there who have time, would love to prototype something (indeed some work started in this area 3 months back), and would use it at work too. This would be an ideal first target for that group - or indeed for a finance-sig. I'll pursue this in the New Year. Merry Christmas, Andy ===== Andy Robinson Robinson Analytics Ltd. ------------------ My opinions are the official policy of Robinson Analytics Ltd. They just vary from day to day.
From jack at oratrix.nl Fri Dec 24 13:34:28 1999 From: jack at oratrix.nl (Jack Jansen) Date: Fri, 24 Dec 1999 13:34:28 +0100 Subject: [Python-Dev] Fixed Decimal types In-Reply-To: Message by =?iso-8859-1?q?Andy=20Robinson?= , Fri, 24 Dec 1999 04:22:29 -0800 (PST) , <19991224122229.23506.qmail@web606.mail.yahoo.com> Message-ID: <19991224123428.5BA9F370CF2@snelboot.oratrix.nl> > (1) I think there is great mileage in combining the > fixed-decimal concept with Martin Fowler's Quantity > pattern, so that a variable could be defined as not > just two decimal places but also (say) "GBP" or "USD", > and it would be an error to add the two. Same applies > for adding metres, kilograms and other quantities. > There has also been discussion that the 'type' of a > quantity should determine what math should apply. Isn't this something that is ideally suited for implementation in a Python module, based on a core implementation of fixed decimal numbers? -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From gstein at lyra.org Fri Dec 24 21:05:22 1999 From: gstein at lyra.org (Greg Stein) Date: Fri, 24 Dec 1999 12:05:22 -0800 (PST) Subject: [Python-Dev] Fixed Decimal types In-Reply-To: <19991224123428.5BA9F370CF2@snelboot.oratrix.nl> Message-ID: On Fri, 24 Dec 1999, Jack Jansen wrote: > > (1) I think there is great mileage in combining the > > fixed-decimal concept with Martin Fowler's Quantity > > pattern, so that a variable could be defined as not > > just two decimal places but also (say) "GBP" or "USD", > > and it would be an error to add the two. Same applies > > for adding metres, kilograms and other quantities. > > There has also been discussion that the 'type' of a > > quantity should determine what math should apply.
> > Isn't this something that is ideally suited for implementation in a Python > module, based on a core implementation of fixed decimal numbers? I'd agree with Jack here. The "simple" change of a scale for the Long values is nice. Starting to lump in features like this begins to get a little messier... Happy Holidays, -g -- Greg Stein, http://www.lyra.org/ From gstein at lyra.org Fri Dec 24 21:13:50 1999 From: gstein at lyra.org (Greg Stein) Date: Fri, 24 Dec 1999 12:13:50 -0800 (PST) Subject: [Python-Dev] str(1L) -> '1' ? In-Reply-To: <38635309.2AEFF18D@lemburg.com> Message-ID: On Fri, 24 Dec 1999, M.-A. Lemburg wrote: > Guido van Rossum wrote: > > [Jim F] > > > In November there was an interesting discussion on comp.lang.python > > > about the meaning of __str__ and __repr__. One tidbit that came out > > > of this discussion was that __str__ for longs should drop the trailing > > > 'L'. Was there a decision on this? I'd really like this to happen. > > > > Yes, I'd like it to happen. I'd also like repr() of a float to return > > the full precision (using the "%.17g" sprintf format). > > While we're at it: how about adding a PyLong_AsString() API > to the C interface ? I currently use PyObject_Str() in mxODBC > and then slice off the 'L' -- not very elegant. A PyLong_AsString() > API would much better suit the task. Fred just checked in a change yesterday. PyObject_Str() on a Long no longer includes the 'L'. You're going to need to update your code :-) [ I've got some here and there to fix, too, with the idiom: if type(v) is type(1L): return str(v)[:-1] ] Cheers, -g -- Greg Stein, http://www.lyra.org/ From mal at lemburg.com Sun Dec 26 23:29:28 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Sun, 26 Dec 1999 23:29:28 +0100 Subject: [Python-Dev] str(1L) -> '1' ? References: Message-ID: <386696C8.6EBBF428@lemburg.com> Greg Stein wrote: > > On Fri, 24 Dec 1999, M.-A. 
Lemburg wrote: > > While we're at it: how about adding a PyLong_AsString() API > > to the C interface ? I currently use PyObject_Str() in mxODBC > > and then slice off the 'L' -- not very elegant. A PyLong_AsString() > > API would much better suit the task. > > Fred just checked in a change yesterday. PyObject_Str() on a Long no > longer includes the 'L'. Ah, ok... scanning the patches: they don't provide an externed C interface... I would like to have such a beast if possible (basically, the new long_format() as PyLong_AsString()). > You're going to need to update your code :-) > [ I've got some here and there to fix, too, with the idiom: > if type(v) is type(1L): return str(v)[:-1] > ] Your above example will effectively divide the long value by 10 which will probably break things in very subtle ways... hmm, this change ought to be made *very* visible to people upgrading to 1.6, IMHO. I'll fix mxODBC to only truncate the string value iff the 'L' is present. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 5 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From andy at robanal.demon.co.uk Mon Dec 27 11:43:17 1999 From: andy at robanal.demon.co.uk (Andy Robinson) Date: Mon, 27 Dec 1999 10:43:17 GMT Subject: [Python-Dev] Fixed Decimal types In-Reply-To: References: Message-ID: <38674259.5377973@post.demon.co.uk> On Fri, 24 Dec 1999 12:05:22 -0800 (PST), you wrote: >On Fri, 24 Dec 1999, Jack Jansen wrote: >> > (1) I think there is great mileage in combining the >> > fixed-decimal concept with Martin Fowler's Quantity >> > pattern, so that a variable could be defined as not >> > just two decimal places but also (say) "GBP" or "USD", >> > and it would be an error to add the two. Same applies >> > for adding metres, kilograms and other quantities. >> > There has also been discussion that the 'type' of a >> > quantity should determine what math should apply. 
>> >> Isn't this something that is ideally suited for implementation in a Python >> module, based on a core implementation of fixed decimal numbers? > >I'd agree with Jack here. > Me too - I thought I said that in point 2, but in retrospect I didn't say it clearly enough :-) - Andy From gstein at lyra.org Mon Dec 27 12:31:29 1999 From: gstein at lyra.org (Greg Stein) Date: Mon, 27 Dec 1999 03:31:29 -0800 (PST) Subject: [Python-Dev] str(1L) -> '1' ? In-Reply-To: <386696C8.6EBBF428@lemburg.com> Message-ID: On Sun, 26 Dec 1999, M.-A. Lemburg wrote: > Greg Stein wrote: > > On Fri, 24 Dec 1999, M.-A. Lemburg wrote: > > > While we're at it: how about adding a PyLong_AsString() API > > > to the C interface ? I currently use PyObject_Str() in mxODBC > > > and then slice off the 'L' -- not very elegant. A PyLong_AsString() > > > API would much better suit the task. > > > > Fred just checked in a change yesterday. PyObject_Str() on a Long no > > longer includes the 'L'. > > Ah, ok... scanning the patches: they don't provide an externed > C interface... I would like to have such a beast if possible > (basically, the new long_format() as PyLong_AsString()). What's wrong with PyObject_Str()? I don't see a need for Yet Another Entry Point. > > You're going to need to update your code :-) > > [ I've got some here and there to fix, too, with the idiom: > > if type(v) is type(1L): return str(v)[:-1] > > ] > > Your above example will effectively divide the long value by 10 > which will probably break things in very subtle ways... hmm, this Yah :-( Not a lot of fun, but I think for the best. > change ought to be made *very* visible to people upgrading to > 1.6, IMHO. Yes. Cheers, -g -- Greg Stein, http://www.lyra.org/ From mal at lemburg.com Mon Dec 27 13:51:36 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Mon, 27 Dec 1999 13:51:36 +0100 Subject: [Python-Dev] str(1L) -> '1' ? 
References: Message-ID: <386760D8.E897FADF@lemburg.com> Greg Stein wrote: > > On Sun, 26 Dec 1999, M.-A. Lemburg wrote: > > Greg Stein wrote: > > > On Fri, 24 Dec 1999, M.-A. Lemburg wrote: > > > > While we're at it: how about adding a PyLong_AsString() API > > > > to the C interface ? I currently use PyObject_Str() in mxODBC > > > > and then slice off the 'L' -- not very elegant. A PyLong_AsString() > > > > API would much better suit the task. > > > > > > Fred just checked in a change yesterday. PyObject_Str() on a Long no > > > longer includes the 'L'. > > > > Ah, ok... scanning the patches: they don't provide an externed > > C interface... I would like to have such a beast if possible > > (basically, the new long_format() as PyLong_AsString()). > > What's wrong with PyObject_Str()? I don't see a need for Yet Another Entry > Point. What's wrong with a rich C API :-) ? The long_format function would be very useful for programs interacting with other software at C level. Making it external would give the programmer the ability to pass long string representations in any base to other programs, which is very useful for e.g. database interaction or crypto software. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 4 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From bkc at murkworks.com Mon Dec 27 23:04:25 1999 From: bkc at murkworks.com (Brad Clements) Date: Mon, 27 Dec 1999 17:04:25 -0500 Subject: [Python-Dev] Re: [PSA MEMBERS] Re: Please test new dynamic load behavior In-Reply-To: References: <38620B04.7CC64485@trema.com> Message-ID: <199912272204.RAA26173@anvil.murkworks.com> On 23 Dec 99, at 10:26, Greg Stein wrote: > > > I reorganized Python's dynamic load/import code over the past few days. > > > Gudio provided some feedback, I did some more mods, and now it is checked > > > into CVS. 
The new loading behavior has been tested on Linux, IRIX, and
> > > Solaris (and probably Windows by now).

FYI, I downloaded the import stuff from CVS and used it in my port of Python to NetWare. Good timing, as I was just tackling dynamic loading on NetWare when I saw your message. The new scheme is much better, and works for me. Though I do need to add some special "un-import" code similar to what BEOS does.

Brad Clements, bkc at murkworks.com (315)268-1000
http://www.murkworks.com (315)268-9812 Fax
netmeeting: ils://ils.murkworks.com AOL-IM: BKClements

From skip at mojam.com Tue Dec 28 22:41:33 1999
From: skip at mojam.com (Skip Montanaro)
Date: Tue, 28 Dec 1999 15:41:33 -0600
Subject: [Python-Dev] Better text processing support in py2k?
Message-ID: <199912282141.PAA31426@dolphin.mojam.com>

It just occurred to me as I was replying to a request on the main list, that Python's text handling capabilities could be a bit better than they are. This will probably not come as a revelation to many of you, but I finally put it together with the standard argument against beefing things up

    One fix would be to add regular expressions to the language core and have special syntax for them, as Perl has done. However, I don't like this solution because Python is a general-purpose language, and regular expressions are used for the single application domain of text processing. For other application domains, regular expressions may be of no interest, and you might want to remove them to save memory and code size.

and the observation that Python does support some builtin objects and syntax that are fairly specific to some much more restricted application domains than text processing.

I stole the above quote from Andrew Kuchling's Python Warts page, which I also happened to read earlier today. What AMK says makes perfect sense until you examine some of the other things that are in the language, like the Ellipsis object and complex numbers.
If I recall correctly both were added as a result of the NumPy package development. I have nothing against ellipses or complex numbers. They are fine first class objects that should remain in the language. But I have never used either one in my day-to-day work. On the other hand, I read files and manipulate them with regular expressions all the time. I rather suspect that more people use Python for some sort of text processing than any other single application domain. Python should be good at it. While I don't want to turn Python into Perl, I would like to see it do a better job of what most people probably use the language for. Here is a very short list of things I think need attention: 1. When using something like the simple file i/o idiom for line in f.readlines(): dofunstuff(line) the programmer should not have to care how big the file is. It should just work in a reasonably efficient manner without gobbling up all of memory. I realize this may require some change to the syntax of the common idiom. 2. The re module needs to be sped up, if not to catch up with Perl, then to catch up with the deprecated regex module. Depending how far people want to go with things, adding some language syntax to support regular expressions might be in order. I don't see that as compelling as adding complex numbers however. Another possibility, now that Barry Warsaw has opened the floodgates, is to add regular expression methods to strings. 3. I've not yet used it, but I am told the pattern matching in Marc-Andre Lemburg's mxTextTools (http://starship.python.net/crew/lemburg/) is both powerful and efficient (though it certainly appears complex). Perhaps it deserves consideration for incorporation into the core Python distribution. I'm sure other people will come up with other suggestions. Skip Montanaro | http://www.mojam.com/ skip at mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... 
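A minimal sketch of what item 1 in the list above asks for: a single loop that reads in bounded chunks, so the caller does not need to know how big the file is. It assumes a present-day Python with generators, which postdate this thread; `buffered_lines` and its `sizehint` default are hypothetical names chosen for illustration, not an existing library API. The only thing relied upon is the `readlines(hint)` method of file objects.

```python
import io


def buffered_lines(f, sizehint=64 * 1024):
    """Yield lines from the open file object f, one at a time.

    Lines are fetched in batches of roughly sizehint bytes/characters via
    f.readlines(sizehint), so memory use stays bounded for large files.
    sizehint must be positive; zero or a negative value would make
    readlines() read the whole file.
    """
    while True:
        batch = f.readlines(sizehint)
        if not batch:          # empty list means end of file
            break
        for line in batch:
            yield line


# Hypothetical usage: the loop body looks the same for small and huge files.
sample = io.StringIO("first\nsecond\nthird\n")
collected = []
for line in buffered_lines(sample, sizehint=8):
    collected.append(line.rstrip("\n"))
```

The design choice here is to hide the batching inside the helper, so the calling code keeps the shape of the simple `for line in ...` idiom.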
From akuchlin at mems-exchange.org Tue Dec 28 23:00:11 1999 From: akuchlin at mems-exchange.org (Andrew M. Kuchling) Date: Tue, 28 Dec 1999 17:00:11 -0500 (EST) Subject: [Python-Dev] Better text processing support in py2k? In-Reply-To: <199912282141.PAA31426@dolphin.mojam.com> References: <199912282141.PAA31426@dolphin.mojam.com> Message-ID: <14441.13035.802146.730160@amarok.cnri.reston.va.us> Skip Montanaro writes: >What AMK says makes perfect sense until you examine some of the other things >that are in the language, like the Ellipsis object and complex numbers. If >I recall correctly both were added as a result of the NumPy package >development. True, but note that you can compile Python with WITHOUT_COMPLEX defined to remove complex numbers. > 1. When using something like the simple file i/o idiom > for line in f.readlines(): > dofunstuff(line) > the programmer should not have to care how big the file is. What about 'for line in fileinput.input()', which already exists? (Hmmm... if you have an already open file object, I don't think you can pass it to fileinput.input(); maybe that should be fixed.) On a vaguely related note, since there are many things like parser generators and XML stuff and mxTextTools, I've been speculating about a text processing topic guide. If you know of Python packages related to text processing, please send me a private e-mail with a link. -- A.M. Kuchling http://starship.python.net/crew/amk/ Constraints often boost creativity. -- Jim Hugunin, 11 Feb 1999 From skip at mojam.com Tue Dec 28 23:26:53 1999 From: skip at mojam.com (Skip Montanaro) Date: Tue, 28 Dec 1999 16:26:53 -0600 (CST) Subject: [Python-Dev] Better text processing support in py2k? 
In-Reply-To: <14441.13035.802146.730160@amarok.cnri.reston.va.us>
References: <199912282141.PAA31426@dolphin.mojam.com> <14441.13035.802146.730160@amarok.cnri.reston.va.us>
Message-ID: <14441.14637.682862.999776@dolphin.mojam.com>

Andrew> True, but note that you can compile Python with WITHOUT_COMPLEX
Andrew> defined to remove complex numbers.

That's true, but that wasn't my point. I'm not arguing for or against space efficiency, just that the rather timeworn argument about not doing anything special to support text processing because Python is a general purpose language is a red herring.

>> 1. When using something like the simple file i/o idiom
>>        for line in f.readlines():
>>            dofunstuff(line)
>>    the programmer should not have to care how big the file is.

Andrew> What about 'for line in fileinput.input()', which already
Andrew> exists? (Hmmm... if you have an already open file object, I
Andrew> don't think you can pass it to fileinput.input(); maybe that
Andrew> should be fixed.)

Well, a couple reasons jump to mind:

1. fileinput.FileInput isn't particularly efficient. At its heart, its __getitem__ method makes a simple readline() call instead of buffering some amount of readlines(sizehint) bytes. This can be fixed, but I'm not sure what would happen to its semantics.

2. As you pointed out, it's not all that general.

My point, not at all well stated, is that the programmer shouldn't have to worry (much?) about the conditions under which he does file i/o. Right now, if I know the file is small(ish), I can do

    for line in f.readlines():
        dofunstuff(line)

but I have to know that the file won't be big, because readlines() will behave badly (perhaps even generate a MemoryError exception) if the file is large.
In that case, I have to fall back to the safer (and slower)

    line = f.readline()
    while line:
        dofunstuff(line)
        line = f.readline()

or the more efficient, but more cumbersome

    lines = f.readlines(sizehint)
    while lines:
        for line in lines:
            dofunstuff(line)
        lines = f.readlines(sizehint)

That's three separate idioms the programmer has to be aware of when writing code to read a text file based upon the perceived need for speed, memory usage and desired clarity:

    fast/memory-intensive/clear
    slow/memory-conserving/not-as-clear
    fast/memory-conserving/fairly-muddy

Any particular reason that the readline method can't return an iterator that supports __getitem__ and buffers input? (Again, remember this is for py2k, so the potential breakage such a change might cause is a consideration, but not a showstopper.)

Andrew> On a vaguely related note, since there are many things like
Andrew> parser generators and XML stuff and mxTextTools, I've been
Andrew> speculating about a text processing topic guide. If you know of
Andrew> Python packages related to text processing, please send me a
Andrew> private e-mail with a link.

This sounds like a good idea to me.

Skip Montanaro | http://www.mojam.com/
skip at mojam.com | http://www.musi-cal.com/
847-971-7098 | Python: Programming the way Guido indented...

From captainrobbo at yahoo.com Wed Dec 29 09:34:43 1999
From: captainrobbo at yahoo.com (=?iso-8859-1?q?Andy=20Robinson?=)
Date: Wed, 29 Dec 1999 00:34:43 -0800 (PST)
Subject: [Python-Dev] Better text processing support in py2k?
Message-ID: <19991229083443.27817.qmail@web6005.mail.yahoo.com>

--- Skip Montanaro wrote:
> fast/memory-intensive/clear
> slow/memory-conserving/not-as-clear
> fast/memory-conserving/fairly-muddy
>
> Any particular reason that the readline method can't
> return an iterator that
> supports __getitem__ and buffers input? (Again,
> remember this is for py2k,
> so the potential breakage such a change might cause
> is a consideration, but
> not a showstopper.)
Why not generalize fileinput to do buffering instead?

More generally, Java has the notion of 'stackable streams' - e.g. construct a 'BufferedFile' around a 'File', maybe construct a 'Line-oriented file' around that etc. Each one takes a file-like object as an argument to the constructor. Things you might want to do:
- buffering
- international encoding conversions
- line delimiters other than CR/LF/CRLF
- read/write Python objects (i.e. use pickle/marshal)
- easy interfaces to parsers

This took me a couple of hours to get used to (and at the time I thought 'Yuk!' when I first saw four nested constructors), but gives you very precise control and a lot of versatility when handling files. It's an idiom Python does not use much but maybe it should.

I'd argue that maybe some enhancements to fileinput.py - adding some streams to provide building blocks for these operations - would get us the power you want and a lot more versatility besides.

=====
Andy Robinson
Robinson Analytics Ltd.
------------------
My opinions are the official policy of Robinson Analytics Ltd. They just vary from day to day.

From mal at lemburg.com Wed Dec 29 17:55:21 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 29 Dec 1999 17:55:21 +0100
Subject: [Python-Dev] Better text processing support in py2k?
References: <19991229083443.27817.qmail@web6005.mail.yahoo.com>
Message-ID: <386A3CF9.8AF0EA60@lemburg.com>

Andy Robinson wrote:
>
> --- Skip Montanaro wrote:
> > fast/memory-intensive/clear
> > slow/memory-conserving/not-as-clear
> > fast/memory-conserving/fairly-muddy
> >
> > Any particular reason that the readline method can't
> > return an iterator that
> > supports __getitem__ and buffers input? (Again,
> > remember this is for py2k,
> > so the potential breakage such a change might cause
> > is a consideration, but
> > not a showstopper.)
>
> Why not generalize fileinput to do buffering instead?
>
> More generally, Java has the notion of 'stackable
> streams' - e.g. construct a 'BufferedFile' around a
> 'File', maybe construct a 'Line-oriented file' around
> that etc. Each one takes a file-like object as an
> argument to the constructor. Things you might want to
> do:
> - buffering
> - international encoding conversions
> - line delimiters other than CR/LF/CRLF
> - read/write Python objects (i.e. use pickle/marshal)
> - easy interfaces to parsers

If all goes well we'll have something like this in Python 1.6, at least for the encoding/decoding part of file reading and writing. You basically take a file object and then wrap some StreamCodecs around it to get the functionality you need. Very simple and very intuitive.

> This took me a couple of hours to get used to (and at
> the time I thought 'Yuk!' when I first saw four
> nested constructors), but gives you very precise
> control and a lot of versatility when handling files.
> It's an idiom Python does not use much but maybe it
> should.
>
> I'd argue that maybe some enhancements to fileinput.py
> - adding some streams to provide building blocks for
> these operations - would get us the power you want and
> a lot more versatility besides.

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: Get ready to party !
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From bckfnn at pipmail.dknet.dk Wed Dec 29 19:51:52 1999
From: bckfnn at pipmail.dknet.dk (Finn Bock)
Date: Wed, 29 Dec 1999 18:51:52 GMT
Subject: [Python-Dev] zipfile.py
In-Reply-To: <3857B97E.3684224F@interet.com>
References: <3857B97E.3684224F@interet.com>
Message-ID: <386a582d.6762574@pipmail.dknet.dk>

James C. Ahlstrom wrote:
> ftp://ftp.interet.com/pub/pylib.html

I feel that it smells a bit too much like a tool and too little like a general programming API.

- It can only add disk files. The ability to write data to a zip entry through a file-like object or from a string would make it more like an API, IMHO.
- Some kind of access to the TOC entry fields (date, size, compressed size etc) also seems like a nice feature.
- The data for an entry must be available in memory. Could be a problem for huge files, but most likely not in practical use.

I admit that I am fond of the API from java.util.zip.ZipFile and java.util.zip.ZipOutputStream.

Regards,
Finn Bock

From tim_one at email.msn.com Thu Dec 30 07:08:58 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 30 Dec 1999 01:08:58 -0500
Subject: [Python-Dev] Better text processing support in py2k?
In-Reply-To: <199912282141.PAA31426@dolphin.mojam.com>
Message-ID: <000001bf528c$5cbdb9a0$a02d153f@tim>

[Skip Montanaro, wants nicer text facilities]
> ...
> I rather suspect that more people use Python for some sort of
> text processing than any other single application domain.

Hmm. You're probably right, but I'm an exception.

> Python should be good at it.

And I guess I'm an exception mostly *because* Perl is better at easy text crunching and Icon is better at hard text-crunching -- that is, I use the right tool for the job.

> While I don't want to turn Python into Perl, I would like to see
> it do a better job of what most people probably use the language
> for. Here is a very short list of things I think need attention:
>
> 1. [*A* clear way to do memory- and time-efficient textfile
>    input]

I agree, but I'm unsure how to fix it. The best way to write this now is

    # f is some open file object.
    while 1:
        lines = f.readlines(BUFSIZE)
        if not lines:
            break
        for line in lines:
            process(line)

and it's not something anyone figures out on their own -- or enjoys typing or explaining afterwards. Perl gets its line-at-a-time speed by peeking and poking C FILE structs directly in compiler- and platform-specific ways -- ways that vendors *should* have done in their own fgets implementations, but almost never do.
I have no idea whether it works well with Perl's nascent notions of threading, but in the absence of that "the system" doesn't know Perl is cheating (i.e., as far as libc+friends are concerned, Perl *is* reading one line at a time -- even mixing in C-level ungetc calls works (well, sometimes <0.1 wink -- they don't always peek and poke enough fields>)). The Python QIO extension module is much easier to port but less compatible (it doesn't use stdio, so QIO-opened files don't play well with others) and slower (although that's likely repairable -- he's got two passes over the buffer where one hairier pass should suffice). > 2. The re module needs to be sped up, if not to catch up with > Perl, then to catch up with the deprecated regex module. The irony here is that the re engine is very often unboundedly faster than the regex engine -- provided you're chewing over large strings. Some tests /F ran showed that the length-independent *overhead* of invoking re is about 10x higher than for regex. Presumably the bulk of that is due to re.py, i.e. that you get to the re engine via going thru Python layers on your way in and out, while regex was pure C. In any case, /F is working on a new engine (for Unicode), and I believe he has this all well in hand. > Depending how far people want to go with things, adding some > language syntax to support regular expressions might be in order. > ... > 3. I've not yet used it, but I am told the pattern matching in > Marc-Andre Lemburg's mxTextTools > (http://starship.python.net/crew/lemburg/) > is both powerful and efficient (though it certainly appears > complex). Perhaps it deserves consideration for > incorporation into the core Python distribution. It's not complex, it's complicated -- and *that's* what makes it un-Pythonic . Tony Ibbs has written a friendly wrapper around mxTextTools that suppresses much of the non-essential complication. 
OTOH, if you go into this with a regexp mindset, it will run much slower than a real regexp package, because the bulk of the latter is devoted to doing optimization; mxTextTools is WYSIWYG (it screams if you code to its strengths, but crawls if you e.g. try to implement naive backtracking).

You should go to the REBOL site and look at the description of REBOL's PARSE verb in the FAQ ... mumble, mumble ... at

    http://www.rebol.com/faq.html#11550948

Here's an example pulled from that page (this is a REBOL code fragment):

    digit: charset "0123456789"
    expr: [term ["+" | "-"] expr | term]
    term: [factor ["*" | "/"] term | factor]
    factor: [primary "**" factor | primary]
    primary: [value | "(" expr ")"]
    value: [digit value | digit]
    parse "1 + 2 ** 9" expr

There hasn't been a pattern scheme this clean, convenient or powerful since SNOBOL4. It exploits REBOL's Forth-like (lack of!) syntax, and Smalltalk-like penchant for passing around thunks (anonymous closures -- "[...]" in REBOL builds a lexically-scoped entity called "a block", which can be treated as code (executed) or data (manipulated like a Python list) at will).

Now the example doesn't show this, but you can freely mix computations into the middle of the patterns; only *some* of the words in the blocks have special meaning to PARSE. The fragment above is already way beyond what can be accomplished with regexps, but that's just the start of it. Perl too is slamming in more & more ways to get user code to interact with its regexp engine.

So REBOL has a *very* nice approach to this; I believe it's unreasonably clumsy to mimic in Python primarily because of forward references (note e.g. that the block attached to "expr" above refers to "term" before the latter has been bound -- but the stuff inside [...] is just a closure so that doesn't matter -- it only matters that term gets bound before expr is *executed*). I hit a similar snag years ago when trying to mimic SNOBOL4's approach in Python.
Perl's endless abuse of regexps is making that language more absurd by the month.

The other major approach to mixing patterns with computation is due to Icon, another language where a regexp mindset is fatal. On a whim, I whipped up the attached, which illustrates a bit of the Icon approach in Pythonic terms (but without language support for generators, the *heart* of it can't really be captured).

Here's an example of how this could be used to implement (the simplest form of) string.split:

    def mysplit(str):
        s = Searcher(str)
        white = CharSet(" \t\n")
        result = []
        s.many(white)   # consume initial whitespace
        while s.notmany(white):   # consume non-whitespace
            result.append(s.get_match())
            s.many(white)
        return result

    >>> mysplit(" \t Hey, that's\tpretty\n\n neat! ")
    ['Hey,', "that's", 'pretty', 'neat!']
    >>>

The primary thing to note is that there's no seam between analyzing the string and doing computation on the partial results -- "the program is the pattern". This is what Icon does to perfection, Perl is moving toward, and REBOL is arriving at from a different direction. It's The Future <0.9 wink>.

Without generators it's difficult to work backtracking into the Searcher class, but, as above, in my experience the backtracking feature of regexps is rarely *needed*! For example, at various points "split" wants to suck up all the whitespace characters, and that's *it* -- the backtracking possibility in the regexp \s+ is often a bug just waiting for unexpected *context* to trigger it. A hairy regexp is pure hell; but what simpler regexps can do doesn't require all that funky regexp machinery.

BTW, the mxTextTools engine could be used to get blazing implementations of the primary Searcher methods (it excels at simple analysis). OTOH, making lots of calls to analyze short strings is slow.
The only clean solutions to that are Perl's and Icon's (build everything into one language so the compiler can optimize stuff away), and REBOL's (make no distinction between code and data, so that code can be analyzed & optimized at runtime -- and build the entire implementation around making closures and calls supernaturally fast).

the-less-you-use-regexps-the-less-you-miss-'em-ly y'rs - tim

class CharSet:
    def __init__(self, seq):
        self.seq = seq
        d = {}
        for ch in seq:
            d[ch] = 1
        self.haskey = d.has_key

    def __call__(self, ch):
        return self.haskey(ch)

    def __add__(self, other):
        if isinstance(other, CharSet):
            other = other.seq
        return CharSet(self.seq + other)

def _normalize_index(i, n):
    assert n >= 0
    if i >= 0:
        return min(i, n)
    elif n == 0:
        return 0
    # want smallest q s.t. i + q*n >= 0
    # <-> q*n >= -i
    # <-> q >= -i/n
    # so q = ceiling(-i/n) = -floor(i/n)
    return i - (i/n)*n

class Searcher:
    def __init__(self, str, lo=0, hi=None):
        """Create object to search in str[lo:hi].

        lo defaults to 0. hi defaults to len(str). len(str) is
        repeatedly added to negative lo or hi until reaching a number
        >= 0. If lo > hi, a uselessly empty slice will be searched.
        The search cursor is initialized to lo.
        """
        self.s = str
        self.lo = _normalize_index(lo, len(str))
        if hi is None:
            self.hi = len(str)
        else:
            self.hi = _normalize_index(hi, len(str))
        if self.lo > self.hi:
            self.hi = self.lo
        self.i = self.lo
        self.lastmatch = None, None

    def any(self, charset, consume=1):
        """Try to match single character in charset.

        Return true iff match succeeded. Advance cursor iff success
        and optional arg "consume" is true.
        """
        i = self.i
        if i < self.hi and charset(self.s[i]):
            if consume:
                self.__consume(i+1)
            return 1
        return 0

    def notany(self, charset, consume=1):
        """Try to match single character not in charset.

        Return true iff match succeeded. Advance cursor iff success
        and optional arg "consume" is true.
        """
        i = self.i
        if i < self.hi and not charset(self.s[i]):
            if consume:
                self.__consume(i+1)
            return 1
        return 0

    def many(self, charset, consume=1):
        """Try to match one or more characters in charset.

        Return true iff match succeeded. Advance cursor iff success
        and optional arg "consume" is true.
        """
        i, n, s = self.i, self.hi, self.s
        j = i
        while j < n and charset(s[j]):
            j = j+1
        if i < j:
            if consume:
                self.__consume(j)
            return 1
        return 0

    def notmany(self, charset, consume=1):
        """Try to match one or more characters not in charset.

        Return true iff match succeeded. Advance cursor iff success
        and optional arg "consume" is true.
        """
        i, n, s = self.i, self.hi, self.s
        j = i
        while j < n and not charset(s[j]):
            j = j+1
        if i < j:
            if consume:
                self.__consume(j)
            return 1
        return 0

    def match(self, str, consume=1):
        """Try to match string "str".

        Return true iff match succeeded. Advance cursor iff success
        and optional arg "consume" is true.
        """
        i = self.i
        j = i + len(str)
        if self.s[i:j] == str:
            if consume:
                self.__consume(j)
            return 1
        return 0

    def get_str(self):
        """Return subject string."""
        return self.s

    def get_lo(self):
        """Return low slice bound."""
        return self.lo

    def get_hi(self):
        """Return high slice bound."""
        return self.hi

    def get_pos(self):
        """Return current value of search cursor."""
        return self.i

    def get_match_indices(self):
        """Return slice indices of last "consumed" match."""
        return self.lastmatch

    def get_match(self):
        """Return last "consumed" matching substring."""
        i, j = self.lastmatch
        if i is None:
            # raise (not return) the error when nothing has been consumed yet
            raise ValueError("no match to return!")
        return self.s[i:j]

    def set_pos(self, pos, consume=1):
        """Set search cursor to new value.

        No return value. If optional arg "consume" is true, the last
        match is set to the slice between pos and the current cursor
        position.
        """
        p = _normalize_index(pos, len(self.s))
        if not self.lo <= p <= self.hi:
            raise ValueError("pos out of bounds: " + `pos`)
        if consume:
            self.__consume(p)
        else:
            self.i = p

    def move_pos(self, incr, consume=1):
        """Move the cursor by incr characters.

        No return value. If the new value is outside the slice bounds,
        it's clipped. If optional arg "consume" is true, the last match
        is set to the slice between the old and new cursor positions.
        """
        newi = self.i + incr
        if newi < self.lo:
            newi = self.lo
        elif newi > self.hi:
            newi = self.hi
        if consume:
            self.__consume(newi)
        else:
            self.i = newi

    def __consume(self, newi):
        i, j = self.i, newi
        if i > j:
            i, j = j, i
        self.lastmatch = i, j
        self.i = newi

From tim_one at email.msn.com Thu Dec 30 07:09:14 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 30 Dec 1999 01:09:14 -0500
Subject: [Python-Dev] Fixed-decimal types
In-Reply-To: <199912231944.OAA23337@eric.cnri.reston.va.us>
Message-ID: <000201bf528c$657c3080$a02d153f@tim>

[Guido]
> ...
> Not arguing for this interpretation, just indicating that doing
> fixed precision arithmetic right is hard.

It's not so much hard as it is arbitrary. The floating-point world is standardized now, but the fixed-point world remains a mish-mash of incompatible legacy schemes carried across generations of products for no reason other than product-specific compatibility. So despite that fixed-point has a specialty audience, whatever rules Python chooses will leave it incompatible with much of that audience's (mixed!) expectations.

If fixed-point is needed, and my FixedPoint.py isn't good enough (all other fixed point pkgs I've seen for Python were braindead), then it should be implemented such that developers can control both rounding and precision propagation.
I'll attach suitable kernels; they haven't been tested but any bugs discovered will be trivial to fix (there are no difficulties here, but typos are likely); the kernels supply the bulk of what's required, whether implemented in Python or C; various packages can wrap them to supply whatever policies they like; see FixedPoint.py for exact string<->FixedPoint and exact float->FixedPoint conversions; and that's the end of my involvement in fixed-point.

Python should certainly *not* add a "scale factor" to its current long implementation; fixed-point should be a distinct type, as scale-factor fiddling is clumsy and pervasive (long arithmetic is challenging enough to get correct and quick without this obfuscating distraction; and by leaving scale factors out of it, it's much easier to plug in alternative bigint implementations (like GMP)).

One other point: some people are going to want BCD (binary-coded decimal), which suffers the same mish-mash of legacy policies, but with a different data representation. The point is that many commercial applications spend much more time doing I/O conversions than arithmetic, and BCD accepts slow arithmetic (in the absence of special HW support) in return for fast scaling & I/O conversion.

Forgetting the database-heads for a moment, decimal *floating*-point is what calculators do, so that's what "real people" are most comfortable with. The IEEE-854 std (IEEE-754's younger and friendlier brother) specifies that completely. Add a means to boost "global" precision (a la REXX), and it's a powerful tool even for experts (benefits approximating those of unbounded rational arithmetic but with bounded & user-controllable expense).

can-never-have-too-many-numeric-types-but-always-have-
too-few-literal-notations-ly y'rs - tim

# Kernels for fixed-point decimal arithmetic.
# _add, _sub, _mul, _div all have arglist
#     n1, p1, n2, p2, p, round=DEFAULT_ROUND
# n1 and n2 are longs; p1, p2 and p ints >= 0.
# The inputs are exactly n1/10**p1 and n2/10**p2.
#
# The return value is the integer n such that n/10**p is the best
# approximation to the infinite-precision result. In other words, p1
# and p2 are the input precisions and p is the desired output
# precision, where precision is the # of digits *after* the decimal
# point.
#
# What "best approximation" means is determined by the round function.
# In many cases rounding isn't required, but when it is
#     round(top, bot)
# is returned. top and bot are longs, with bot > 0 guaranteed. The
# infinite-precision result is top/bot. round must return an integer
# (long) approximation to top/bot, using whichever rounding discipline
# you want. By default, IEEE round-to-nearest/even is used; see the
# _roundXXX functions for examples of suitable rounding functions.
#
# Note: The only code here that knows we're working in decimal is
# function _tento; simply change the "10L" in that to do fixed-point
# arithmetic in some other base.
#
# Example:
#
# >>> r7 = _div(1L, 0, 7L, 0, 20)  # 1/7
# >>> r7
# 14285714285714285714L
# >>> r5 = _div(1L, 0, 5L, 0, 20)  # 1/5
# >>> r5
# 20000000000000000000L
# >>> sum = _add(r7, 20, r5, 20, 20)  # 1/7 + 1/5 = 12/35
# >>> sum
# 34285714285714285714L
# >>> _mul(sum, 20, 35L, 0, 20)
# 1199999999999999999990L
# >>> _mul(sum, 20, 35L, 0, 18)
# 12000000000000000000L
# >>> _mul(sum, 20, 35L, 0, 0)
# 12L
# >>>

###################################################################
# Sample rounding functions.
###################################################################

# Round to minus infinity.
def _roundminf(top, bot):
    assert bot > 0
    return top / bot

# Round to plus infinity.
def _roundpinf(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    if r:
        q = q + 1
    return q

# IEEE nearest/even rounding (closest integer; in case of tie closest
# even integer).
def _roundne(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    c = cmp(r << 1, bot)  # c < 0 <-> r < bot/2, etc
    if c > 0 or (c == 0 and (q & 1) == 1):
        q = q + 1
    return q

# "Add a half and chop" rounding (remainder < 1/2 toward 0; remainder
# >= half away from 0).
def _roundhalf(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    c = cmp(r << 1, bot)  # c < 0 <-> r < bot/2, etc
    if c > 0 or (c == 0 and q >= 0):
        q = q + 1
    return q

# Round toward 0 (throw away remainder).
def _roundchop(top, bot):
    assert bot > 0
    q, r = divmod(top, bot)
    # answer is exactly q + r/bot; and 0 <= r < bot
    if r and q < 0:
        q = q + 1
    return q

###################################################################
# Kernels for + - * /.
###################################################################

DEFAULT_ROUND = _roundne

def _add(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    # (n1/10**p1 + n2/10**p2) * 10**p ==
    # (n1*10**(max-p1) + n2*10**(max-p2))/10**max * 10**p
    max = p1  # until proven otherwise
    if p1 < p2:
        n1 = n1 * _tento(p2 - p1)
        max = p2
    elif p2 < p1:
        n2 = n2 * _tento(p1 - p2)
    n3 = n1 + n2
    p3 = p - max
    if p3 > 0:
        n3 = n3 * _tento(p3)
    elif p3 < 0:
        n3 = round(n3, _tento(-p3))
    return n3

def _sub(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    return _add(n1, p1, -n2, p2, p, round)

def _mul(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    # (n1/10**p1 * n2/10**p2) * 10**p ==
    # (n1*n2)/10**(p1+p2) * 10**p
    n3 = n1 * n2
    p3 = p - p1 - p2
    if p3 > 0:
        n3 = n3 * _tento(p3)
    elif p3 < 0:
        n3 = round(n3, _tento(-p3))
    return n3

def _div(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
    assert p1 >= 0
    assert p2 >= 0
    assert p >= 0
    if n2 == 0:
        raise ZeroDivisionError("scaled integer")
    # (n1/10**p1 / n2/10**p2) * 10**p ==
    # (n1/n2) * 10**(p2-p1+p)
    p3 = p2 - p1 + p
    if p3 > 0:
        n1 = n1 * _tento(p3)
    elif p3 < 0:
        n2 = n2
* _tento(-p3) if n2 < 0: n1 = -n1 n2 = -n2 return round(n1, n2) def _tento(i, _cache={}): assert i >= 0 try: return _cache[i] except KeyError: answer = _cache[i] = 10L ** i return answer From fredrik at pythonware.com Thu Dec 30 12:05:45 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 30 Dec 1999 12:05:45 +0100 Subject: [Python-Dev] Better text processing support in py2k? References: <000001bf528c$5cbdb9a0$a02d153f@tim> Message-ID: <012d01bf52b5$d6a3cb00$f29b12c2@secret.pythonware.com> Tim Peters is back from his vacation: > > While I don't want to turn Python into Perl, I would like to see > > it do a better job of what most people probably use the language > > for. Here is a very short list of things I think need attention: > > > > 1. [*A* clear way to do memory- and time-efficient textfile > > input] > > I agree, but unsure how to fix it. The best way to write this now is > > # f is some open file object. > while 1: > lines = f.readlines(BUFSIZE) > if not lines: > break > for line in lines: > process(line) > > and it's not something anyone figures out on their own -- or enjoys typing > or explaining afterwards. > > Perl gets its line-at-a-time speed by peeking and poking C FILE structs > directly in compiler- and platform-specific ways -- ways that vendors > *should* have done in their own fgets implementations, but almost never do. > I have no idea whether it works well with Perl's nascent notions of > threading, but in the absence of that "the system" doesn't know Perl is > cheating (i.e., as far as libc+friends are concerned, Perl *is* reading one > line at a time -- even mixing in C-level ungetc calls works (well, sometimes > <0.1 wink -- they don't always peek and poke enough fields>)). 
> > The Python QIO extension module is much easier to port but less compatible > (it doesn't use stdio, so QIO-opened files don't play well with others) and > slower (although that's likely repairable -- he's got two passes over the > buffer where one hairier pass should suffice). we have something called SIO which uses memory mapping where possible, and just a more aggressive read-ahead for other cases. on a windows box, a traditional while/readline loop runs 3-5 times faster than before. with SRE instead of re, a while/readline/match loop runs up to 10 times faster than before. note that this is without *any* changes to the Python source code... > > 2. The re module needs to be sped up, if not to catch up with > > Perl, then to catch up with the deprecated regex module. > > The irony here is that the re engine is very often unboundedly faster than > the regex engine -- provided you're chewing over large strings. Some tests > /F ran showed that the length-independent *overhead* of invoking re is about > 10x higher than for regex. Presumably the bulk of that is due to re.py, > i.e. that you get to the re engine via going thru Python layers on your way > in and out, while regex was pure C. I've attached some old benchmarks. I think the current code base is a bit faster, but you get the idea. > In any case, /F is working on a new engine (for Unicode), and I believe he > has this all well in hand. with a little luck, the new module will replace both pcre and regex... not to mention that it's fairly easy to write your own front- end to the matching engine -- the expression parser and the compiler are both written in good old python. 
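
The figures that follow come from searching for a pattern in strings padded out to various lengths. For anyone who wants to repeat that kind of measurement today, here is a minimal sketch. It assumes current Python 3 with the standard `re` and `timeit` modules, and it is not the original sre_bench.py, whose source isn't shown in this thread. The helper names (`padded`, `time_search`) and the choice of padding the target on the left with "-" are assumptions made only for illustration.

```python
import re
import timeit


def padded(target, padding, fill="-"):
    """Return `target` preceded by `padding` fill characters (assumed layout)."""
    return fill * padding + target


def time_search(pattern, target, padding, repeat=100):
    """Time `repeat` searches of a precompiled pattern in a padded string."""
    compiled = re.compile(pattern)      # compile once, outside the timed loop
    text = padded(target, padding)
    if compiled.search(text) is None:   # sanity check: the pattern must match
        raise ValueError("pattern does not match the padded target")
    return timeit.timeit(lambda: compiled.search(text), number=repeat)


if __name__ == "__main__":
    cases = [("Python|Perl", "Perl"), (".*Python", "Python")]
    for pattern, target in cases:
        print("search for %s in %s ->" % (pattern, target))
        for padding in (0, 50, 1000):
            seconds = time_search(pattern, target, padding)
            print("  padding %5d: %.6f s" % (padding, seconds))
```

Compiling the pattern outside the timed call keeps the fixed per-call overhead separate from the length-dependent search cost, which is the distinction being drawn in the discussion above.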
$ python sre_bench.py

            0       5      50     250    1000    5000   25000
-----   -----   -----   -----   -----   -----   -----   -----
search for Python|Perl in Perl ->
sre8    0.007   0.008   0.010   0.010   0.020   0.073   0.349
sre16   0.007   0.007   0.008   0.010   0.020   0.075   0.353
re      0.097   0.097   0.101   0.103   0.118   0.175   0.480
regex   0.007   0.007   0.009   0.020   0.059   0.271   1.320

search for (Python|Perl) in Perl ->
sre8    0.007   0.007   0.007   0.010   0.020   0.074   0.344
sre16   0.007   0.007   0.008   0.010   0.020   0.074   0.347
re      0.110   0.104   0.111   0.115   0.125   0.184   0.559
regex   0.006   0.006   0.009   0.019   0.057   0.285   1.432

search for Python in Python ->
sre8    0.007   0.007   0.007   0.011   0.021   0.072   0.387
sre16   0.007   0.007   0.008   0.010   0.022   0.082   0.365
re      0.107   0.097   0.105   0.102   0.118   0.175   0.511
regex   0.009   0.008   0.010   0.018   0.036   0.139   0.708

search for .*Python in Python ->
sre8    0.008   0.007   0.008   0.011   0.021   0.079   0.379
sre16   0.008   0.008   0.008   0.011   0.022   0.075   0.402
re      0.102   0.108   0.119   0.183   0.400   1.545   7.284
regex   0.013   0.019   0.072   0.318   1.231   8.035  45.366

search for .*Python.* in Python ->
sre8    0.008   0.008   0.008   0.011   0.021   0.080   0.383
sre16   0.008   0.008   0.008   0.011   0.021   0.079   0.395
re      0.103   0.108   0.119   0.184   0.418   1.685   8.378
regex   0.013   0.020   0.073   0.326   1.264   9.961  46.511

search for .*(Python) in Python ->
sre8    0.007   0.008   0.008   0.011   0.021   0.077   0.378
sre16   0.007   0.008   0.008   0.011   0.021   0.077   0.444
re      0.108   0.107   0.134   0.240   0.637   2.765  13.395
regex   0.026   0.112   3.820  87.322  (skipped)

search for .*P.*y.*t.*h.*o.*n.* in Python ->
sre8    0.010   0.010   0.014   0.031   0.093   0.419   2.212
sre16   0.010   0.011   0.014   0.030   0.093   0.419   2.292
re      0.112   0.121   0.195   0.521   1.747   8.298  40.877
regex   0.026   0.048   0.248   1.148   4.550  24.720  ...

(searching for patterns in padded strings; sre8 is the sre engine
compiled for 8-bit characters, sre16 is the same engine compiled
for 16-bit characters)

From mal at lemburg.com  Thu Dec 30 12:52:50 1999
From: mal at lemburg.com (M.-A.
Lemburg) Date: Thu, 30 Dec 1999 12:52:50 +0100 Subject: [Python-Dev] Better text processing support in py2k? References: <000001bf528c$5cbdb9a0$a02d153f@tim> Message-ID: <386B4792.A551022A@lemburg.com> Tim Peters wrote: > > [Skip Montanaro, wants nicer text facilities] > > While I don't want to turn Python into Perl, I would like to see > > it do a better job of what most people probably use the language > > for. Here is a very short list of things I think need attention: > > > > 1. [*A* clear way to do memory- and time-efficient textfile > > input] > > ... > > The Python QIO extension module is much easier to port but less compatible > (it doesn't use stdio, so QIO-opened files don't play well with others) and > slower (although that's likely repairable -- he's got two passes over the > buffer where one hairier pass should suffice). What is QIO ? > > Depending how far people want to go with things, adding some > > language syntax to support regular expressions might be in order. > > ... > > 3. I've not yet used it, but I am told the pattern matching in > > Marc-Andre Lemburg's mxTextTools > > (http://starship.python.net/crew/lemburg/) > > is both powerful and efficient (though it certainly appears > > complex). Perhaps it deserves consideration for > > incorporation into the core Python distribution. > > It's not complex, it's complicated -- and *that's* what makes it un-Pythonic > . Tony Ibbs has written a friendly wrapper around mxTextTools that > suppresses much of the non-essential complication. OTOH, if you go into > this with a regexp mindset, it will run much slower than a real regexp > package, because the bulk of the latter is devoted to doing optimization; > mxTextTools is WYSIWYG (it screams if you code to its strengths, but crawls > if you e.g. try to implement naive backtracking). All true. mxTextTools provides the tools, not the magic. 
But this is also its strength: you can optimize the hell out of your particular parsing requirement without having to think about how the RE optimizer works. > You should go to the REBOL site and look at the description of REBOL's PARSE > verb in the FAQ ... mumble, mumble ... at > > http://www.rebol.com/faq.html#11550948 > > Here's an example pulled from that page (this is a REBOL code fragment): > > digit: charset "0123456789" > expr: [term ["+" | "-"] expr | term] > term: [factor ["*" | "/"] term | factor] > factor: [primary "**" factor | primary] > primary: [value | "(" expr ")"] > value: [digit value | digit] > > parse "1 + 2 ** 9" expr > > There hasn't been a pattern scheme this clean, convenient or powerful since > SNOBOL4. It exploits REBOL's Forth-like (lack of!) syntax, and > Smalltalk-like penchant for passing around thunks (anonymous closures -- > "[...]" in REBOL builds a lexically-scoped entity called "a block", which > can be treated as code (executed) or data (manipulated like a Python list) > at will). Looks nice indeed, but how does executable code fit into that definition ? (mxTextTools allows you to write your own parsing elements in Python, BTW; it should be possible to use those mechanisms to achieve a similar intergration.) > ... > > BTW, the mxTextTools engine could be used to get blazing implementations of > the primary Searcher methods (it excels at simple analysis). OTOH, making > lots of calls to analyze short strings is slow. That's why mxTextTools converts these search idioms into byte codes which it executes at C level. Some future version will even "precompile" the tuple input and then omit the type checks during the search... that should give another noticeable speedup. Note that recursion etc. can be done at C level too -- Python function calls are not needed. 
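
For readers who don't know REBOL, the grammar quoted above can be read as an ordinary recursive-descent recognizer. Below is a minimal sketch in current Python 3. It assumes that whitespace in the input is simply ignored and that each rule tries its alternatives in the order written, falling back to the shorter one on failure. The function names are made up for illustration, and the code mirrors neither REBOL's PARSE implementation nor mxTextTools.

```python
DIGITS = "0123456789"


def _lit(s, i, token):
    """Return the index just past `token` if it occurs at s[i:], else None."""
    return i + len(token) if s.startswith(token, i) else None


def _value(s, i):
    # value: [digit value | digit]  -- i.e. one or more digits
    j = i
    while j < len(s) and s[j] in DIGITS:
        j += 1
    return j if j > i else None


def _primary(s, i):
    # primary: [value | "(" expr ")"]
    j = _value(s, i)
    if j is not None:
        return j
    j = _lit(s, i, "(")
    if j is not None:
        j = _expr(s, j)
        if j is not None:
            return _lit(s, j, ")")
    return None


def _chain(s, i, operand, operators, rest):
    """Shared shape of expr/term/factor: operand [op rest] | operand."""
    j = operand(s, i)
    if j is None:
        return None
    for op in operators:
        k = _lit(s, j, op)
        if k is not None:
            k = rest(s, k)
            if k is not None:
                return k
    return j  # fall back to the bare operand


def _factor(s, i):
    # factor: [primary "**" factor | primary]
    return _chain(s, i, _primary, ("**",), _factor)


def _term(s, i):
    # term: [factor ["*" | "/"] term | factor]
    return _chain(s, i, _factor, ("*", "/"), _term)


def _expr(s, i):
    # expr: [term ["+" | "-"] expr | term]
    return _chain(s, i, _term, ("+", "-"), _expr)


def parse_expr(text):
    """True if the whole of `text` (whitespace removed) matches `expr`."""
    s = "".join(text.split())
    return _expr(s, 0) == len(s)


if __name__ == "__main__":
    print(parse_expr("1 + 2 ** 9"))
```

The three operator rules share one helper because they have the same shape. In the REBOL original that sharing is not needed, since a rule there is just a block that can be passed around like any other value.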
> The only clean solutions to
> that are Perl's and Icon's (build everything into one language so the
> compiler can optimize stuff away), and REBOL's (make no distinction between
> code and data, so that code can be analyzed & optimized at runtime -- and
> build the entire implementation around making closures and calls
> supernaturally fast).

Just for kicks, here is the mysplit() function using mxTextTools:

from mx.TextTools import *

table = (
    # Match all whitespace
    (None,AllInSet,whitespace_set,+1),
    # Match and tag all non-whitespace
    ('text',AllInSet + AppendMatch,nonwhitespace_set,+1),
    # Loop until EOF
    (None,EOF,Here,-2),
    )

def mysplit(text):

    return tag(text,table)[1]

The timings:
 mysplit: 5.84 sec.
 string.split: 3.62 sec.

Note that you can customize the above to split text at any
character set you like, not just whitespace... without
compiling or writing C code. The function mx.TextTools.setsplit()
provides this functionality as a pure C function.

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: Get ready to party !
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From jim at interet.com  Thu Dec 30 15:21:36 1999
From: jim at interet.com (James C. Ahlstrom)
Date: Thu, 30 Dec 1999 09:21:36 -0500
Subject: [Python-Dev] zipfile.py
References: <3857B97E.3684224F@interet.com> <386a582d.6762574@pipmail.dknet.dk>
Message-ID: <386B6A70.3C9A0042@interet.com>

Finn Bock wrote:
>
> James C. Ahlstrom wrote:
>
> > ftp://ftp.interet.com/pub/pylib.html
>
> I feel that it smells a bit too much like a tool and too little like a general
> programming api.

It was meant to be an API except for writepy(), which
is clearly a tool.

> - It can only add disk files. The ability to write data to a zip entry through
>   a file-like object or from a string would make it more like an API, IMHO

I could add a method
    writestr(self, string, year, month, day, hour, minute, second, ...)
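
As a point of comparison, the zipfile module in today's standard library has a writestr() that takes the timestamp fields bundled in a ZipInfo object rather than as separate arguments. Here is a small sketch, assuming current Python 3; whether that interface matches the draft under discussion is not something this thread establishes. The helper names `zip_from_strings` and `list_entries` are hypothetical.

```python
import io
import zipfile


def zip_from_strings(entries, date_time=(1999, 12, 30, 9, 21, 36)):
    """Build an in-memory zip archive from {name: bytes} without disk files."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in entries.items():
            # ZipInfo carries the fields that would normally come from the
            # file on disk: here only the name and the modification time.
            info = zipfile.ZipInfo(name, date_time=date_time)
            zf.writestr(info, data)  # stored uncompressed by default
    return buf.getvalue()


def list_entries(archive_bytes):
    """Return (name, date_time, size, compressed size) for each entry."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return [(i.filename, i.date_time, i.file_size, i.compress_size)
                for i in zf.infolist()]


if __name__ == "__main__":
    archive = zip_from_strings({"hello.txt": b"hello world"})
    print(list_entries(archive))
```

The second helper also touches the other point raised here: per-entry fields such as date, size and compressed size are read as plain attributes rather than through getter methods.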
There are a lot of fields required which usually come from the file.

> - Some kind of access to the TOC entry fields (date, size, compressed
>   size etc) also seems like a nice feature.

This access is provided directly by self.TOC, and the fields are
documented.

> - The data for an entry must be available in memory. Could be a problem
>   for huge files, but most likely not in practical use.

I agree, but adding loops will make it slower. What do others think?

> I admit that I am fond of the api from java.util.zip.ZipFile and
> java.util.zip.ZipOutputStream.

I don't know this API. If writestr() is not sufficient, what
API would you like?

JimA

From bckfnn at pipmail.dknet.dk  Thu Dec 30 20:14:14 1999
From: bckfnn at pipmail.dknet.dk (Finn Bock)
Date: Thu, 30 Dec 1999 19:14:14 GMT
Subject: [Python-Dev] zipfile.py
In-Reply-To: <386B6A70.3C9A0042@interet.com>
References: <3857B97E.3684224F@interet.com> <386a582d.6762574@pipmail.dknet.dk> <386B6A70.3C9A0042@interet.com>
Message-ID: <386baec9.2867733@pipmail.dknet.dk>

[I wrote]
> - It can only add disk files. The ability to write data to a zip entry through
>   a file-like object or from a string would make it more like an API, IMHO

[JimA wrote]
>I could add a method
>    writestr(self, string, year, month, day, hour, minute, second, ...)
>There are a lot of fields required which usually come from the file.

Something like that seems fine to me.

[I wrote]
> - Some kind of access to the TOC entry fields (date, size, compressed
>   size etc) also seems like a nice feature.

[JimA answers]
>This access is provided directly by self.TOC, and the fields are
>documented.

Good enough. My bad, I was looking for getter methods. (me being a java
dude)

[I wrote]
> I admit that I am fond of the api from java.util.zip.ZipFile and
> java.util.zip.ZipOutputStream.

[JimA asks]
>I don't know this API. If writestr() is not sufficient, what
>API would you like?

This is only meant as a source for inspiration, certainly not as a request
for change.
writestr would answer my complaint nicely. Below, only one ZipEntry can be actively read or written to at a time. All the small details of performance and implementation complexity are ignored. class ZipFile: def getEntry(name): ... self.activeentry = ZipEntry(name) return self.activeentry class ZipEntry: #enough methods and fields to fake file-ness to casual users like me. def write(list): ... def writelines(str): ... def read(size=None): ... def readlines(sizehint=-1): ... def seek(offset): ... def flush(): ... def close(str): ... def getSize(): .... def getCompressedSize(): .... def getFlags(): .... regards, finn From tim_one at email.msn.com Fri Dec 31 04:35:18 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 30 Dec 1999 22:35:18 -0500 Subject: [Python-Dev] Better text processing support in py2k? In-Reply-To: <386B4792.A551022A@lemburg.com> Message-ID: <000001bf5340$0fb20300$e12d153f@tim> [M.-A. Lemburg] > What is QIO ? See DejaNews (I don't save URLs). "Quick" line-oriented text input adapted from INN. Someone rewrote that as a Python extension module. >> http://www.rebol.com/faq.html#11550948 > Looks nice indeed, but how does executable code fit into > that definition ? See the URL above I didn't save . PARSE's "pattern" argument is a block. Blocks can be (& often are) nested. Whether any given block is code or data is all the same to REBOL, so passing nested code blocks in PARSE's pattern argument is easy. Because blocks are lexically scoped, assignments (etc) inside a block are (well, can be) visible to its context; etc. It's a very Lispish approach. REBOL is essentially Scheme under the covers, but with syntax much more like Forth's (whitespace-separated strings of arbitrary non-whitespace characters, with few pre-assigned meanings or restrictions -- in fact, it's impossible for a compiler to determine where a REBOL function call begins or ends! can't be known until runtime). 
> (mxTextTools allows you to write your own parsing elements > in Python, BTW; it should be possible to use those mechanisms > to achieve a similar intergration.) It can't capture the flavor -- although I don't know that it needs to . There's no distinction between "the pattern language" and "the computational language" in REBOL or Icon, and it's hard to explain what a maddening distinction that can be once you've lived without it. mxTextTools embedding would feel more like Icon, where the matching engine is fully exposed to the programmer (REBOL hides it, allowing only "approved" interactions). >> OTOH, making lots of calls to analyze short strings is slow. > That's why mxTextTools converts these search idioms into byte > codes which it executes at C level. Some future version will > even "precompile" the tuple input and then omit the type checks > during the search...that should give another noticeable speedup. > Note that recursion etc. can be done at C level too -- Python > function calls are not needed. That's also the curse of having distinct languages; e.g., Python already had recursion, but you needed to reimplement it in a different way with different syntax and different rules in your pattern language. In Icon etc, there's no difference between a recursive pattern and a recursive function, except in *what* it computes. The machinery is all the same, and both more powerful and easier to learn because of that. > ... > Just for kicks, here is the mysplit() function using mxTextTools: > > from mx.TextTools import * > > table = ( > # Match all whitespace > (None,AllInSet,whitespace_set,+1), > # Match and tag all non-whitespace > ('text',AllInSet + AppendMatch,nonwhitespace_set,+1), > # Loop until EOF > (None,EOF,Here,-2), > ) > > def mysplit(text): > > return tag(text,table)[1] > > The timings: > mysplit: 5.84 sec. > string.split: 3.62 sec. > > Note that you can customize the above to split text at any > character set you like, not just whitespace... 
without > compiling or writing C code. That's equally true of the example I posted . Now what if I wanted to stop splitting right after I find a keyword, recognized as such because it's a key in some passed-in dictionary? In my example, I make an obvious local code change, from while s.notmany(white): # consume non-whitespace result.append(s.get_match()) s.many(white) to while s.notmany(white): # consume non-whitespace word = s.get_match() result.append(word) if dictionary.has_key(word): break s.many(white) What does it do to your example? Or what if the target string isn't "a string" (the code I posted only assumes the "str" object responds to indexing and slicing -- any buffer object is fine -- so my example doesn't change at all)? Or what if you need to pass the tokens on as they're found, pipeline style? Etc. This is why I do complex string processing in Icon <0.9 wink>. OTOH, at what it does well, mxTextTools runs quicker than Icon. Its biggest problem has always been that e.g. nobody knows what the hell (None,EOF,Here,-2), *means* at first glance -- or third . an-extreme-on-the-transparency-vs-speed-curve-ly y'rs - tim From mal at lemburg.com Fri Dec 31 12:18:57 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 31 Dec 1999 12:18:57 +0100 Subject: [Python-Dev] Better text processing support in py2k? References: <000001bf5340$0fb20300$e12d153f@tim> Message-ID: <386C9121.E9D9DC01@lemburg.com> Tim Peters wrote: > > [M.-A. Lemburg] > > What is QIO ? > > See DejaNews (I don't save URLs). "Quick" line-oriented text input adapted > from INN. Someone rewrote that as a Python extension module. Ok, thanks. > >> http://www.rebol.com/faq.html#11550948 > > > Looks nice indeed, but how does executable code fit into > > that definition ? > > See the URL above I didn't save . PARSE's "pattern" argument is a > block. Blocks can be (& often are) nested. 
Whether any given block is code > or data is all the same to REBOL, so passing nested code blocks in PARSE's > pattern argument is easy. Because blocks are lexically scoped, assignments > (etc) inside a block are (well, can be) visible to its context; etc. It's a > very Lispish approach. REBOL is essentially Scheme under the covers, but > with syntax much more like Forth's (whitespace-separated strings of > arbitrary non-whitespace characters, with few pre-assigned meanings or > restrictions -- in fact, it's impossible for a compiler to determine where a > REBOL function call begins or ends! can't be known until runtime). If I understand the concept correctly, I think Python could do pretty much the same thing. The bummer is of course the need for new keywords and byte codes (although these could be split out into a separate text scanning engine). Using Python function calls would slow down things to an extent that would render the added functionality useless, well IMHO anyways ;-) > > (mxTextTools allows you to write your own parsing elements > > in Python, BTW; it should be possible to use those mechanisms > > to achieve a similar intergration.) > > It can't capture the flavor -- although I don't know that it needs to > . There's no distinction between "the pattern language" and "the > computational language" in REBOL or Icon, and it's hard to explain what a > maddening distinction that can be once you've lived without it. mxTextTools > embedding would feel more like Icon, where the matching engine is fully > exposed to the programmer (REBOL hides it, allowing only "approved" > interactions). Of course its hard for a Turing Machine to capture the flavor of any high level language :-) When you're programming the mxTextTools Tagging Engine directly you feel like writing assembler... but things are moving in the right direction: Tony Ibbs has a nice meta-language and M.C. Fletcher his SimpleParse to cover up these insufficiencies. 
> >> OTOH, making lots of calls to analyze short strings is slow. > > > That's why mxTextTools converts these search idioms into byte > > codes which it executes at C level. Some future version will > > even "precompile" the tuple input and then omit the type checks > > during the search...that should give another noticeable speedup. > > Note that recursion etc. can be done at C level too -- Python > > function calls are not needed. > > That's also the curse of having distinct languages; e.g., Python already had > recursion, but you needed to reimplement it in a different way with > different syntax and different rules in your pattern language. In Icon etc, > there's no difference between a recursive pattern and a recursive function, > except in *what* it computes. The machinery is all the same, and both more > powerful and easier to learn because of that. Agreed. > > ... > > Just for kicks, here is the mysplit() function using mxTextTools: > > > > from mx.TextTools import * > > > > table = ( > > # Match all whitespace > > (None,AllInSet,whitespace_set,+1), > > # Match and tag all non-whitespace > > ('text',AllInSet + AppendMatch,nonwhitespace_set,+1), > > # Loop until EOF > > (None,EOF,Here,-2), > > ) > > > > def mysplit(text): > > > > return tag(text,table)[1] > > > > The timings: > > mysplit: 5.84 sec. > > string.split: 3.62 sec. > > > > Note that you can customize the above to split text at any > > character set you like, not just whitespace... without > > compiling or writing C code. > > That's equally true of the example I posted . Now what if I wanted to > stop splitting right after I find a keyword, recognized as such because it's > a key in some passed-in dictionary? 
In my example, I make an obvious local
> code change, from
>
>     while s.notmany(white):  # consume non-whitespace
>         result.append(s.get_match())
>         s.many(white)
>
> to
>
>     while s.notmany(white):  # consume non-whitespace
>         word = s.get_match()
>         result.append(word)
>         if dictionary.has_key(word):
>             break
>         s.many(white)
>
> What does it do to your example?

You'd replace the 'text' tagobj with a callable object and
write AllInSet + CallTag as command. The Tagging Engine will then
call the object with arguments (taglist,text,l,r,subtags) and let
it decide what to do. In your example it would check the
dictionary and raise an exception in case a keyword is found to
stop any further scanning. If it's not a keyword, it would simply
append the found string to the taglist and return None.

Here's the code:

from mx.TextTools import *
import exceptions

stoplist = {'abc':1, 'def':1}

class KeywordFound(exceptions.StandardError):

    def __init__(self, taglist):
        self.taglist = taglist

def callable(taglist,text,l,r,subtags):

    taglist.append(text[l:r])
    if stoplist.has_key(text[l:r]):
        raise KeywordFound(taglist)

table = (
    # Match all whitespace
    (None,AllInSet,whitespace_set,+1),
    # Match and tag all non-whitespace
    (callable,AllInSet + CallTag,nonwhitespace_set,+1),
    # Loop until EOF
    (None,EOF,Here,-2),
    )

def mysplitex(text):

    try:
        return tag(text,table)[1]
    except KeywordFound,data:
        return data.taglist

> Or what if the target string isn't "a
> string" (the code I posted only assumes the "str" object responds to
> indexing and slicing -- any buffer object is fine -- so my example doesn't
> change at all)?

The current version only handles string objects, but I am already
beginning to convert all the APIs in mxTextTools to "s#" or "t#" style
(can't decide which to use... "s#" is great for processing raw data,
while "t#" more closely refers to text processing).

> Or what if you need to pass the tokens on as they're found,
> pipeline style? Etc.
This is why I do complex string processing in Icon > <0.9 wink>. You can have all that extra magic via callable tag objects or callable matching functions. It's not exactly nice to write, but I'm sure that a meta-language could do the conversions for you. > OTOH, at what it does well, mxTextTools runs quicker than Icon. Its biggest > problem has always been that e.g. nobody knows what the hell > > (None,EOF,Here,-2), > > *means* at first glance -- or third . The structure of those tag tables is very simple: (tagobject, command, argument[, jump offset in case of failure [, jump offset in case of success]]) Please remember that this is byte code, not some higher level abstraction. The design is very much inverted from what you'd usually do: design a nice language and then try to find suitable set of byte codes to make it work as intended. Anyway, I'll keep focussing on the speed aspect of mxTextTools; others can focus on abstractions, so that eventually everybody will be happy :-) Happy New Year, -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: Get ready to party ! Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From tim_one at email.msn.com Fri Dec 31 23:53:49 1999 From: tim_one at email.msn.com (Tim Peters) Date: Fri, 31 Dec 1999 17:53:49 -0500 Subject: [Python-Dev] Better text processing support in py2k? 
In-Reply-To: <012d01bf52b5$d6a3cb00$f29b12c2@secret.pythonware.com> Message-ID: <000701bf53e1$e7119760$472d153f@tim> [Fredrik Lundh, whose very nice eMatter book is on sale until the end of the 20th century (as real people think of it), although the eMatter distribution scheme has lots of problems [just an editorial note from a bot who has to-- for unknown reasons Fatbrain "is working on" --delete the Fatbrain registry tree and reregister the book almost every time he tries to open it ] ] > we have something called SIO which uses memory mapping > where possible, and just a more aggressive read-ahead for > other cases. on a windows box, a traditional while/readline > loop runs 3-5 times faster than before. with SRE instead of > re, a while/readline/match loop runs up to 10 times faster > than before. > > note that this is without *any* changes to the Python > source code... If so, there's potential for significantly more speed. Python does its line-at-a-time input with a character-at-a-time macro-in-a-loop, the same way naive vendors (read "almost all vendors") implement fgets. It's replacing that inner loop with direct peeking into the FILE buffer that gets Perl its dramatic speed -- despite that Perl has fancier input functionality (the oft-requested automagical "input record separator"). So it sounds like the Perl trick is orthogonal to SIO's tricks; Perl isn't doing mmaps or read-aheads or anything else fancy under the covers -- it only optimizes the inner loop! > ... > with a little luck, the new module will replace both pcre > and regex... If something more tangible than luck would help to make this come true, feel free to mention it . > not to mention that it's fairly easy to write your own front- > end to the matching engine -- the expression parser and the > compiler are both written in good old python. Ah, good news / bad news. 
Perl refugees aren't accustomed to "precompiling" regexp objects, so write code that will cause regexps to get recompiled over & over. Even if you cache the results under the covers, the overhead of the Python call to the regexp compiler will likely take as long as the engine takes to search. Personally, in such cases, I think they should learn how to use the language <0.5 wink>. From tim_one at email.msn.com Fri Dec 31 23:53:56 1999 From: tim_one at email.msn.com (Tim Peters) Date: Fri, 31 Dec 1999 17:53:56 -0500 Subject: [Python-Dev] Better text processing support in py2k? In-Reply-To: <386C9121.E9D9DC01@lemburg.com> Message-ID: <000901bf53e1$eb4248c0$472d153f@tim> >> This is why I do complex string processing in Icon <0.9 wink>. [MAL] > You can have all that extra magic via callable tag objects > or callable matching functions. It's not exactly nice to > write, but I'm sure that a meta-language could do the > conversions for you. That wasn't my point: I do it in Icon because it *is* "exactly nice to write", and doesn't require any yet-another meta-language. It's all straightforward, in a way that separate schemes pasted together can never be (simply because they *are* "separate schemes pasted together" ). The point of my Python examples wasn't that they could do something mxTextTools can't do, but that they were *Python* examples: every variation I mentioned (or that you're likely to think of) was easy to handle for any Python programmer because the "control flow" and "data type" etc aspects could be handled exactly the way they always are in *non* pattern-matching Python code too, rather than recoded in pattern-scheme-specific different ways (e.g., where I had a vanailla "if/break", you set up a special exception to tickle the matching engine). I'm not attacking mxTextTools, so don't feel compelled to defend it -- people using regexps in those examples are dead in the water. 
mxTextTools is very good at what it does; if we have a real
disagreement, it's probably that I'm less optimistic about the prospects
for higher-level wrappers (e.g., MikeF's SimpleParse is much slower than
"a real" BNF parsing system (ARBNFPS), in part because he isn't doing
all the optimizations ARBNFPS does, but also in part because ARBNFPS
uses an underlying engine more optimized to its specific task than
mxTextTools' more-general engine *can* be). So I don't see mxTextTools
as being the answer to everything -- and if you hadn't written it, you
would agree with that on first glance.

> Anyway, I'll keep focussing on the speed aspect of mxTextTools;
> others can focus on abstractions, so that eventually everybody
> will be happy :-)

You and I will be, anyway.
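
The scanner-style split that keeps coming up in this exchange (the `s.notmany(white)` / `s.many(white)` loop) is only quoted in fragments, so here is a minimal sketch of what such an object could look like in current Python 3. The `Scanner` class below is hypothetical: the method names are taken from the quoted loop, but their behavior is assumed rather than copied from the code originally posted, which isn't shown in this part of the thread.

```python
import string


class Scanner:
    """Hypothetical cursor over a str; many/notmany consume runs of characters."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self._match = ""

    def _run(self, charset, want_member):
        # Advance while membership in `charset` equals `want_member`.
        start = self.pos
        end = len(self.text)
        while self.pos < end and (self.text[self.pos] in charset) == want_member:
            self.pos += 1
        self._match = self.text[start:self.pos]
        return self.pos > start  # true only if at least one character matched

    def many(self, charset):
        """Consume one or more characters that are in `charset`."""
        return self._run(charset, True)

    def notmany(self, charset):
        """Consume one or more characters that are not in `charset`."""
        return self._run(charset, False)

    def get_match(self):
        """Return the text consumed by the most recent many/notmany call."""
        return self._match


def mysplit(text, stopwords=None, white=string.whitespace):
    """Split on whitespace; stop right after the first word in `stopwords`."""
    s = Scanner(text)
    result = []
    s.many(white)                      # skip leading whitespace
    while s.notmany(white):            # consume non-whitespace
        word = s.get_match()
        result.append(word)
        if stopwords is not None and word in stopwords:
            break                      # the "obvious local code change"
        s.many(white)
    return result


if __name__ == "__main__":
    print(mysplit("  split me  at whitespace "))
    print(mysplit("stop after abc please", {"abc": 1}))
```

The keyword variant is an ordinary `if`/`break` inside the same loop, which is the contrast being drawn with the exception-raising callable in the tag-table version.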