From dubois1 at llnl.gov Thu Apr 2 01:44:43 1998
From: dubois1 at llnl.gov (Paul F. Dubois)
Date: Wed, 1 Apr 1998 15:44:43 -0800
Subject: [C++-SIG] Fw: the dynamic load mystery...
Message-ID: <000801bd5dc8$25e74dc0$998a7380@pduboispc.llnl.gov>

I've been trying to get my recent C++ work to work on an HPUX-10 system. It
compiles and makes a dynamic library just fine; or you can build it into a
static application. In the static case, everything is fine. In the dynamic
case, python 1.5 complains that the shared library does not contain the
module's init routine, named initexample. But, as you'll see below, it does.

In an effort to diagnose this, we took the spammodule.c example, changed
spammodule.c to spammodule.cxx, adding the
extern "C" statement for initspam, and made it on an HP using KCC -x and
cc -Ae. Works fine: import spam succeeds.

Then we changed the name initspam everywhere to initexample, and renamed the
file example.cxx. Works fine: import example succeeds.

Copied this example.cxx into the CXX/Demo directory and rebuilt there.
Works fine.

Added one line to example.cxx: #include "CXX_Objects.h".
Now we get the failing behavior. nm reveals

laura[63] nm *.sl | grep initexample
00032d08 T __sti____Demo_example_cxx_initexample
00032cf0 T __sti____Demo_example_cxx_initexample
000316f4 T initexample
00031694 T initexample

Demo/example.cxx is the file name in which initexample resides. I wonder if
this extra initexample inside a name is confusing some part of the dynamic
load.

Does any of this ring a bell with anyone?

From hinsen at ibs.ibs.fr Thu Apr 2 12:45:04 1998
From: hinsen at ibs.ibs.fr (Konrad Hinsen)
Date: Thu, 2 Apr 1998 12:45:04 +0200
Subject: [C++-SIG] Re: [PSA MEMBERS] Fw: the dynamic load mystery...
In-Reply-To: <000801bd5dc8$25e74dc0$998a7380@pduboispc.llnl.gov> (dubois1@llnl.gov)
Message-ID: <199804021045.MAA19352@lmspc1.ibs.fr>

> static application. In the static case, everything is fine. In the dynamic
> case, python 1.5 complains that the shared library does not contain the
> module's init routine, named initexample. But, as you'll see below, it does.

Sounds familiar. I have exactly that problem with plain C extensions,
e.g. NumPy, both with HP/UX 10 and HP/UX 9. And it works fine with Python 1.4.
But it doesn't occur with all extensions, and to make things worse
it depends on what other dynamic modules have been imported before.
For example:

   import multiarray

doesn't work, but

   import time
   import multiarray

does. So does

   import time
   import multiarray
   import umath

but not

   import time
   import umath

For the moment, our HP's are still at Python 1.4 for this reason.
Unfortunately, I don't have the time to figure out what's going wrong.
I suppose that comparing the Makefiles for 1.4 and 1.5 should produce
some significant difference.
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                          | E-Mail: hinsen at ibs.ibs.fr
Laboratoire de Dynamique Moleculaire   | Tel.: +33-4.76.88.99.28
Institut de Biologie Structurale       | Fax:  +33-4.76.88.54.94
41, av. des Martyrs                    | Deutsch/Esperanto/English/
38027 Grenoble Cedex 1, France         | Nederlands/Francais
-------------------------------------------------------------------------------

From neal at ctd.comsat.com Thu Apr 2 18:57:02 1998
From: neal at ctd.comsat.com (Neal Becker)
Date: 02 Apr 1998 11:57:02 -0500
Subject: [C++-SIG] Fw: the dynamic load mystery...
In-Reply-To: "Paul F. Dubois"'s message of "Wed, 1 Apr 1998 15:44:43 -0800"
References: <000801bd5dc8$25e74dc0$998a7380@pduboispc.llnl.gov>
Message-ID:

    Paul> laura[63] nm *.sl | grep initexample
    Paul> 00032d08 T __sti____Demo_example_cxx_initexample
    Paul> 00032cf0 T __sti____Demo_example_cxx_initexample
    Paul> 000316f4 T initexample
    Paul> 00031694 T initexample

I very much doubt it. Look at any hpux .sl file. They all look like that.

From dubois1 at llnl.gov Thu Apr 2 20:47:09 1998
From: dubois1 at llnl.gov (Paul F. Dubois)
Date: Thu, 2 Apr 1998 10:47:09 -0800
Subject: [C++-SIG] CXX-a3
Message-ID: <000f01bd5e67$be1791a0$998a7380@pduboispc.llnl.gov>

CXX-a3.zip and .exe are at ftp-icf.llnl.gov/pub/python.

Changed namespace name to Py to match work being done by Furnish.
Changed PythonExtension::methods() to api() and added a new methods() to
hold a method table for additional, non-api methods.
Added new methods to the Demo/ "r" object and R class, and exercised the new
facilities in rtest.cxx.
Misc. small improvements.

From dubois1 at llnl.gov Fri Apr 3 20:08:58 1998
From: dubois1 at llnl.gov (Paul F. Dubois)
Date: Fri, 3 Apr 1998 10:08:58 -0800
Subject: [C++-SIG] CXX-a4 available at ftp-icf.llnl.gov/pub/python
Message-ID: <005c01bd5f2b$93538a40$998a7380@pduboispc.llnl.gov>

I have started coordinating with Geoff Furnish, who is hard at work on the
trampoline. Geoff and I have made a number of naming changes as follows:

Former macros Py_Null and Nothing are now inline functions Null() and
Nothing(). This eliminates all macros so that the namespaces can do their
job properly.

Name change edits:
CXX           -> Py
PyException_  ->              // delete
PyException   -> Exception
Py_Null       -> Null ()
Nothing       -> Nothing ()

Signature change: Exception::clear() is now not const. Technically, it
doesn't alter the Exception object, but morally, it does. Thus to catch an
exception you may wish to clear, catch it as
    catch (PyException& e) { ...; e.clear(); ...}
For one you don't intend to clear, catch it as
    catch (const PyException&) {...}

From skaller at maxtal.com.au Tue Apr 7 16:56:09 1998
From: skaller at maxtal.com.au (John Skaller)
Date: Wed, 08 Apr 1998 00:56:09 +1000
Subject: [C++-SIG] Re: [PSA MEMBERS] Fw: the dynamic load mystery...
Message-ID: <1.5.4.32.19980407145609.00917088@triode.net.au>

At 12:45 2/04/98 +0200, Konrad Hinsen wrote:
>> static application. In the static case, everything is fine. In the dynamic
>> case, python 1.5 complains that the shared library does not contain the
>> module's init routine, named initexample. But, as you'll see below, it does.
>
>Sounds familiar. I have exactly that problem with plain C extensions,
>e.g. NumPy, both with HP/UX 10 and HP/UX 9. And it works fine with Python 1.4.
>But it doesn't occur with all extensions, and to make things worse
>it depends on what other dynamic modules have been imported before.

I didn't realise this problem was so widespread so I will repeat
some of the private email I sent Paul. I hope it helps.

I have had an _apparently_ similar problem which I have solved.
I have a C extension '_pytcl' which sometimes failed to
import during development. [I had a LOT of problems because
pytcl itself loads Tcl and somewhat indirectly causes Tk and X libraries
to be loaded too. (Tcl and Tk 8.0 are 'incorrectly' built in the
distribution).]

What happened to me was that I declared and called a function with
external linkage but did NOT define it. When that happened under
Python 1.4, I would eventually get an 'unresolved symbol' diagnostic.

Under Python 1.5, at my suggestion, dlopen now uses a flag supported
by linux and some other Unices which causes immediate failure
if a library contains an unresolved external reference (after attempted
dynamic linkage).

This failure causes the actual dlopen call to fail, 'as if' the
module or its entry point could not be found.

Now, this problem is probably closely related to the problem Konrad
reports -- a dependency on the order of 'import'. This is because
Unix dynamic loading is by and large an utter hack, and is badly
implemented and defined. (The Windows mechanism is much saner).

What is happening is probably as follows: in Python 1.4, the
dlopen flags are different to Python 1.5. In 1.5 they are RTLD_NOW and
RTLD_GLOBAL.
What that means is that

 (1) RTLD_GLOBAL means this dl library will provide symbols
     for others (I do NOT understand the details, try as I might,
     it doesn't work at all sanely, and is, of course,
     in Unix tradition, not documented properly)

 (2) RTLD_NOW means 'give an error if the library
     cannot be fully bound' (all external references in it
     satisfied)

What that means is that you can NOT simply use incremental
linkage, loading a library depending on another, not yet loaded,
and then load that other library. Instead you MUST link the
library against the library it depends on, so that one is loaded
automatically by the loader (NOT by Python import which calls dlopen).

For example:

    cc libB.c -o libB.so -rdynamic ....

will FAIL where

    cc libB.c -lA -o libB.so ...
              ^^^

will work, where 'libA.so' is required by 'libB.so'.
That is because libA.so will be loaded by the loader when libB.so
is loaded; a subsequent 'dlopen' on libA.so (caused by Python import)
will simply bind to an already loaded library.

As an example of this problem: Tcl and Tk are INCORRECTLY built
in the standard distribution. In particular, Tk is NOT linked against
libtcl.so or against libX. This works just fine for 'wish' because
'wish' IS linked against tcl and X, but just try

    tcl
    tcl>package require Tk

and you will get an error, (hundreds of them -- one for each
X windows function) because tcl (or the tcl shell tclsh) of course
is NOT linked against X.

[As I understand it this error in the distribution should NOT
happen if RTLD_GLOBAL actually worked properly, which it doesn't seem to]

The Tk distribution must be patched to explicitly link the Tk _library_
against tcl and X (not just that one application, 'wish').

A special note for C++ developers: you will have fun with
template instantiation and dynamic linkage!
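
The failure mode described above, where the loader's real complaint is
hidden behind a "can't find init function" message, can be probed by
calling dlopen directly with the same flags. What follows is only a
minimal sketch, assuming a present-day Python on a Unix-like system with
the ctypes module (which did not exist in the 1.5 interpreter discussed
here). The name try_load is a hypothetical helper and ./libB.so is a
placeholder path, not a file from this thread.

```python
import ctypes
import os


def try_load(path, lazy=False):
    """Call dlopen() on path and return (handle, error_text).

    try_load is a hypothetical helper for illustration only.
    Passing path=None asks the loader for the main program itself.
    """
    # RTLD_NOW: resolve every symbol immediately, fail if any is missing.
    # RTLD_LAZY: defer function resolution until the first call.
    binding = os.RTLD_LAZY if lazy else os.RTLD_NOW
    try:
        # RTLD_GLOBAL: offer this library's symbols to later loads.
        handle = ctypes.CDLL(path, mode=binding | os.RTLD_GLOBAL)
    except OSError as exc:
        # Keep the loader's own text, which names the real problem.
        return None, str(exc)
    return handle, None


if __name__ == "__main__":
    # Placeholder path: substitute the extension that refuses to import.
    handle, error = try_load("./libB.so")
    print("loaded" if handle else "dlopen failed: " + error)
```

Running the same path once with lazy=False and once with lazy=True is one
way to check whether an import failure comes from immediate binding of an
unresolved symbol rather than from a missing init routine.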

In summary:

 1) Unix dynamic loading is brain dead on many systems
    including Linux, the 'compile time linker' will not
    detect an unresolved external reference

 2) in general you must ensure _intentionally_ unresolved
    externals will be satisfied at dynamic load time
    by linking all dynamic libraries against dependent
    libraries providing these symbols

 3) EXCEPT that symbols provided by the main application
    are supplied automatically provided it is linked with
    the right flags (-rdynamic or -Xlinker, -export-dynamic)

The 'dynamic export' mechanism does NOT work correctly (whatever that means)
across three level dynamic loads, at least under Linux. Thus (2) -- explicitly
link a library against dependent libraries.

I would love to hear from someone who really understands how the
dynamic linker tries to resolve external references incrementally.
(Since it doesn't work :-(
-------------------------------------------------------
John Skaller    email: skaller at maxtal.com.au
                http://www.maxtal.com.au/~skaller
                phone: 61-2-96600850
                snail: 10/1 Toxteth Rd, Glebe NSW 2037, Australia

From dubois1 at llnl.gov Tue Apr 7 19:15:32 1998
From: dubois1 at llnl.gov (Paul F. Dubois)
Date: Tue, 7 Apr 1998 10:15:32 -0700
Subject: [C++-SIG] Fw: the dynamic load mystery...
Message-ID: <002c01bd6248$c5dffc80$998a7380@pduboispc.llnl.gov>

I really appreciate your taking the time to help me. Indeed, we discovered
the secret was to use KCC to load. I replaced the definition for LDSHARED and
it worked. This is consistent with your explanation, too. Well, "worked" means
"ran some and then croaked", but at least I got past the import, and at least
it might even now be my own fault.

Thank you to everyone who tried to help.

-----Original Message-----
From: John Skaller
To: dubois1 at llnl.gov
Date: Tuesday, April 07, 1998 5:53 AM
Subject: Re: [C++-SIG] Fw: the dynamic load mystery...

>At 15:44 1/04/98 -0800, Paul F. Dubois wrote:
>>I've been trying to get my recent C++ work to work on an HPUX-10 system. It
>>compiles and makes a dynamic library just fine; or you can build it into a
>>static application. In the static case, everything is fine. In the dynamic
>>case, python 1.5 complains that the shared library does not contain the
>>module's init routine, named initexample. But, as you'll see below, it does.
>>
>>In an effort to diagnose this, we took the spammodule.c example, changed
>>spammodule.c to spammodule.cxx, adding the
>>extern "C" statement for initspam, and made it on an HP using KCC -x and
>>cc -Ae. Works fine: import spam succeeds.
>>
>>Then we changed the name initspam everywhere to initexample, and renamed the
>>file example.cxx. Works fine: import example succeeds.
>>
>>Copied this example.cxx into the CXX/Demo directory and rebuilt there.
>>Works fine.
>>
>>Added one line to example.cxx: #include "CXX_Objects.h".
>>Now we get the failing behavior. nm reveals
>>
>>laura[63] nm *.sl | grep initexample
>>00032d08 T __sti____Demo_example_cxx_initexample
>>00032cf0 T __sti____Demo_example_cxx_initexample
>>000316f4 T initexample
>>00031694 T initexample
>>
>>Demo/example.cxx is the file name in which initexample resides. I wonder if
>>this extra initexample inside a name is confusing some part of the dynamic
>>load.
>>
>>Does any of this ring a bell with anyone?
>
>I have no idea if this helps .. but I had a problem which I
>_thought_ was like your description, but it turned out to be
>something quite different: an unsatisfied external reference
>in the dynamic library.
>
>
>[I'm pissed at the STUPID way Unices/compilers/linkers/loaders handle this,
>it's an utter hack. Windows has a saner system. :-[
>
>Since you are using templates, it is possible the dynamic link library
>indeed contains an external reference to an uninstantiated template
>instance.
>
>Check the output of 'nm' carefully for unexpected undefined references.
>
>My current C extension to Python (PyTcl) exhibits this problem:
>if I declare and call, but forget to define, a function (all my
>interface functions have external linkage), the resultant library
>is built fine but cannot be loaded. Because the load uses a flag
>forcing immediate resolution of undefined references in the
>library, failure is immediate (rather than at the point of call).
>
>Thus, the python 'import' fails with a WRONG message 'can't find
>init_pytcl' -- the _actual_ error is different.
>
>With slightly different options on the build of the library
>and/or Python (forget which) I _used_ to get the actual error,
>namely 'can't resolve symbol XXXX'.
>-------------------------------------------------------
>John Skaller    email: skaller at maxtal.com.au
>                http://www.maxtal.com.au/~skaller
>                phone: 61-2-96600850
>                snail: 10/1 Toxteth Rd, Glebe NSW 2037, Australia
>
>
>

From hinsen at ibs.ibs.fr Thu Apr 16 16:28:19 1998
From: hinsen at ibs.ibs.fr (Konrad Hinsen)
Date: Thu, 16 Apr 1998 16:28:19 +0200
Subject: [C++-SIG] Re: [PSA MEMBERS] Fw: the dynamic load mystery...
In-Reply-To: <1.5.4.32.19980407145609.00917088@triode.net.au> (message from
    John Skaller on Wed, 08 Apr 1998 00:56:09 +1000)
Message-ID: <199804161428.QAA10017@lmspc1.ibs.fr>

> What happened to me was that I declared and called a function with
> external linkage but did NOT define it. When that happened under
> Python 1.4, I would eventually get an 'unresolved symbol' diagnostic.
>
> Under Python 1.5, at my suggestion, dlopen now uses a flag supported
> by linux and some other Unices which causes immediate failure
> if a library contains an unresolved external reference (after attempted
> dynamic linkage).
>
> This failure causes the actual dlopen call to fail, 'as if' the
> module or its entry point could not be found.

Ahhh, thanks, that is probably the source of the NumPy problem.
However, I don't see any solution!

Here's what happens during NumPy import: A Python module does "import
umath" (plus many other things). Module umath needs functions from
module _numpy, but is *not* linked with it. My solution was to import
_numpy explicitly in the init function of module umath, using the
Python call PyImport_ImportModule(), which is more or less equivalent
to the Python import statement. In other words, this scheme guarantees
only that all symbols needed are available *after* execution of the
init function, but not when module umath is loaded.

So why is umath not linked with _numpy? First of all, that doesn't
seem to work with some systems. Second, everything might be linked
statically with the Python interpreter, and in that case there
wouldn't be any dynamic library to link to. Third, the names of shared
libraries vary between systems. The scheme I used has the enormous
advantage of allowing client modules to use NumPy in a perfectly
portable way, i.e. a single Setup file works for all systems and
all ways of installing NumPy. Well, almost :-(

> Now, this problem is probably closely related to the problem Konrad
> reports -- a dependency on the order of 'import'. This is because
> Unix dynamic loading is by and large an utter hack, and is badly

Definitely true, but I suppose we can't do much about it!

> As an example of this problem: Tcl and Tk are INCORRECTLY built
> in the standard distribution. In particular, Tk is NOT linked against
> libtcl.so or against libX.

I remember running into that problem as well under HP/UX; I had to
change the Makefile to make it work.

>  2) in general you must ensure _intentionally_ unresolved
>     externals will be satisfied at dynamic load time
>     by linking all dynamic libraries against dependent
>     libraries providing these symbols

That's the big problem - I don't see any way to do this in a portable
way.

Konrad.
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                          | E-Mail: hinsen at ibs.ibs.fr
Laboratoire de Dynamique Moleculaire   | Tel.: +33-4.76.88.99.28
Institut de Biologie Structurale       | Fax:  +33-4.76.88.54.94
41, av. des Martyrs                    | Deutsch/Esperanto/English/
38027 Grenoble Cedex 1, France         | Nederlands/Francais
-------------------------------------------------------------------------------

From skaller at maxtal.com.au Fri Apr 17 16:10:17 1998
From: skaller at maxtal.com.au (John Skaller)
Date: Sat, 18 Apr 1998 00:10:17 +1000
Subject: [C++-SIG] Re: [PSA MEMBERS] Fw: the dynamic load mystery...
Message-ID: <1.5.4.32.19980417141017.009c10fc@triode.net.au>

At 16:28 16/04/98 +0200, Konrad Hinsen wrote:

>So why is umath not linked with _numpy? First of all, that doesn't
>seem to work with some systems.

    It needs to work 'by definition', at least
where 'link' is interpreted 'liberally' :-)

>Second, everything might be linked
>statically with the Python interpreter, and in that case there
>wouldn't be any dynamic library to link to. Third, the names of shared
>libraries vary between systems.

    Yes. But I think that 'the fact that linkage is not portable'
means that ...

>The scheme I used has the enormous
>advantage of allowing client modules to use NumPy in a perfectly
>portable way, i.e. a single Setup file works for all systems and
>all ways of installing NumPy. Well, almost :-(

    ... it isn't portable. :-)
But it still has to be done. Somehow :-(

>>  2) in general you must ensure _intentionally_ unresolved
>>     externals will be satisfied at dynamic load time
>>     by linking all dynamic libraries against dependent
>>     libraries providing these symbols
>
>That's the big problem - I don't see any way to do this in a portable
>way.

    That's not a problem. You're probably right.
So do it in a non-portable way :-) With lots of cases,
and people emailing you what worked for them. :-((
Ugly. But it isn't your fault.
-------------------------------------------------------
John Skaller    email: skaller at maxtal.com.au
                http://www.maxtal.com.au/~skaller
                phone: 61-2-96600850
                snail: 10/1 Toxteth Rd, Glebe NSW 2037, Australia

From hinsen at ibs.ibs.fr Fri Apr 17 16:23:43 1998
From: hinsen at ibs.ibs.fr (Konrad Hinsen)
Date: Fri, 17 Apr 1998 16:23:43 +0200
Subject: [C++-SIG] Re: [PSA MEMBERS] Fw: the dynamic load mystery...
In-Reply-To: <1.5.4.32.19980417141017.009c10fc@triode.net.au> (message from
    John Skaller on Sat, 18 Apr 1998 00:10:17 +1000)
Message-ID: <199804171423.QAA14623@lmspc1.ibs.fr>

>     Yes. But I think that 'the fact that linkage is not portable'
> means that ...
>
> >The scheme I used has the enormous
> >advantage of allowing client modules to use NumPy in a perfectly
> >portable way, i.e. a single Setup file works for all systems and
> >all ways of installing NumPy. Well, almost :-(
>
>     ... it isn't portable. :-)
> But it still has to be done. Somehow :-(

Yeah, somehow...

> >That's the big problem - I don't see any way to do this in a portable
> >way.
>
>     That's not a problem. You're probably right.
> So do it in a non-portable way :-) With lots of cases,
> and people emailing you what worked for them. :-((
> Ugly. But it isn't your fault.

No, but it's my problem. I am trying to convince my colleagues to use
Python (and my libraries), but among the people I know, more than 50%
have given up due to installation problems related to NumPy. That was
supposed to be fixed by the new version...

BTW, what was the problem that caused you to ask for a change in the
behaviour under HP/UX? I understand that it's nicer to get error
messages immediately for testing, but was there any other problem? If
the 1.4 behaviour is fine for production use, maybe it is more
reasonable to use the 1.5 behaviour only in debugging mode.

Konrad.
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                          | E-Mail: hinsen at ibs.ibs.fr
Laboratoire de Dynamique Moleculaire   | Tel.: +33-4.76.88.99.28
Institut de Biologie Structurale       | Fax:  +33-4.76.88.54.94
41, av. des Martyrs                    | Deutsch/Esperanto/English/
38027 Grenoble Cedex 1, France         | Nederlands/Francais
-------------------------------------------------------------------------------

From skaller at maxtal.com.au Mon Apr 20 16:33:51 1998
From: skaller at maxtal.com.au (John Skaller)
Date: Tue, 21 Apr 1998 00:33:51 +1000
Subject: [C++-SIG] Re: [PSA MEMBERS] Fw: the dynamic load mystery...
Message-ID: <1.5.4.32.19980420143351.009ae168@triode.net.au>

At 16:23 17/04/98 +0200, Konrad Hinsen wrote:

>No, but it's my problem. I am trying to convince my colleagues to use
>Python (and my libraries), but among the people I know, more than 50%
>have given up due to installation problems related to NumPy. That was
>supposed to be fixed by the new version...

    Keep trying. Hard task.

>BTW, what was the problem that caused you to ask for a change in the
>behaviour under HP/UX?

    Not HP/UX particularly: I'm running Linux.

>I understand that it's nicer to get error
>messages immediately for testing, but was there any other problem? If
>the 1.4 behaviour is fine for production use, maybe it is more
>reasonable to use the 1.5 behaviour only in debugging mode.

    From memory there are _two_ switches: either immediate load
(RTLD_NOW) or deferred load (RTLD_LAZY)
(that's _one_ switch :-), and also a flag to
'share' symbols, RTLD_GLOBAL.

    The sharing flag allows one dynamically loaded
library to make its symbols available to another (I think).
I think it was this flag that needed to be added. Not sure. :-(

    So you could be right, RTLD_LAZY could still be used.
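
The experiment suggested above, keeping RTLD_GLOBAL but switching the
binding mode from RTLD_NOW to RTLD_LAZY, can be sketched as follows.
This is only an illustration under stated assumptions: it relies on
sys.getdlopenflags() and sys.setdlopenflags(), which were added to Python
after the 1.5 release discussed in this thread and exist only on
Unix-like platforms. The name dlopen_flags is a hypothetical helper, and
some_extension in the comment is a placeholder module name.

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def dlopen_flags(flags):
    """Temporarily change the flags the interpreter passes to dlopen().

    dlopen_flags is a hypothetical helper for illustration only.
    The previous flags are restored even if the import inside fails.
    """
    saved = sys.getdlopenflags()
    sys.setdlopenflags(flags)
    try:
        yield saved
    finally:
        sys.setdlopenflags(saved)


if __name__ == "__main__":
    # Deferred binding, but symbols still shared with later loads.
    with dlopen_flags(os.RTLD_LAZY | os.RTLD_GLOBAL):
        # An extension imported here would be opened with the flags above,
        # for example: import some_extension  (placeholder name)
        print("flags inside block:", sys.getdlopenflags())
    print("flags restored to:", sys.getdlopenflags())
```

Restoring the saved value in a finally clause keeps one experimental
import from changing how every later extension module is loaded.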
-------------------------------------------------------
John Skaller    email: skaller at maxtal.com.au
                http://www.maxtal.com.au/~skaller
                phone: 61-2-96600850
                snail: 10/1 Toxteth Rd, Glebe NSW 2037, Australia

From hinsen at ibs.ibs.fr Tue Apr 21 19:03:52 1998
From: hinsen at ibs.ibs.fr (Konrad Hinsen)
Date: Tue, 21 Apr 1998 19:03:52 +0200
Subject: [C++-SIG] Re: [PSA MEMBERS] Fw: the dynamic load mystery...
In-Reply-To: <1.5.4.32.19980420143351.009ae168@triode.net.au> (message from
    John Skaller on Tue, 21 Apr 1998 00:33:51 +1000)
Message-ID: <199804211703.TAA00700@lmspc1.ibs.fr>

>     From memory there are _two_ switches: either immediate load
> (RTLD_NOW) or deferred load (RTLD_LAZY)
> (that's _one_ switch :-), and also a flag to
> 'share' symbols, RTLD_GLOBAL.
>
>     The sharing flag allows one dynamically loaded
> library to make its symbols available to another (I think).
> I think it was this flag that needed to be added. Not sure. :-(
>
>     So you could be right, RTLD_LAZY could still be used.

I'll probably try some combinations...

Anyway, I changed my mind about the NumPy problem. It has to be
something other than I thought. In fact, NumPy modules do not use any
symbols from other modules. They just import another module and
retrieve a pointer from it (via a CObject), and then they do function
calls through this pointer. As far as dynamic library loading is
concerned, the modules are independent.

Conclusion: I still don't know why NumPy has problems under HP/UX.
But the change in dlopen() parameters is still worth exploring.
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                          | E-Mail: hinsen at ibs.ibs.fr
Laboratoire de Dynamique Moleculaire   | Tel.: +33-4.76.88.99.28
Institut de Biologie Structurale       | Fax:  +33-4.76.88.54.94
41, av. des Martyrs                    | Deutsch/Esperanto/English/
38027 Grenoble Cedex 1, France         | Nederlands/Francais
-------------------------------------------------------------------------------

From skaller at maxtal.com.au Wed Apr 22 05:21:14 1998
From: skaller at maxtal.com.au (John Skaller)
Date: Wed, 22 Apr 1998 13:21:14 +1000
Subject: [C++-SIG] Re: [PSA MEMBERS] Fw: the dynamic load mystery...
Message-ID: <1.5.4.32.19980422032114.00907144@triode.net.au>

See Also, From News:

-------
I have problems to get Python 1.5 to load a shared library under HP-UX.
I have installed Python exactly as described in the HPUX-NOTES.
The shared lib was made with SWIG.
The problem is: Python tells me the library would not export the symbol
initlibc, but nm tells me that is not true.
-------

-------------------------------------------------------
John Skaller    email: skaller at maxtal.com.au
                http://www.maxtal.com.au/~skaller
                phone: 61-2-96600850
                snail: 10/1 Toxteth Rd, Glebe NSW 2037, Australia