[C++-SIG] Re: [PSA MEMBERS] Fw: the dynamic load mystery...

John Skaller skaller at maxtal.com.au
Tue Apr 7 16:56:09 CEST 1998


At 12:45 2/04/98 +0200, Konrad Hinsen wrote:
>> static application. In the static case, everything is fine. In the dynamic
>> case, python 1.5 complains that the shared library does not contain the
>> module's init routine, named initexample. But, as you'll see below, it does.
>
>Sounds familiar. I have exactly that problem with plain C extensions,
>e.g. NumPy, both with HP/UX 10 and HP/UX 9. And it works fine with Python 1.4.
>But it doesn't occur with all extensions, and to make things worse
>it depends on what other dynamic modules have been imported before.

I didn't realise this problem was so widespread so I will repeat 
some of the private email I sent Paul. I hope it helps.

I have had an _apparently_ similar problem which I have solved.

I have a C extension '_pytcl' which sometimes failed to import
during development. [I had a LOT of problems because pytcl itself
loads Tcl and somewhat indirectly causes Tk and X libraries to be
loaded too. (Tcl and Tk 8.0 are 'incorrectly' built in the distribution).]

What happened to me was that I declared and called a function with 
external linkage but did NOT define it. When that happened under
Python 1.4, I would eventually get an 'unresolved symbol' diagnostic.

Under Python 1.5, at my suggestion, the dlopen call now passes a flag
supported by Linux and some other Unices which causes immediate failure
if a library contains an unresolved external reference (after attempted
dynamic linkage).

This failure causes the actual dlopen call to fail, 'as if' the
module or its entry point could not be found.

Now, this problem is probably closely related to the problem Konrad
reports -- a dependency on the order of 'import'. This is because
Unix dynamic loading is by and large an utter hack, and is badly
implemented and defined. (The Windows mechanism is much saner).

What is happening is probably as follows: the dlopen flags in Python 1.4
differ from those in Python 1.5. In 1.5 they are RTLD_NOW and RTLD_GLOBAL.

What that means is that

        (1) RTLD_GLOBAL means 'this dl library will provide symbols for
                others' (I do NOT understand the details, try as I might;
                it doesn't work at all sanely, and is, of course, in Unix
                tradition, not documented properly)

        (2) RTLD_NOW means 'give an error if the library cannot be fully
                bound' (all external references in it satisfied)
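The effect of these two flags can be tried out from a present-day Python
without writing an extension module. The following is only a sketch: it
assumes a Unix system and the ctypes module (which did not exist in
Python 1.5), and the helper name try_load is made up for illustration.
It passes the same flag pair to dlopen and reports a failure instead of
raising, which makes it easier to see the real loader message rather than
a generic 'init routine not found' complaint.

```python
import ctypes
import os


def try_load(path):
    """Attempt dlopen(path, RTLD_NOW | RTLD_GLOBAL) via ctypes.

    Returns (library, None) on success or (None, message) on failure.
    try_load is a hypothetical helper, not part of Python itself.
    """
    flags = os.RTLD_NOW | os.RTLD_GLOBAL
    try:
        return ctypes.CDLL(path, mode=flags), None
    except OSError as exc:
        # dlopen failed: the file is missing, or (because of RTLD_NOW)
        # some external reference in it could not be bound.
        return None, str(exc)


if __name__ == "__main__":
    # "./libB.so" is a placeholder path; substitute a real library.
    lib, err = try_load("./libB.so")
    print("loaded" if lib else "dlopen failed: " + err)
```

When the load fails because of an unbound reference, the message usually
names the missing symbol, which is the information the import error hides.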

In practice, this means you can NOT simply use incremental linkage:
loading a library that depends on another, not yet loaded, library
and then loading that other library.

Instead you MUST link the library against the library it depends on, so
that it is loaded automatically by the loader (NOT by Python import, which
calls dlopen).

For example:

        cc libB.c -o libB.so -rdynamic ....

will FAIL where

        cc libB.c -lA -o libB.so ...
                  ^^^

will work, where 'libA.so' is required by 'libB.so'. That is because
libA.so will be loaded by the loader when libB.so is loaded;
a subsequent 'dlopen' on libA.so (caused by Python import)
will simply bind to an already loaded library.
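The claim that a second dlopen on an already-loaded object just binds to
the existing copy can be observed with a small sketch. It assumes a Unix
system and the ctypes module; open_twice is a hypothetical helper name,
and opening the main program (name=None) is chosen here only so that no
particular library needs to be installed. Whether the two handles compare
equal depends on the platform's loader, so the sketch prints the result
rather than assuming it.

```python
import ctypes
import os


def open_twice(name=None):
    """dlopen the same object twice and return both raw handles.

    name=None asks dlopen for the main program itself. open_twice is a
    hypothetical helper used only for this illustration.
    """
    flags = os.RTLD_NOW | os.RTLD_GLOBAL
    first = ctypes.CDLL(name, mode=flags)
    second = ctypes.CDLL(name, mode=flags)
    # _handle holds the value returned by the underlying dlopen call.
    return first._handle, second._handle


if __name__ == "__main__":
    h1, h2 = open_twice()
    print("same handle" if h1 == h2 else "different handles")
```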

As an example of this problem: Tcl and Tk are INCORRECTLY built
in the standard distribution. In particular, Tk is NOT linked against
libtcl.so or against libX.

This works just fine for 'wish' because 'wish' IS linked against
tcl and X, but just try

        tcl
        tcl>package require Tk

and you will get an error (hundreds of them -- one for each
X windows function), because tcl (or the tcl shell tclsh)
of course is NOT linked against X. [As I understand it,
this error in the distribution should NOT happen if
RTLD_GLOBAL actually worked properly, which it doesn't seem to.]

The Tk distribution must be patched to explicitly link the
Tk _library_ against tcl and X (not just that one
application, 'wish').

A special note for C++ developers: you will have fun with template instantiation
and dynamic linkage!


In summary:

        1) Unix dynamic loading is brain dead on many systems
           including Linux: the 'compile time linker' will not
           detect an unresolved external reference

        2) in general you must ensure _intentionally_ unresolved
            externals will be satisfied at dynamic load time
            by linking all dynamic libraries against the
            libraries providing these symbols

        3) EXCEPT that symbols provided by the main application
           are supplied automatically, provided it is linked
           with the right flags (-rdynamic or -Xlinker -export-dynamic)
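Since the link step will not flag an unresolved reference, it can help to
list a library's undefined symbols before trying to import it. The sketch
below assumes GNU nm is installed (its -D and --undefined-only options
list undefined dynamic symbols); the function names and the sample output
in the comment are made up for illustration.

```python
import subprocess


def parse_undefined(nm_output):
    """Extract symbol names from `nm -D --undefined-only` style output.

    Undefined entries have no address column, so each line is just a
    type letter followed by a name, e.g. "   U XOpenDisplay"
    (a made-up sample line).
    """
    symbols = []
    for line in nm_output.splitlines():
        fields = line.split()
        # "U" is undefined; "w" and "v" are unresolved weak symbols.
        if len(fields) == 2 and fields[0] in ("U", "w", "v"):
            symbols.append(fields[1])
    return symbols


def undefined_symbols(library):
    """Run GNU nm on a shared library and return its undefined symbols."""
    result = subprocess.run(
        ["nm", "-D", "--undefined-only", library],
        capture_output=True, text=True, check=True,
    )
    return parse_undefined(result.stdout)
```

An undefined symbol is not an error in itself; it only becomes one if no
library that this one is linked against, and not the main application,
provides it at load time.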


The 'dynamic export' mechanism does NOT work correctly (whatever that
means) across three level dynamic loads, at least under Linux.
Hence (2): explicitly link a library against the libraries it depends on.

I would love to hear from someone who really understands how the
dynamic linker tries to resolve external references incrementally.
(Since it doesn't work :-(
-------------------------------------------------------
John Skaller    email: skaller at maxtal.com.au
		http://www.maxtal.com.au/~skaller
		phone: 61-2-96600850
		snail: 10/1 Toxteth Rd, Glebe NSW 2037, Australia





