[C++-sig] [Boost.Python v3] Conversions and Registries

Jim Bosch talljimbo at gmail.com
Tue Sep 20 18:38:14 CEST 2011


On 09/20/2011 11:06 AM, Niall Douglas wrote:
> On 19 Sep 2011 at 17:03, Jim Bosch wrote:
>
>> I'd like to see support for static, template-based conversions.  These
>> would be defined by [partial-]specializing a traits class, and I tend to
>> think they should only be invoked after attempting all registry-based
>> conversions.
>
> Surely not! You'd want to let template specialisation be the first
> point of call so the compiler can compile in obvious conversions,
> *then* and only then do you go to a runtime registry.
>
> This also lets one override the runtime registry when needed in the
> local compiland. I'm not against having another set of template
> specialisations do something should the first set of specialisations
> fail, and/or the runtime registry lookup fails.
>

I'd also considered having a different set of template conversions that 
are checked first for performance reasons, but I'd actually viewed the 
override preference argument from the opposite direction - once a 
template converter traits class has been fully specialized, you can't 
specialize it again differently in another module (well, maybe symbol 
visibility labels can get you out of that bind in practice).  So it 
seemed a registry-based override would be the only way to override a 
template-based conversion, and hence the registry-based conversions 
would have to go first.

But overall I think your proposal to just try the templates first is 
cleaner, because having multiple specializations of the same traits 
class in different modules would be a problem either way; allowing users 
to override the compile-time conversions with registry-based conversions 
is at best a poor workaround.
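
To make the ordering concrete, here is a minimal sketch of "templates first, 
then the registry".  Everything in it is hypothetical and made up for this 
message (to_python_traits, register_converter, to_python, and the use of a 
plain std::string as a stand-in for a Python object); none of it is existing 
Boost.Python API.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

// Hypothetical stand-in for a Python object: just a printable string.
using PyRepr = std::string;

// Primary template: by default there is no compile-time converter for T.
template <typename T>
struct to_python_traits {
    static const bool has_static_converter = false;
};

// A user opts in to a compile-time conversion by specializing the traits.
template <>
struct to_python_traits<int> {
    static const bool has_static_converter = true;
    static PyRepr convert(int const& v) { return "int(" + std::to_string(v) + ")"; }
};

// Runtime registry: type-erased converters keyed by std::type_index.
using erased_converter = std::function<PyRepr(void const*)>;

inline std::unordered_map<std::type_index, erased_converter>& registry() {
    static std::unordered_map<std::type_index, erased_converter> r;
    return r;
}

template <typename T>
void register_converter(std::function<PyRepr(T const&)> f) {
    assert(f && "converter must be callable");
    registry()[std::type_index(typeid(T))] =
        [f](void const* p) { return f(*static_cast<T const*>(p)); };
}

// Chosen at compile time when the traits class has been specialized.
template <typename T>
PyRepr to_python_impl(T const& v, std::true_type) {
    return to_python_traits<T>::convert(v);
}

// Otherwise fall back to a runtime registry lookup.
template <typename T>
PyRepr to_python_impl(T const& v, std::false_type) {
    auto it = registry().find(std::type_index(typeid(T)));
    if (it == registry().end())
        throw std::runtime_error("no converter registered for this type");
    return it->second(&v);
}

template <typename T>
PyRepr to_python(T const& v) {
    return to_python_impl(
        v, std::integral_constant<bool, to_python_traits<T>::has_static_converter>());
}
```

In this sketch a registry entry for int is simply never consulted, because 
the specialization wins at compile time; that is exactly the "you can't 
override it from the registry" trade-off discussed above.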

>> Users would have to include the same headers in groups of
>> interdependent modules to avoid specializing the same traits class
>> multiple times in different ways; I can't think of a way to protect them
>> from this, but template-based specializations are a sufficiently
>> advanced feature that I'm comfortable leaving it up to users to avoid
>> this problem.
>
> Just make sure what you do works with precompiled headers :)
>
> P.S.: This is trickier than it sounds.
>

Yuck.  Precompiled headers are something I've never dealt with before, 
but I suppose I had better learn.

>> We've had some discussion of allowing different modules to have
>> different registries (in addition to a global registry shared across
>> modules).  Leaving aside implementation questions, I have a little
>> survey for interested parties:
>>
>> 1) Under what circumstances would you want a conversion that is
>> completely limited to a specific module (or a group of modules that
>> explicitly declare it)?
>
> Defaults to most recent in calling thread stack, but overridable
> using a TLS override to allow impersonation.
>
> The same mechanism usefully also takes care of multiple python
> interpreters too.
>

I have to admit I'm only barely following you here - threads are another 
thing I don't deal with often.  It sounds like you have a totally 
different option from the ones I was anticipating.  Could you explain in 
more detail how this would work?

>> 2) Under what circumstances would you want a conversion to look in a
>> module-specific registry, and then fall back to the global registry?
>
> As above. That implies that there is no global registry, just the
> default registry which all module registries inherit.

(still a little confused about what you mean)

>> 3) Considering that we will have a "best-match" overloading system, what
>> should take precedence, an inexact match in a module-specific registry,
>> or an exact match in a global registry?  (Clearly this is a moot point
>> for to-Python conversion).
>
> The way I've always done this is to have the template metaprogramming
> set a series of type comparison functions which return scores. This
> pushes most of the scoring and weighting into the compiler and the
> compiler will elide any calls into the dynamic registry where the
> scoring makes that sensible. Makes compile times rather longer though
> :)
>
> The dynamic and compile-time registries can be merged easily enough,
> so all the runtime registry is is a set of comparison functions
> normally elided by the compiler in other modules. In other words,
> mark the inline functions as visible outside the current DLL
> (dllexport/visibility(default)) so the compiler will assemble
> complete versions for external usage.
>

An interesting idea - avoiding trying all possible conversions at runtime 
seems a very worthy goal, though I could also see this inflating the 
size of the modules.  Can you point me to an existing example?
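
In case it helps pin down what I'm asking about, here is my rough reading of 
the scoring idea as a sketch.  The score scale (2 = exact, 1 = implicit 
conversion, 0 = none), the names static_score and best_score, and the dummy 
runtime_score function are all my own assumptions, not anything from your 
code or from Boost.Python.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical score scale: 2 = exact match, 1 = implicit conversion, 0 = none.
template <typename From, typename To>
struct static_score
    : std::integral_constant<int,
          std::is_same<From, To>::value ? 2
          : std::is_convertible<From, To>::value ? 1 : 0> {};

// Counts runtime lookups so that skipping the dynamic path is observable.
inline int& runtime_lookups() {
    static int n = 0;
    return n;
}

// Stand-in for a dynamic registry query; a real one would search the
// registered converters.  Here it always reports an inexact match.
inline int runtime_score() {
    ++runtime_lookups();
    return 1;
}

// Exact static match: the runtime registry is never consulted.
template <typename From, typename To>
int best_score_impl(std::true_type) {
    return 2;
}

// Otherwise: take the better of the static and the runtime score.
template <typename From, typename To>
int best_score_impl(std::false_type) {
    int s = static_score<From, To>::value;
    assert(s < 2);  // exact matches are handled by the overload above
    int r = runtime_score();
    return r > s ? r : s;
}

template <typename From, typename To>
int best_score() {
    return best_score_impl<From, To>(
        std::integral_constant<bool, static_score<From, To>::value == 2>());
}
```

Here best_score<int, int>() is resolved without touching the runtime path 
at all, while best_score<int, double>() still pays for one lookup.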

>> Finally, can anyone give me a reason why having a global registry can
>> lead to a violation of the "One Definition Rule"?  This was alluded to
>> many times in the earlier discussion, and there's no doubt that a global
>> registry may lead to unexpected (from a given module's perspective)
>> behavior - but I do not understand the implication that the global
>> registry can result in formally undefined behavior by violating the ODR.
>
> ODR only matters in practice for anything visible outside the current
> compiland. If compiling with GCC -fvisibility=hidden, or on any MSVC
> by default, you can define class foo to be anything you like so long
> as nothing outside the current compiland can see class foo.
>
> ODR is real important though across DLLs. If a DLL X says that class
> foo is one thing and DLL Y says it's something different, expect
> things to go very badly wrong. Hence I simply wouldn't have a global
> registry. It's bad design. You *have* to have per module registries
> and *only* per module registries.
>
> Imagine the following. Program A loads DLL B and DLL C. DLL B is
> dependent on DLL D which uses BPL. DLL C is dependent on DLL E which
> uses BPL.
>
> DLL D tells BPL that class foo is implicitly convertible with an
> integer.
>
> DLL E tells BPL that class foo is actually a thin wrapper for
> std::string.
>
> Right now with present BPL, we have to load two copies of BPL, one
> for DLL D and one for DLL E. They maintain separate type registries,
> so all is good.
>
> But what if DLL B returns a python function to Program A, which then
> installs it as a callback with DLL C?
>
> In the normal case, BPL code in DLL E will call into BPL code DLL D
> and all is well.
>
> But what if the function in DLL D throws an exception?
>
> This gets converted into a C++ exception by throwing
> boost::python::error_already_set.
>
> Now the C++ runtime must figure where to send the exception. But what
> is the C++ runtime supposed to do with such an exception type? It
> isn't allowed to see the copy of BPL living in DLL E, so it will fire
> the exception type into DLL D where it doesn't belong. At this point,
> the program will almost certainly segfault.
>
> Whatever you do with BPL in the future, it MUST support being a
> dependency of multiple DLLs simultaneously. It MUST know who is
> calling what and when, and know how to unwind everything at any
> particular stage. This implies that it must be 100% compatible with
> dlopen(RTLD_GLOBAL).
>
> As I mentioned earlier, this is a very semantically similar problem
> to supporting multiple python interpreters anyway with each calling
> into one another. You can kill two birds with the one stone as a
> result.

If I understand your argument, it's not the global registry that causes 
ODR violations - it's the fact that you're trying to mimic having local 
registries by forcing distinct BPLs for each module, and that makes BPL 
symbols ambiguous.  If you had a pair of modules that were happy using 
each other's converters, they would do the standard thing and share one 
BPL and one registry and you wouldn't have any ODR problems.

In other words, it's not the fact that DLL D and DLL E register 
different conversions for class foo that causes the ODR problems; that 
just makes modules interact unfortunately (but in a deterministic and 
debuggable way).  It's the workaround (loading multiple BPLs) that 
causes the actual ODR problems.

So it sounds like we agree that we should only ever have one BPL loaded.  We 
just need to implement the registry so it can know which module DLL 
instance a particular registry lookup is coming from, whether that's 
using special module-instance IDs or compiling the registries into the 
module DLLs or something else.
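
As a minimal sketch of the module-instance ID variant, assuming a single 
shared registry keyed by (module, type) with a fallback to a default 
registry: module_id, default_module, converter_registry, and the use of a 
string as a stand-in for a converter are all hypothetical names I made up 
for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <utility>

// Hypothetical: each module gets an opaque ID when it is initialized.
using module_id = int;
const module_id default_module = 0;

// Stand-in for a real converter: just a description of what it does.
using converter_name = std::string;

// One registry living in the single shared BPL, aware of the calling module.
class converter_registry {
public:
    void insert(module_id m, std::type_index t, converter_name c) {
        table_[std::make_pair(m, t)] = std::move(c);
    }

    // Look in the calling module's entries first, then in the default ones.
    // Returns nullptr if neither has a converter for the type.
    converter_name const* lookup(module_id caller, std::type_index t) const {
        auto it = table_.find(std::make_pair(caller, t));
        if (it == table_.end())
            it = table_.find(std::make_pair(default_module, t));
        return it == table_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::pair<module_id, std::type_index>, converter_name> table_;
};

// The class from the DLL D / DLL E example.
struct foo {};
```

With module 1 playing the role of DLL D and module 2 the role of DLL E, 
each can register its own conversion for foo without seeing the other's, 
and both still find anything registered under default_module.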

Is that right?

Thanks!

Jim

