SoC project: Python-Haskell bridge - request for feedback

Fri Mar 28 02:49:46 EDT 2008

"Michał Janeczek" <janeczek at gmail.com> writes:
> I wasn't aware of the runtime issues, these can be things to watch out
> for. However, the type of embedding that I imagined would be mostly
> pure functions, since Python can deal with IO rather well. It'd also
> be applicable in situations where we want to add some functionality to
> an existing, large Python project, where the complete rewrite
> would be infeasible.

Of course I can't say what functions someone else would want to use,
but I'm not seeing very convincing applications of this myself.  There
aren't that many prewritten Haskell libraries (especially monomorphic
ones) that I could see using that way.  And if I'm skilled enough with
Haskell to write the functions myself, I'd probably rather embed
Python in a Haskell app than the other way around.  Haskell's i/o has
gotten a lot better recently (Data.ByteString) though there is
important stuff still in progress (bytestring unicode).  For other
pure functions (crypto, math, etc.) there are generally libraries
written in C already interfaced to Python (numarray, etc.), maybe
through Swig.  The case for Haskell isn't that compelling.

> I didn't mention this in this first draft, but I don't know (yet)
> how to support those "fancy" types. The plan for now is to export
> monomorphic functions only. 

This probably loses most of the interesting stuff: parser combinators,
functional data structures like zippers, etc.  Unless you mean to
use templates to make specialized versions?

> As for GC, I think having the two systems involved is unavoidable if
> I want to have first class functions on both sides.

This just seems worse and worse the more I think about it.  Remember
that GHC uses a copying gc so there is no finalization and therefore
no way to notify python that a reference has been freed.  And you'd
probably have to put Haskell pointers into Python's heap objects so
that the Haskell gc wouldn't have to scan the whole Python heap.
Also, any low level GHC gc stuff (not sure if there would be any)
might have to be redone for GHC 6.10(?) which is getting a new
(parallel) gc.  Maybe I'm not thinking of this the right way, though; I
haven't looked at the low level ghc code.

Keep in mind also that Python style tends not to use complex data
structures, so fancy sharing of Haskell structures may not fit the
Python style.  Python uses extensible lists and mutable dictionaries
for just about everything, relying on the speed of the underlying C
functions to do list operations very fast (C-coded O(n) operation
faster than interpreted O(log n) operation for realistic n).  So maybe
this type of sharing won't be so useful.  
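
A rough way to check the C-coded O(n) versus interpreted O(log n)
point on your own machine is sketched below.  The function names and
the list sizes are made up for illustration, and the timings depend
entirely on the interpreter and hardware, so treat it as an experiment
rather than a result.

```python
import timeit

def linear_contains(xs, x):
    # `in` on a list is an O(n) scan, but the loop runs in C.
    return x in xs

def binary_contains(xs, x):
    # O(log n) binary search over a sorted list, written in pure
    # Python so that every step goes through the interpreter.
    lo, hi = 0, len(xs)
    while lo < hi:
        mid = (lo + hi) // 2
        if xs[mid] < x:
            lo = mid + 1
        else:
            hi = mid
    return lo < len(xs) and xs[lo] == x

def compare(n, repeats=10000):
    # Time both lookups on a sorted list of n ints (hypothetical sizes).
    xs = list(range(n))
    target = n // 2
    t_linear = timeit.timeit(lambda: linear_contains(xs, target), number=repeats)
    t_binary = timeit.timeit(lambda: binary_contains(xs, target), number=repeats)
    return t_linear, t_binary

if __name__ == "__main__":
    for n in (8, 64, 512):
        print(n, compare(n))
```

Where the crossover falls, if there is one at the sizes you care
about, is something to measure rather than assume.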

It may be simplest to just marshal data structures across a message
passing interface rather than really try to share values between the
two systems.  For fancy functional structures, from a Python
programmer's point of view, it is probably most useful to just pick a
few important ones and code them in C from scratch for direct use in
Python.  Hedgehog Lisp (google for it) has a nice C implementation of
functional maps that could probably port easily to the Python C API,
and I've been sort of wanting to do that.  It would be great if
you beat me to it.
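
To make the marshalling idea concrete, here is a minimal sketch of the
Python side of such a message passing interface.  It assumes
length-prefixed JSON frames over a byte stream; the framing, the
message fields ("call", "args") and the function name "sumSquares" are
all invented for illustration, and the Haskell side would have to
agree on the same format.

```python
import io
import json
import struct

def send_message(stream, payload):
    """Write one length-prefixed JSON message to a binary stream."""
    body = json.dumps(payload).encode("utf-8")
    stream.write(struct.pack(">I", len(body)))  # 4-byte big-endian length
    stream.write(body)
    stream.flush()

def recv_message(stream):
    """Read one length-prefixed JSON message; return None at end of stream."""
    header = stream.read(4)
    if len(header) < 4:
        return None
    (length,) = struct.unpack(">I", header)
    body = stream.read(length)
    if len(body) < length:
        raise EOFError("truncated message")
    return json.loads(body.decode("utf-8"))

def roundtrip_demo():
    # An in-memory buffer stands in for the pipe or socket that would
    # connect the two runtimes in a real setup.
    pipe = io.BytesIO()
    send_message(pipe, {"call": "sumSquares", "args": [[1, 2, 3]]})
    pipe.seek(0)
    return recv_message(pipe)
```

Only plain data crosses the boundary this way, so neither garbage
collector ever has to know about the other's heap; the price is a
copy on every call.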

> >  Anyway I'm babbling now, I may think about this more later.
> By all means, please do go on :) This has helped a lot :)

One thing I highly recommend is that you join the #haskell channel on
irc.freenode.net.  There are a lot of real experts there (I'm just a
newbie) who can advise you better than I can, and you can talk to them
in real time.