help wanted with wicked new.*, parser.*, etc. black magic

Sun Aug 20 17:25:10 EDT 2000

First of all, my apologies for diving in "over my head" and then
bleating for help. In another thread, I've already seen Tim Peters
warn that the new module is deep black magic, and not for the amusement
of mere mortals. I always wanted to be a god, anyway, so here goes :)

I'm building a set of libraries to support prototype-based OOP (a.k.a.
"classless" OOP) in Python. I won't go too far into the details of what
the heck prototype-based OOP is, aside from pointing folks to the Self
project at Sun Microsystems: http://self.sunlabs.com/ . Suffice it to
say that it's cool stuff, but the Self language has some limitations
(portability foremost among them at present), so I've sought to
reproduce its better angels in languages dearer to me (Perl, Python,
Java, etc.). A few years ago, I brought up a working implementation in
Perl, which has served me well. Now I'm on to Python, and I'm slamming
my head against new and prettier walls in so doing :)

But enough about that. My specific current woe is this: I desire to
shovel Python code into a database for later retrieval and execution.
Now, on the surface of it, that sounds fine. Even if you ignore
nastiness like globals, presumptions about execution context, etc.,
etc., it's straightforward enough to scribble a string out to the DB and
compile / eval it back in.

But that's not what I want to do :)

At present, I have a system which works like this:

	from selfish import proto

	def mymeth (self, msg):
		print "%s says %s" % (self, msg)

	o = proto( )
	o.say = mymeth

	o.say( 'hello' )

and results in this output:

	<selfish.proto instance at 80dd600> says hello

All well and good. Under the covers, some black magic involving
new.instancemethod goes on, which transforms mymeth into an actual
method, and binds it to o. If you understand prototype-based OOP, you
know what I'm getting at. If you don't, it doesn't matter anyway :) The
bottom line is that I like attaching arbitrary methods to arbitrary
objects :)
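For readers who have not seen the trick, here is a minimal sketch of the binding step. It is written for a current Python 3, where the new module no longer exists and types.MethodType plays the role of new.instancemethod. The Proto class is a hypothetical stand-in, not the actual selfish.proto implementation, and mymeth returns its string instead of printing it so the result is easy to check.

```python
import types


class Proto:
    """Hypothetical stand-in for selfish.proto (an assumption, not the real class)."""

    def __setattr__(self, name, value):
        # Plain functions are bound to this instance; other values are stored as-is.
        if isinstance(value, types.FunctionType):
            value = types.MethodType(value, self)
        object.__setattr__(self, name, value)


def mymeth(self, msg):
    return "%s says %s" % (type(self).__name__, msg)


o = Proto()
o.say = mymeth          # bound to o on assignment
print(o.say("hello"))   # Proto says hello
```

The bound method lives in the instance, not the class, so other Proto objects are unaffected.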

Now, what I ultimately /desire/ to do, and am having no success at, is
to take a string describing a function, shove it in the database, and
later pull it out and attach it to an object, like above. So it becomes
this, instead:

	## script 'insert'
	... database black magic

	id = db.store( args, code )

	## script 'retrieve'
	from selfish import proto

	o = proto( )
	... database black magic

	(args, code) = db.retrieve( id )
	func = make_func_magic( args, code )

	o.say = func

	o.say( 'hello' )

The various database and make_func black magic could include parsing,
rewriting ASTs, etc. I'm not partial to any particular method, but I
would like something fairly robust, and not wholly dependent on a
particular version of the Python runtime :)
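One possible shape for make_func_magic, offered as a sketch and not a definitive answer: wrap the stored body in a generated def and let the compiler do the argument handling. This assumes a current Python 3 and assumes that args is stored as a plain argument-list string. The name parameter, the "<stored>" filename and the fresh namespace are illustrative choices, not part of the system described here.

```python
import textwrap


def make_func_magic(args, code, name="_anon"):
    """Build a function from an argument-list string and a body string."""
    # Normalise the body's indentation, then nest it under a generated def.
    body = textwrap.indent(textwrap.dedent(code).strip("\n"), "    ")
    src = "def %s(%s):\n%s\n" % (name, args, body)
    namespace = {}
    exec(compile(src, "<stored>", "exec"), namespace)
    return namespace[name]


func = make_func_magic("self, msg", "return '%s says %s' % (self, msg)")
print(func("o", "hello"))   # o says hello
```

Because the argument list is passed through as text, *args and **keywords need no special treatment in this approach. The generated name never leaves the scratch namespace, so it cannot collide with anything else.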

Right now, what I've tried is something like this:

	src = """
	def anon (self, msg):
		lmsg = '%s says %s' % (self, msg)
		print lmsg
	"""
	code = compile( src, '<string>', 'exec' )
	eval( code )
	anon( None, 'hello' )

That, of course, works, but it requires that I know a priori the name of
the function. Yes, I can store that in the database, but then I have to
deal with namespace issues, etc., etc. Since these functions are
destined to be bound to an object instance, they don't /need/ a
well-known name. It suffices for them to be anonymous.

I could, of course, regex-replace "def ([_A-Za-z][_A-Za-z0-9]*)" with
"def _anon", and declare _anon off-limits by fiat.

That's ugly and hackish, but it's what I'll do if no one has a more
elegant way :)
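The rename-by-fiat idea might look like the sketch below, again assuming a current Python 3. The helper name load_anon is made up for illustration. The function is executed in a scratch dictionary, so the reserved name _anon never touches the caller's namespace.

```python
import re

# Same identifier pattern as in the text, allowing any whitespace after "def".
DEF_NAME = re.compile(r"def\s+([_A-Za-z][_A-Za-z0-9]*)")


def load_anon(src):
    """Rename the first def in src to _anon, run it, and return the function."""
    renamed = DEF_NAME.sub("def _anon", src, count=1)
    namespace = {}
    exec(compile(renamed, "<string>", "exec"), namespace)
    return namespace["_anon"]


src = """
def whatever(self, msg):
    return '%s says %s' % (self, msg)
"""
func = load_anon(src)
print(func("o", "hello"))   # o says hello
```

The regex is as hackish as advertised: it rewrites the first "def name" it finds, even if that text sits inside a comment or a string literal ahead of the real definition.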

What I /want/ to do is this:

	import new

	src = """
	lmsg = '%s says %s' % (self, msg)
	print lmsg
	"""

	raw_code = compile( src, '<string>', 'exec' )
	code = new.code( 
		2,
		raw_code.co_nlocals,
		raw_code.co_stacksize,
		raw_code.co_flags,
		raw_code.co_code,
		raw_code.co_consts,
		raw_code.co_names,
		('self', 'msg') + raw_code.co_varnames,
		'<string>',
		'',
		1,
		raw_code.co_lnotab )
	func = new.function( code, globals( ), '', () )
	func( None, 'hello' )

But this bombs out, unable to work out where the heck 'self' and 'lmsg'
come from. I presume that this is because, in the absence of a def
statement in the source, the compiler never generates code to push /
pop / whatever those arguments onto the stack. And I suspect this only
gets hairier when I want to get into *args and **keywords, as well :)
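One way to test that presumption is to disassemble the same statement compiled both ways. The sketch below uses the dis module on a current CPython 3. Exact opcode names vary between interpreter versions, so it only looks at name prefixes, and what it shows for a modern interpreter is assumed, not guaranteed, to match the 2000-era one in outline.

```python
import dis

body = "lmsg = '%s says %s' % (self, msg)\n"
wrapped = "def anon(self, msg):\n    lmsg = '%s says %s' % (self, msg)\n"


def opnames(code):
    """Return the opcode names for a code object or function."""
    return [ins.opname for ins in dis.get_instructions(code)]


# Bare statements compiled in 'exec' mode: module-level code.
module_ops = opnames(compile(body, "<string>", "exec"))

# The same statement inside a def: function-level code.
namespace = {}
exec(compile(wrapped, "<string>", "exec"), namespace)
function_ops = opnames(namespace["anon"])

print([op for op in module_ops if op.startswith(("LOAD_", "STORE_"))])
print([op for op in function_ops if op.startswith(("LOAD_", "STORE_"))])
```

On current interpreters the module-level version looks names up by name (LOAD_NAME / STORE_NAME), while the function version reads and writes numbered local slots (the LOAD_FAST / STORE_FAST family). Relabelling module-level bytecode with an argument count and extra varnames does not change which kind of lookup the bytecode performs.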

So... Any thoughts on directions I might pursue? I'm specifically not
averse to reading the interpreter source, but I've held off so far
because my brain is quite foggy, and I no longer have the stamina I once
had for these things :) However, if someone says "go read foo.c in the
interpreter source, your answers lie there", I'll do so without
complaint :)

Thanks in advance for any help, wisdom, etc., etc.

-- 
Tripp Lilley * tripp at perspex.com *
http://stargate.eheart.sg505.net/~tlilley/
-----------------------------------------------------------------------------
"This whole textual substitution thing is pissing me off.
 I feel like I'm programming in Tcl."

- Eric Frias, former roommate, hacking partner extraordinaire
