C++ code generation

Dan Goodman dg.gmane at thesamovar.net
Tue Mar 16 21:00:11 EDT 2010


Hi all,

I'm doing some C++ code generation using Python, and would be interested 
in any comments on the approach I'm taking.

Basically, the problem involves doing some nested loops and executing 
relatively simple arithmetic code snippets, like:

for i in xrange(len(X)):
   X[i] += 5

Actually they're considerably more complicated than this, but this gives 
the basic idea. One way to get C++ code from this would be to use 
Cython, but there are two problems with doing that. The first problem is 
that the arithmetic code snippets are user-specified. What I want to do 
is generate code, and then compile and run it using Scipy's weave 
package. The second problem is that I have various different data 
structures and the C++ code generated needs to be different for the 
different structures (e.g. sparse or dense matrices).

So far what I've been doing is writing Python code that writes the C++ 
code, but in a very non-transparent way. I like the idea of specifying 
the C++ code using Python syntax, like in Cython. So the idea I came up 
with was basically to abuse generators and iterators so that when you 
write something like:

for x in X:
    ...

it actually outputs some C++ code that looks like:

for(int i=0; i<X_len; i++){
double &x = X[i];
...
}

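The ... in the Python code is executed only once, because iterating over 
X yields a single value: a symbol standing for the C++ loop variable. The 
loop header is written out before that value is yielded, and the closing 
brace afterwards.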
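
To make the single-pass behaviour concrete, here is a minimal sketch, 
separate from the complete code further down. The names emit_loop and out 
are made up for illustration, and the snippet is written so that it also 
runs on Python 3.

```python
# Hypothetical illustration: a generator that yields exactly once.
out = []  # collects the emitted text, in order

def emit_loop(name):
    out.append('for(...){')  # runs when the for loop asks for the first item
    yield name               # the Python loop body runs once, with this value
    out.append('}')          # runs when the for loop asks for a second item

for x in emit_loop('x'):
    out.append('  body using ' + x)

print('\n'.join(out))
```

One thing to watch: if the Python loop body leaves early with break, the 
code after the yield never runs, so the closing brace would be missing 
from the output.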

Here's the example I've written so far (complete code given below):

# initialisation code
code = OutputCode()
evaluate = Evaluator(code)
X = Array(code, 'values')
# specification of the C++ loop
for x in X:
    evaluate('x += 5; x *= 2')
# and show the output
print code.code

It generates the following C++ code:

for(int values_index=0; values_index<values_len; values_index++){
double &values = values_array[values_index];
values += 5;
values *= 2;
}
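
Since the real problem involves nested loops, here is a condensed sketch 
of how the same trick might nest. It is not the complete code: the classes 
are cut down, the output is collected in a plain list called out, and 
evaluate takes the name-to-Symbol mapping as an explicit dict instead of 
reading the caller's frame. Those choices are assumptions made to keep 
the example short, and it is written to run on Python 3.

```python
import re

class Symbol(str):
    pass

class Array(object):
    def __init__(self, out, name, dtype='double'):
        self.out, self.name, self.dtype = out, name, dtype
    def __iter__(self):
        n = self.name
        # Emit the C++ loop header and the reference to the current element
        self.out.append(
            'for(int %s_index=0; %s_index<%s_len; %s_index++){' % (n, n, n, n))
        self.out.append('%s &%s = %s_array[%s_index];' % (self.dtype, n, n, n))
        yield Symbol(n)       # the Python loop body runs once here
        self.out.append('}')  # emitted when the Python loop finishes

def evaluate(out, snippet, names):
    # Assumption: names maps Python variable names to Symbols explicitly
    for var, sym in names.items():
        snippet = re.sub(r'\b%s\b' % re.escape(var), str(sym), snippet)
    for stmt in snippet.split(';'):
        if stmt.strip():
            out.append(stmt.strip() + ';')

out = []
for x in Array(out, 'a'):
    for y in Array(out, 'b'):
        evaluate(out, 'x += y', {'x': x, 'y': y})
print('\n'.join(out))
```

The inner header is emitted while the outer generator is suspended at its 
yield, so the two closing braces come out in the right order without any 
extra bookkeeping.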

OK, so that's an overview of the idea that I have of how to do it. Any 
comments or suggestions on either the approach or the implementation?

Below is the complete code I've written for the example above (linewraps 
aren't perfect but there's only a couple of lines to correct).

Thanks for any feedback,

Dan

import re, inspect

# We just use this class to identify certain variables
class Symbol(str): pass

# This class is basically just a mutable string
class OutputCode(object):
     def __init__(self):
         self.code = ''
     def __iadd__(self, code):
         self.code = self.code+code
         return self

# Iterating over instances of this class generates code
# for iterating over a C++ array, it yields a single
# Symbol object, the variable name of the value in the
# array
class Array(object):
     def __init__(self, code, name, dtype='double'):
         self.name = name
         self.dtype = dtype
         self.code = code
     def __iter__(self):
         def f():
             self.code += ('for(int {name}_index=0; {name}_index<{name}_len; '
                           '{name}_index++){{\n').format(name=self.name)
             self.code += ('{dtype} &{name} = {name}_array[{name}_index];\n'
                           ).format(dtype=self.dtype, name=self.name)
             yield Symbol(self.name)
             self.code += '}\n'
         return f()

# Instances of this class generate C++ code from Python syntax
# code snippets, replacing variable names that are a Symbol in the
# namespace with the value of that Symbol.
class Evaluator(object):
     def __init__(self, code):
         self.code = code
     def __call__(self, code):
         # The set of variables in the code snippet
         vars = re.findall(r'\b(\w+)\b', code)
         # Extract any names from the namespace of the calling frame
         frame = inspect.stack()[1][0]
         globals, locals = frame.f_globals, frame.f_locals
         values = {}
         for var in vars:
             if var in locals:
                 values[var] = locals[var]
             elif var in globals:
                 values[var] = globals[var]
         # Replace any variables whose values are Symbols with their values
         for var, value in values.iteritems():
             if isinstance(value, Symbol):
                 code = re.sub(r'\b{var}\b'.format(var=var),
                               str(value), code)
         # Turn Python snippets into C++ (just a simplified version for now)
         code = code.replace(';', '\n')
         # Skip empty statements (e.g. from a trailing ';')
         lines = [line.strip() for line in code.split('\n') if line.strip()]
         code = ''.join(line+';\n' for line in lines)
         self.code += code

if __name__=='__main__':
     code = OutputCode()
     evaluate = Evaluator(code)
     X = Array(code, 'values')
     for x in X:
         evaluate('x += 5; x *= 2')
     print code.code
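
One possible refinement of the substitution step in Evaluator, sketched 
under assumptions: replacing the variables one at a time means a name that 
has already been substituted could be rewritten a second time if a 
Symbol's value happens to match another variable name, depending on the 
order in which the variables are visited. A single pass with a replacement 
function avoids that. The helper name substitute is hypothetical, and the 
snippet is written to run on Python 3.

```python
import re

class Symbol(str):
    pass

def substitute(snippet, namespace):
    # Hypothetical helper: swap each whole word for its Symbol value,
    # looking every word up exactly once
    def swap(match):
        value = namespace.get(match.group(0))
        return str(value) if isinstance(value, Symbol) else match.group(0)
    return re.sub(r'\b\w+\b', swap, snippet)

# 'n' is an ordinary value, not a Symbol, so it is left alone
ns = {'x': Symbol('y'), 'y': Symbol('z'), 'n': 3}
result = substitute('x += y*n; xs = max(x, y)', ns)
print(result)  # y += z*n; xs = max(y, z)
```

Names that merely contain a variable name, such as xs or max here, are 
untouched because the pattern only matches whole words.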



