Extending/embedding versus separation

John Machin sjmachin at lexicon.net
Thu Mar 28 00:20:33 EST 2002


"skoria" <shomon at softhome.net> wrote in message news:<mailman.1017253842.6718.python-list at python.org>...
> Hi
> 
> I have a system for processing usage statistics, for which most of the
> hard work is done by a small C program 
[etc etc]
> 
> Now I want to create fancy graphs, 
[etc etc]
> So the plan was until today to write everything out
> to a text file and process these files with python to display in html
> or whatever other format comes to mind. 
> 
> But I've been reminded that python is good at integrating with C
> programs through extension and embedding. This tells me that the other
> way to go from here is to turn my c program into a python extension,
> or somehow merge the two languages, so that I miss the step of saving
> the C programs results to disk. (and potentially increase memory
> usage?). I will be having a look at "programming python" tonight, to
> see which, out of extending and embedding, is what I should go for.

My take on a project like yours would be to write *everything* (that's
not available off the shelf) in Python first, even parts that you
think you know for sure are going to have to be implemented in C
later. Those parts can be coded up in Python modules that can be
replaced by C extensions if really needed. The infrastructure stuff
like command-line-arg handling (or GUI input), file handling, etc is
so much easier to bolt together in Python than in C that I would
prefer extending Python (if necessary) to embedding Python in C.
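
A minimal sketch of that "Python first, C later if needed" layout might
look like the following. The extension name `_faststats` and the function
`count_hits` are made up for illustration; nothing in this thread names
them, and the "first field is the key" input format is an assumption too.

```python
# Pure-Python implementation first; a C extension can replace it later.
# "_faststats" is a hypothetical extension module name, not a real package.

def _count_hits_py(lines):
    """Tally how many times each first field (e.g. a user name) appears."""
    counts = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        key = fields[0]
        counts[key] = counts.get(key, 0) + 1
    return counts

try:
    # If a compiled drop-in exists, use it; it must offer the same interface.
    from _faststats import count_hits
except ImportError:
    # No extension built (the normal case for a prototype): stay in Python.
    count_hits = _count_hits_py

if __name__ == "__main__":
    sample = ["alice GET /index.html", "bob GET /a.png", "alice GET /b.png"]
    print(count_hits(sample))
```

The rest of the program only ever calls `count_hits`, so swapping in a C
version later would not touch the calling code.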

Hash tables in C? Are you using a package like Cdt, or did you write
your own? If you can process data into hash tables in C and then get
the data into Python faster than you can process the data into
dictionaries in Python, then please divulge your deep dark magic
spells to Tim Peters the Python-dict-shaman.
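
One way to check this rather than guess is to time a plain dictionary
build on data of about the right size. The sketch below assumes a modern
Python with `time.perf_counter`; the key format and the entry count are
arbitrary placeholders, not figures from the original poster's system.

```python
import time

def build_dict(n):
    """Build a dict with n string keys mapped to int values."""
    table = {}
    for i in range(n):
        table["key%d" % i] = i
    return table

def time_build(n):
    """Return (seconds taken, resulting dict) for one build of n entries."""
    start = time.perf_counter()
    table = build_dict(n)
    return time.perf_counter() - start, table

if __name__ == "__main__":
    # 1,000,000 is an arbitrary size; substitute a realistic record count.
    seconds, table = time_build(1000000)
    print("%d entries in %.3f s" % (len(table), seconds))
```

Comparing that figure with the time the C program needs to build its hash
table and hand the data over gives a concrete basis for the decision.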

You seem to be a bit concerned about memory usage. My advice is to
write a prototype of your application in Python, bearing memory
efficiency in mind, but not obsessively -- i.e. use the most
appropriate data structures and don't distort your Python code into
unreadability. Then you either have enough memory or you don't. Do
some back-of-the-envelope calculations, like: a 256MB stick of memory
costs how few hours of developer time? If you can't for whatever
reason get more memory, then it's time to consider your next step. If
your application has some large dictionaries that only have objects of
type X as keys and type Y as values (where Y is a simple type like int
or float) then it might be a good idea to take a copy of dictobject.c
and make a specialised intdict (say) module that, instead of managing
(PyObject *) pointers, manages int values directly -- this would save
you heaps (pardon the pun) of memory; see the recent thread in this
newsgroup about the amount of memory taken up by Python objects.
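
The saving comes from storing raw C values instead of a pointer to a full
Python object per value. The sketch below is not the intdict module
described above; it uses the standard `array` module as a stand-in to show
the same effect on a flat sequence of ints. It assumes CPython and a
Python recent enough to have `sys.getsizeof`.

```python
import sys
from array import array

n = 100000
as_list = list(range(n))         # one pointer per slot plus one int object each
as_array = array("i", range(n))  # raw C ints, no per-value Python object

def list_footprint(values):
    """Rough size of a list: the list itself plus every element object.

    CPython shares the objects for small ints, so this slightly
    overstates the true figure.
    """
    return sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

list_bytes = list_footprint(as_list)
array_bytes = sys.getsizeof(as_array)

if __name__ == "__main__":
    print("list of ints : %d bytes" % list_bytes)
    print("array of ints: %d bytes" % array_bytes)
```

The exact numbers depend on platform and Python version, so measure on the
target machine before deciding a custom C module is worth the effort.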

For another (already implemented) variation on this memory-saving
theme, google("c.l.py", "Machin intern memory").
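
The search terms suggest that thread is about string interning. Assuming
that is the technique meant (the thread itself is not quoted here), a
small sketch: when the same key string turns up in many records,
interning makes every record share one string object. In current Python
the function is `sys.intern`; in 2002 it was the builtin `intern`. The
"user page" record format is invented for the example.

```python
import sys

def load_records(lines):
    """Parse 'user page' lines, interning the repeated user names."""
    records = []
    for line in lines:
        user, page = line.split()
        # sys.intern returns the one shared copy of an equal string.
        records.append((sys.intern(user), page))
    return records

lines = ["alice /a", "bob /b", "alice /c"]
records = load_records(lines)

if __name__ == "__main__":
    # Both "alice" entries now refer to a single string object.
    print(records[0][0] is records[2][0])
```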

HTH,
John


