Parsing and Editing Source

Rafe rafesacks at gmail.com
Fri Aug 15 11:16:38 EDT 2008


On Aug 15, 9:21 pm, "Paul Wilson" <paulalexwil... at gmail.com> wrote:
> Hi all,
>
> I'd like to be able to do the following to a python source file
> programmatically:
>  * Read in a source file
>  * Add/Remove/Edit Classes, methods, functions
>  * Add/Remove/Edit Decorators
>  * List the Classes
>  * List the imported modules
>  * List the functions
>  * List methods of classes
>
> And then save out the result back to the original file (or elsewhere).
>
> I've begun by using the tokenize module to generate a token-tuple list
> and am building datastructures around it that enable the above
> methods. I'm finding that I'm getting a little caught up in the details
> and thought I'd step back and ask if there's a more elegant way to
> approach this, or if anyone knows a library that could assist.
>
> So far, I've got code that generates a line number to token-tuple list
> dictionary, and am working on a datastructure describing where the
> classes begin and end, indexed by their name, such that they can be
> later modified.
>
> Any thoughts?
> Thanks,
> Paul


I can't help much...yet, but I am also very interested in this, as I
will soon be starting a project that requires me to write code which
manipulates other code and writes it back to the same file or a new
one. I had planned on using the inspect module's getsource(),
getmodule() and getmembers() functions rather than doing any sort of
file reading. Have you tried any of these yet? Have you found any
insurmountable limitations?
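
For what it's worth, here is a minimal sketch of what I mean, written
for a current Python 3. The sample module, its contents, and the helper
names load_module() and summarize() are made up for illustration; only
the inspect and importlib calls are real. One limitation is already
visible: inspect only sees live objects, so comments and layout outside
of class and function bodies are not recovered.

```python
import importlib.util
import inspect
import os
import sys
import tempfile

# Hypothetical sample source, standing in for the file to be edited.
SAMPLE = '''\
import os
import json as js

class Person(object):
    name = "anonymous"

    def greet(self):
        return "hi " + self.name

def helper(x):
    return x * 2
'''


def load_module(path, name):
    """Import a source file as a module object."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    # Register the module so inspect can locate the file for its classes.
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module


def summarize(module):
    """List the names of modules, classes and functions in a namespace."""
    def names(predicate):
        return sorted(n for n, _ in inspect.getmembers(module, predicate))
    return {
        "modules": names(inspect.ismodule),
        "classes": names(inspect.isclass),
        "functions": names(inspect.isfunction),
    }


# Write the sample to a temporary file and import it from there.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "sample_mod.py")
with open(path, "w") as f:
    f.write(SAMPLE)

sample = load_module(path, "sample_mod")
summary = summarize(sample)
helper_source = inspect.getsource(sample.helper)
person_source = inspect.getsource(sample.Person)
```

Note that the "modules" entry holds the names as bound in the module
("js", not "json"), which is why the alias has to be handled separately
when writing import lines back out.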

It looks like everything needed is there. Some quick thoughts
regarding inspect.getmembers(module) results...
 * Module objects can be written out based on their attribute name and
__name__ values. If they are the same, just write "import %s" %
mod.__name__. If they differ, write "import %s as %s" %
(mod.__name__, name).

 * Skipping built-in stuff is easy, and everything else is either an
attribute name/value pair or an object of type 'function' or 'class',
both of which work with inspect.getsource(), I believe.

 * If the module used any from-import-* lines, it doesn't look like
there is any difference between items defined in the module and those
imported into the module's namespace. Writing this back directly
would 'flatten' the call into individual module imports and local
module attributes. Maybe reading the file just to test for this would
be the answer. You could then import the module and subtract items
which haven't changed. This is easy for attributes but harder for
functions and classes...right?
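
A rough sketch of the points above, again for a current Python 3. The
helper names import_lines() and defined_here() and the demo namespace
are hypothetical. The assumption I'm making for the star-import case is
that functions and classes carry a __module__ attribute naming where
they were defined, so they can be told apart from local definitions
without reading the file. Plain attributes (strings, numbers) carry no
such marker, so this does not solve that half of the problem.

```python
import inspect
import types


def import_lines(module):
    """Rebuild 'import x' / 'import x as y' lines from module attributes."""
    lines = []
    for name, value in inspect.getmembers(module, inspect.ismodule):
        if name.startswith("__"):
            continue  # skip interpreter-provided entries
        if name == value.__name__:
            lines.append("import %s" % value.__name__)
        else:
            # Real module name first, then the local alias.
            lines.append("import %s as %s" % (value.__name__, name))
    return lines


def defined_here(module):
    """Names of functions and classes defined in the module itself."""
    return sorted(
        name
        for name, value in inspect.getmembers(module)
        if (inspect.isclass(value) or inspect.isfunction(value))
        and value.__module__ == module.__name__
    )


# Hypothetical module body with a star import mixed in.
SOURCE = '''\
import os
import json as js
from os.path import *

class Person(object):
    name = "anonymous"

def helper(x):
    return x * 2
'''

demo = types.ModuleType("demo")
exec(SOURCE, demo.__dict__)

lines = import_lines(demo)
local_names = defined_here(demo)
```

Here join() from os.path ends up in the demo namespace, but its
__module__ is not "demo", so it is left out of local_names.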


Beyond this initial bit of code, I'm hoping to be able to write new
code where the new object only has the attributes which were changed.
So if I have an instance of a Person object whose name has been
changed from its default, I only want a new class which inherits from
the Person class and has an attribute 'name' with the new value.
Basically, I'd be using Python as a text-based storage format instead
of something like XML. Thoughts on this would be great for me if it
doesn't hijack the thread ;) I know there are quite a few who have
done this already.
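
To make that concrete, a minimal sketch of the idea. The Person class
and the helper names changed_attributes() and subclass_source() are
invented for the example, and it assumes every stored value has a
repr() that evaluates back to an equal value (true for strings and
numbers, not for arbitrary objects).

```python
class Person(object):
    name = "anonymous"
    age = 0


def changed_attributes(instance):
    """Instance attributes that differ from the class-level defaults."""
    cls = type(instance)
    return {
        key: value
        for key, value in vars(instance).items()
        if not hasattr(cls, key) or getattr(cls, key) != value
    }


def subclass_source(instance, new_name):
    """Source text for a subclass holding only the changed attributes."""
    cls = type(instance)
    changes = changed_attributes(instance)
    lines = ["class %s(%s):" % (new_name, cls.__name__)]
    if changes:
        for key in sorted(changes):
            # Relies on repr() round-tripping for the stored value.
            lines.append("    %s = %r" % (key, changes[key]))
    else:
        lines.append("    pass")
    return "\n".join(lines) + "\n"


bob = Person()
bob.name = "Bob"
source = subclass_source(bob, "Bob")

# Reading the stored text back only needs Person to be in scope.
namespace = {"Person": Person}
exec(source, namespace)
Bob = namespace["Bob"]
```

The generated text is just a two-line class statement, and anything not
overridden (age here) still comes from Person when it is loaded again.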


Cheers,

- Rafe






