An idiom for code generation with exec

eliben eliben at gmail.com
Fri Jun 20 15:44:52 EDT 2008


On Jun 20, 3:19 pm, George Sakkis <george.sak... at gmail.com> wrote:
> On Jun 20, 8:03 am, eliben <eli... at gmail.com> wrote:
>
> > On Jun 20, 9:17 am, Bruno Desthuilliers <bruno.42.desthuilli... at websiteburo.invalid> wrote:
> > > eliben wrote:
> > > > Hello,
>
> > > > In a Python program I'm writing I need to dynamically generate
> > > > functions[*]
>
> > > (snip)
>
> > > > [*] I know that each time a code generation question comes up people
> > > > suggest that there's a better way to achieve this, without using exec,
> > > > eval, etc.
>
> > > Just to make things clear: you do know that you can dynamically build
> > > functions without exec, don't you?
>
> > Yes, but the other options for doing so are significantly less
> > flexible than exec.
>
> > > > But in my case, for reasons too long to fully lay out, I
> > > > really need to generate non-trivial functions with a lot of hard-coded
> > > > actions for performance.
>
> > > Just out of curiosity: could you tell a bit more about your use case
> > > and what makes a simple closure not an option?
>
> > Okay.
>
> > I work in the field of embedded programming, and one of the main uses
> > I have for Python (and previously Perl) is writing GUIs for
> > controlling embedded systems. The communication protocols are usually
> > ad-hoc messages (header, footer, data, crc) built on top of serial
> > communication (RS232).
>
> > The packets that arrive have a known format. For example (YAMLish
> > syntax):
>
> > packet_length: 10
> > fields:
> >   - name: header
> >     offset: 0
> >     length: 1
> >   - name: time_tag
> >     offset: 1
> >     length: 1
> >     transform: val * 2048
> >     units: ms
> >   - name: counter
> >     offset: 2
> >     length: 4
> >     bytes-msb-first: true
> >   - name: bitmask
> >     offset: 6
> >     length: 1
> >     bit_from: 0
> >     bit_to: 5
> > ...
>
> > This shows only part of the capabilities. Fields have defined offsets
> > and lengths, can be just a few bits long, and can have defined
> > transformations and units for convenient display.
>
> > I have a program that should receive such packets from the serial port
> > and display their contents in tabular form. I want the user to be able
> > to specify the format of his packets in a file similar to above.
>
> > Now, in previous versions of this code, written in Perl, I found
> > that extracting field values from packets was very inefficient.
> > I rewrote it using a dynamically generated procedure for each field
> > that does hard-coded access to its data. For example:
>
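> > def get_counter(packet):
> >   # offset 2, length 4: assumes packet is a list of byte values
> >   data = packet[2:6]
> >   # byte order is swapped in place, as the field spec requires
> >   data.reverse()
> >   return data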
>
> > This gave me a huge speedup, because each field now had its specific
> > function sitting in a dict that quickly extracted the field's data
> > from a given packet.
>
> It's still not clear why the generic version is so much slower, unless
> you extract only a few selected fields, not all of them. Can you post a
> sample of how you used to write it without exec to clarify where the
> inefficiency comes from?
>
> George

The generic version has to make a lot of decisions at runtime, based
on the format specification: extract the offset from the spec, extract
the length. Is it msb-first? Then reverse. Are specific bits required?
If so, do bit operations. Should bits be reversed? And so on.
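
To make that concrete, here is a rough sketch of what such a generic
extractor might look like. It is not the original code: the function
name, the dict keys (taken from the YAMLish example quoted above) and
the bit numbering (bit 0 as the least significant bit, inclusive range,
single-byte fields only) are all assumptions for illustration.

```python
def extract_field(spec, packet):
    """Generic extractor: consults the spec dict on every single call."""
    offset = spec["offset"]          # dict lookup per call
    length = spec["length"]          # dict lookup per call
    data = list(packet[offset:offset + length])
    if spec.get("bytes-msb-first"):  # decision repeated per call
        data.reverse()
    if "bit_from" in spec:           # decision repeated per call
        # Assumption: inclusive bit range within a one-byte field, bit 0 = LSB.
        width = spec["bit_to"] - spec["bit_from"] + 1
        return (data[0] >> spec["bit_from"]) & ((1 << width) - 1)
    return data


# Hypothetical 10-byte packet and two field specs for illustration.
packet = [0xAA, 0x01, 0x10, 0x20, 0x30, 0x40, 0xFF, 0x00, 0x00, 0x00]
counter_spec = {"name": "counter", "offset": 2, "length": 4,
                "bytes-msb-first": True}
bitmask_spec = {"name": "bitmask", "offset": 6, "length": 1,
                "bit_from": 0, "bit_to": 5}

print(extract_field(counter_spec, packet))
print(extract_field(bitmask_spec, packet))
```

Every lookup and branch here is paid again for each field of each
packet, even though the answers never change for a given spec.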

A dynamically generated function doesn't have to make any decisions -
everything is hard-coded in it, because these decisions have already
been made when the function was generated. This can save a lot of dict
accesses and conditions, and results in a speedup.
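
As an illustration only, the generation step could be sketched roughly
like this. The helper name make_getter and the spec keys are
hypothetical, and it assumes the spec comes from a trusted file and
that each field name is a valid Python identifier, since the text is
passed straight to exec.

```python
def make_getter(spec):
    """Build the source of a field-specific getter and compile it once."""
    start = spec["offset"]
    end = start + spec["length"]
    lines = [
        "def get_%s(packet):" % spec["name"],
        "    data = list(packet[%d:%d])" % (start, end),
    ]
    # The decision is taken here, once; the generated code has no branch.
    if spec.get("bytes-msb-first"):
        lines.append("    data.reverse()")
    lines.append("    return data")
    source = "\n".join(lines)

    namespace = {}
    exec(source, namespace)  # assumes a trusted spec
    return namespace["get_" + spec["name"]], source


# Hypothetical spec and packet for illustration.
counter_spec = {"name": "counter", "offset": 2, "length": 4,
                "bytes-msb-first": True}
get_counter, counter_source = make_getter(counter_spec)

packet = [0xAA, 0x01, 0x10, 0x20, 0x30, 0x40, 0xFF, 0x00, 0x00, 0x00]
print(counter_source)
print(get_counter(packet))
```

The getters can then be stored in a dict keyed by field name, so the
per-packet work is one dict lookup plus a call to straight-line code.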

I guess this is not much different from Lisp macros - making decisions
at compile time instead of run time to improve performance.

Eli


