[pypy-issue] [issue1220] Improving readability of generated .c code
Dave Malcolm
tracker at bugs.pypy.org
Thu Jul 19 03:25:20 CEST 2012
New submission from Dave Malcolm <dmalcolm at redhat.com>:
I'm attaching a patch which I believe significantly improves the readability of
the C code that the translator emits.
Specifically, the patch:
* adds comments to the generated C showing the corresponding RPython source
code, *including* that of inlined functions.
* attempts to reduce the spaghetti-like gotos between the blocks of a function
by replacing a "goto" to a block that has no other predecessors with the block
itself. Doing so constructs a pleasing hierarchical structure that more closely
resembles human-written sources.
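
To illustrate the goto-folding idea, here is a minimal sketch on a toy model.
The data layout (a dict of block names to statement lists, with ("goto", name)
tuples) and the helper names are hypothetical and are not the translator's real
structures; the sketch only shows the principle of emitting a block in place
when the goto being replaced is its only predecessor.

```python
# Toy model (hypothetical, not the real translator data structures):
# blocks maps a block name to a list of statements; a statement is either
# a C source string or a ("goto", target_name) tuple.

def count_predecessors(blocks):
    """Count how many gotos target each block."""
    counts = {name: 0 for name in blocks}
    for stmts in blocks.values():
        for stmt in stmts:
            if isinstance(stmt, tuple) and stmt[0] == "goto":
                counts[stmt[1]] += 1
    return counts


def render(blocks, entry):
    """Render blocks as C-like text, nesting single-predecessor blocks."""
    counts = count_predecessors(blocks)
    lines = []
    emitted = set()

    def emit(name, indent):
        emitted.add(name)
        pad = "    " * indent
        for stmt in blocks[name]:
            if isinstance(stmt, tuple) and stmt[0] == "goto":
                target = stmt[1]
                if counts[target] == 1 and target not in emitted:
                    # This goto is the only way in: emit the block here.
                    lines.append(pad + "{ /* inlined %s */" % target)
                    emit(target, indent + 1)
                    lines.append(pad + "}")
                else:
                    lines.append(pad + "goto %s;" % target)
            else:
                lines.append(pad + stmt)

    # The entry block and every block that was not nested keep a label.
    lines.append("  %s:" % entry)
    emit(entry, 1)
    for name in blocks:
        if name not in emitted:
            lines.append("  %s:" % name)
            emit(name, 1)
    return "\n".join(lines)


example = {
    "start": ["x = 1;", ("goto", "b1")],
    "b1": ["y = x + 1;", ("goto", "done")],
    "other": [("goto", "done")],
    "done": ["return y;"],
}
output = render(example, "start")
```

In this example "b1" is nested inside "start" because nothing else jumps to it,
while "done" keeps its label because two blocks jump to it.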
This is a followup to an old mailing list post:
http://codespeak.net/pipermail/pypy-dev/2010q4/006532.html
which covered showing the RPython sources in the generated C.
I've carried a patch similar to the one given there in my Fedora/EPEL PyPy
rpms (see [1]), in which I tried to handle inlining by capturing the "source
code location" of an operation as a stack of actual source locations (one per
level of inlining).
[1] http://pkgs.fedoraproject.org/gitweb/?p=pypy.git;a=blob;f=more-readable-c-code.patch;h=92d340f69bdfb4837fce6c9cd7ab722bf9bdb9b9;hb=HEAD
However, it never worked well, and not all operations are actually associated
with source code, e.g.:
* "same_as" no-ops (see
pypy/translator/backendopt/constfold.py:constant_diffuse)
* gencapicall() within rtyper.py (e.g. generated "PyInt_AsLong()" call to get
at a boxed int)
* operations added by gctransform (e.g. reference counting, by
pypy.rpython.memory.gctransform.refcounting.RefcountingGCTransformer)
* operations added for handling exceptions
(pypy.translator.exceptiontransform.LLTypeExceptionTransformer)
The alternative approach I came up with is to add a new kind of operation,
OP_COMMENT/"comment": a no-op that is added when we build the graph and that
gets turned into a comment in the generated source. This copes with inlining
nicely. However, we need to be careful not to thwart optimizations: for
example, an earlier version of this patch defeated the switch-building
detection in merge_if_blocks due to the extra ops.
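
As a rough illustration of a "comment" no-op, the sketch below turns a list of
operations into C-like lines. The Op tuple, c_comment() and emit_c() are
hypothetical stand-ins, not the patch's actual code, and the OP_XXX(args,
result) output shape is only meant to resemble the generated C.

```python
from collections import namedtuple

# Hypothetical stand-in for an operation in a flow graph block.
Op = namedtuple("Op", ["opname", "args", "result"])


def c_comment(text):
    """Wrap text in a C comment, defusing any embedded terminator."""
    return "/* %s */" % text.replace("*/", "* /")


def emit_c(ops):
    """Render ops as C-like lines; "comment" ops emit no executable code."""
    lines = []
    for op in ops:
        if op.opname == "comment":
            lines.append(c_comment(op.args[0]))
        else:
            params = list(op.args) + [op.result]
            lines.append("OP_%s(%s);" % (op.opname.upper(), ", ".join(params)))
    return lines


ops = [
    Op("comment", ["foo.py:3: return a + b"], None),
    Op("int_add", ["a", "b"], "r"),
]
rendered = emit_c(ops)
```

Because the comment travels with the ops in the block, it is copied along with
them when a graph is inlined, which is why this approach copes with inlining.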
My initial aim was to add one each time the source line changes in the simple
case, but also to add them for other transformations as appropriate.
I tried a few different places in which to inject the comment ops:
* pypy.objspace.flow.flowcontext.BlockRecorder.bytecode_trace()
* pypy.objspace.flow.flowcontext.FlowExecutionContext.bytecode_trace()
* pypy.objspace.flow.objspace.FlowObjSpace.do_operation()
Doing it within one of the bytecode_trace() methods leads to very large numbers
of comments (one per bytecode, whereas most
bytecodes don't seem to directly generate SpaceOperations). I tried reducing
the number of comments by only emitting a
comment when the line number changes, but I couldn't find a good place to store
the current line: if I'm reading things
correctly the flow objspace creates large numbers of small blocks which then get
merged.
Adding them within do_operation() means that we only get one comment per
"actual" SpaceOperation, so I went with this
approach.
One consequence is that the Recorder and Replayer classes see non-equal
results; I fixed this by filtering out comment ops when comparing the ops seen
by Recorder and Replayer.
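
The comparison fix can be sketched as follows. Operations are modelled here as
plain tuples whose first element is the operation name; this representation and
the function names are assumptions for illustration, not the flow objspace's
real classes.

```python
def without_comments(ops):
    """Return the ops with all "comment" no-ops removed."""
    return [op for op in ops if op[0] != "comment"]


def same_ops(recorded, replayed):
    """Compare two op sequences while ignoring comment ops."""
    return without_comments(recorded) == without_comments(replayed)


recorded = [("comment", "foo.py:3"), ("int_add", "a", "b"), ("int_mul", "r", "c")]
replayed = [("int_add", "a", "b"), ("comment", "foo.py:4"), ("int_mul", "r", "c")]
```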
I added a new pass to simplify.py, to prune the comments per-block after the
flowgraph is built, at the point where
something resembling the final block structure has been reached - see the
comment in the new pass.
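
The exact pruning rule lives in the patch and is not reproduced here. As one
plausible example of a per-block pruning pass, the sketch below drops a comment
whose text repeats the most recent comment kept in the same block. The tuple
representation and the prune_comments() name are hypothetical.

```python
def prune_comments(ops):
    """Drop comments that repeat the last comment kept in this block."""
    pruned = []
    last_comment = None
    for op in ops:
        if op[0] == "comment":
            if op[1] == last_comment:
                continue  # same source line as before: redundant
            last_comment = op[1]
        pruned.append(op)
    return pruned


block_ops = [
    ("comment", "a.py:1"),
    ("int_add", "x", "y"),
    ("comment", "a.py:1"),
    ("int_mul", "x", "y"),
    ("comment", "a.py:2"),
]
pruned_ops = prune_comments(block_ops)
```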
Caveat: I haven't yet run the full test suite; I'm doing that now.
----------
files: more-readable-c-code-2012-07-18-001.patch
messages: 4598
nosy: dmalcolm, pypy-issue
priority: feature
status: unread
title: Improving readability of generated .c code
________________________________________
PyPy bug tracker <tracker at bugs.pypy.org>
<https://bugs.pypy.org/issue1220>
________________________________________