[issue6594] json C serializer performance tied to structure depth on some systems

Raymond Hettinger report at bugs.python.org
Wed Aug 5 21:35:45 CEST 2009


Raymond Hettinger <rhettinger at users.sourceforge.net> added the comment:

Are you sure that recursion depth is the issue?  Have you tried the same
number and kind of objects listed serially (unnested)?  This would help
rule out memory allocation issues and would instead confirm that it has
something to do with the C stack.
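
A minimal sketch of that nested-versus-flat comparison, assuming a
current Python 3 (time.perf_counter is newer than this report).  The
helper names make_nested, make_flat and time_dumps are hypothetical, and
the sample dicts are placeholders for the reporter's real data:

```python
import json
import time


def make_nested(depth):
    """Return lists nested `depth` levels deep, with one small dict per level."""
    root = node = []
    for i in range(depth):
        child = []
        node.append({"id": i, "name": "item"})
        node.append(child)
        node = child
    return root


def make_flat(count):
    """Return the same kind of dicts, listed serially in a single list."""
    return [{"id": i, "name": "item"} for i in range(count)]


def time_dumps(obj, repeat=20):
    """Average wall-clock seconds for one json.dumps() call."""
    start = time.perf_counter()
    for _ in range(repeat):
        json.dumps(obj)
    return (time.perf_counter() - start) / repeat


if __name__ == "__main__":
    # Depth is kept well below the default recursion limit.
    n = 200
    print("nested: %.6f s" % time_dumps(make_nested(n)))
    print("flat:   %.6f s" % time_dumps(make_flat(n)))
```

If the two timings are close, nesting depth alone is probably not the
explanation.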

It would be helpful if you uploaded your test data strings and timing
suite.  Are you able to run a C profile so we can tell where the hotspot
is?  Can you run PyYAML over the same data to see if it is similarly
afflicted (YAML is a superset of JSON)?

Also, try timing a repr() serialization of the same data,
x=repr(rootobj).  The repr code also uses recursion and it has to build
a big string in memory.  It has to visit every node, so it will reveal
whether memory cache misses are the culprit.  
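
One way to run that repr() comparison, as a rough sketch for a current
Python 3.  The rootobj built here is a hypothetical stand-in for the
real data, and best_time is an illustrative helper, not part of any
library:

```python
import json
import time


def best_time(func, arg, repeat=5):
    """Return the fastest of `repeat` wall-clock timings of func(arg)."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        func(arg)
        best = min(best, time.perf_counter() - start)
    return best


# Placeholder data; substitute the structure that shows the slowdown.
rootobj = {"children": [{"id": i, "tags": ["a", "b"]} for i in range(1000)]}

repr_time = best_time(repr, rootobj)
json_time = best_time(json.dumps, rootobj)
print("repr:       %.6f s" % repr_time)
print("json.dumps: %.6f s" % json_time)
```

If repr() slows down on the deep structure in the same way, the cause
is likely shared (memory behaviour) rather than specific to json.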

Try your timings with GC turned off so that we can rule that out.
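
A small sketch of timing with the collector disabled, assuming a
current Python 3.  The function name time_dumps_without_gc is
hypothetical; gc.disable(), gc.enable() and gc.isenabled() are the
standard gc module calls:

```python
import gc
import json
import time


def time_dumps_without_gc(obj):
    """Time one json.dumps() call with the cyclic GC switched off."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        start = time.perf_counter()
        json.dumps(obj)
        return time.perf_counter() - start
    finally:
        # Restore the collector to whatever state the caller had.
        if was_enabled:
            gc.enable()
```

Comparing this figure with an ordinary timing of the same object shows
how much of the cost, if any, comes from collections triggered during
serialization.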

Do you have an option to compile with an alternate memory allocator
(such as dlmalloc)?  A crummy memory allocator may be the issue since
serialization entails creating many small strings, then joining and
resizing them.

Also, try serializing to /dev/null so that we can exclude file I/O issues
(buffering and whatnot).
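
A sketch of that check, assuming a current Python 3.  os.devnull is the
portable spelling of /dev/null; time_dump_to_devnull is a hypothetical
helper name.  This version writes the result of json.dumps() so the
write is included in the timing; json.dump(obj, sink) is the streaming
alternative:

```python
import json
import os
import time


def time_dump_to_devnull(obj):
    """Time serializing obj and writing the result to the null device."""
    with open(os.devnull, "w") as sink:
        start = time.perf_counter()
        sink.write(json.dumps(obj))
        return time.perf_counter() - start
```

If writing to the null device is much faster than writing to the real
target, the slowdown is in the output path rather than in the
serializer.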

Sorry for all the requests, but there are many possible culprits and I
think it unlikely that recursion is the cause (much of the code in
Python works recursively -- everything from repr to gc -- so if that
were the problem, everything would run slower, not just json serialization).

----------
assignee: bob.ippolito -> rhettinger
nosy: +rhettinger

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue6594>
_______________________________________

