[issue6594] json C serializer performance tied to structure depth on some systems
Raymond Hettinger
report at bugs.python.org
Wed Aug 5 21:35:45 CEST 2009
Raymond Hettinger <rhettinger at users.sourceforge.net> added the comment:
Are you sure that recursion depth is the issue? Have you tried the same
number and kind of objects listed serially (unnested)? This would help
rule out memory allocation issues and would instead confirm that it has
something to do with the C stack.
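
Something along these lines would do for the nested-versus-flat comparison.
It is only a sketch written for a current Python 3; the helper names
(make_nested, make_flat, time_dumps) and the size n are made up for
illustration and are not taken from your test suite.

```python
import json
import time


def make_nested(n):
    """Return n lists chained inside one another: [1, [2, [... [n]]]]."""
    obj = [n]
    for i in range(n - 1, 0, -1):
        obj = [i, obj]
    return obj


def make_flat(n):
    """Return n small lists side by side: [[1], [2], ..., [n]]."""
    return [[i] for i in range(1, n + 1)]


def time_dumps(obj, repeat=5):
    """Best wall-clock time in seconds of json.dumps(obj) over `repeat` runs."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        json.dumps(obj)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    n = 200  # hypothetical size, kept well under the default recursion limit
    print("nested:", time_dumps(make_nested(n)))
    print("flat:  ", time_dumps(make_flat(n)))
```

If the nested case is far slower than the flat one for roughly the same
number of objects, depth is implicated; if both are slow, look elsewhere.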
It would be helpful if you uploaded your test data strings and timing
suite. Are you able to run a C profiler so we can tell where the hotspot
is? Can you run PyYAML over the same data to see if it is similarly
afflicted (YAML is a superset of JSON)?
Also, try timing a repr() serialization of the same data,
x=repr(rootobj). The repr code also uses recursion and it has to build
a big string in memory. It has to visit every node, so it will reveal
whether memory cache misses are the culprit.
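
A rough way to put the two side by side, again only a sketch for a current
Python 3 (compare_repr_and_json is a made-up name, and rootobj stands for
whatever structure you are serializing):

```python
import json
import timeit


def compare_repr_and_json(rootobj, number=5):
    """Return average seconds per call for repr() and json.dumps()."""
    # Take the best of three runs to reduce noise from other processes.
    repr_time = min(timeit.repeat(lambda: repr(rootobj),
                                  number=number, repeat=3))
    json_time = min(timeit.repeat(lambda: json.dumps(rootobj),
                                  number=number, repeat=3))
    return {"repr": repr_time / number, "json": json_time / number}


if __name__ == "__main__":
    # Stand-in data; substitute the real rootobj here.
    sample = {"items": [[i, str(i)] for i in range(1000)]}
    print(compare_repr_and_json(sample))
```

If repr() slows down on the deep data in the same way, the problem is
probably not specific to the json module.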
Try your timings with GC turned off so that we can rule that out.
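
For example (a sketch only; time_dumps_gc is a made-up helper, and the
sample data is a stand-in for your own):

```python
import gc
import json
import time


def time_dumps_gc(rootobj, gc_enabled):
    """Time one json.dumps(rootobj) call with the cyclic GC on or off."""
    was_enabled = gc.isenabled()
    if gc_enabled:
        gc.enable()
    else:
        gc.disable()
    try:
        start = time.perf_counter()
        json.dumps(rootobj)
        return time.perf_counter() - start
    finally:
        # Put the collector back the way we found it.
        if was_enabled:
            gc.enable()
        else:
            gc.disable()


if __name__ == "__main__":
    sample = [[i, {"k": i}] for i in range(1000)]  # stand-in data
    print("gc on: ", time_dumps_gc(sample, True))
    print("gc off:", time_dumps_gc(sample, False))
```

Note that the timeit module already disables GC while it is timing, so
numbers taken with timeit will not show a GC effect either way.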
Do you have some option to compile with an alternate memory allocator
(such as dlmalloc)? A crummy memory allocator may be the issue since
serialization entails creating many small strings, then joining and
resizing them.
Also, try serializing to /dev/null so that we can exclude file I/O issues
(buffering and whatnot).
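
One way to do that, sketched for a current Python 3, is to time the encoding
and the write separately. The helper name time_encode_and_write is made up;
os.devnull is used so the same script also runs where /dev/null does not
exist.

```python
import json
import os
import time


def time_encode_and_write(rootobj):
    """Return (encode_seconds, write_seconds) for one serialization."""
    start = time.perf_counter()
    text = json.dumps(rootobj)          # encoding only, result kept in memory
    encode_seconds = time.perf_counter() - start

    with open(os.devnull, "w") as sink:
        start = time.perf_counter()
        sink.write(text)                # output discarded by the OS
        sink.flush()
        write_seconds = time.perf_counter() - start
    return encode_seconds, write_seconds


if __name__ == "__main__":
    sample = {"items": [[i, str(i)] for i in range(1000)]}  # stand-in data
    print(time_encode_and_write(sample))
```

If nearly all of the time is in the first number, file I/O is not the
culprit.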
Sorry for all the requests, but there are many possible culprits and I
think it unlikely that recursion is the cause (much of the code in
Python works recursively -- everything from repr to gc -- so if that
were the problem, everything would run slower, not just json serialization).
----------
assignee: bob.ippolito -> rhettinger
nosy: +rhettinger
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue6594>
_______________________________________