[Baypiggies] json using huge memory footprint and not releasing

David Lawrence david at bitcasa.com
Fri Jun 15 22:15:45 CEST 2012


When I load the file with json, Python's memory usage spikes to about 1.8GB
and I can't seem to get that memory released.  I put together a very simple
test case:

import json

with open("test_file.json", 'r') as f:
    j = json.load(f)

I'm sorry that I can't provide a sample json file; my test file has a lot
of sensitive information.  For context, the file is on the order of 240MB.
After running the above snippet I have the previously mentioned 1.8GB of
memory in use.  If I then do "del j", memory usage doesn't drop at all.  If
I follow that with a gc.collect() it still doesn't drop.  I even tried
unloading the json module and running another gc.collect().
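
In case it helps, here's roughly the full sequence I'm running, with the
memory reading taken from /proc (so Linux-only; the file name is just my
local test file):

import gc
import json

def rss_mb():
    # current resident set size, read from the Linux /proc interface
    with open("/proc/self/status") as status:
        for line in status:
            if line.startswith("VmRSS:"):
                return int(line.split()[1]) / 1024.0  # kB -> MB

print "before load:                %.0f MB" % rss_mb()
with open("test_file.json", 'r') as f:
    j = json.load(f)
print "after json.load:            %.0f MB" % rss_mb()

del j
gc.collect()
print "after del j + gc.collect(): %.0f MB" % rss_mb()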

I'm trying to run some memory profiling, but heapy has been churning at
100% CPU for about an hour now and has yet to produce any output.
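
For reference, this is roughly how I'm invoking heapy (via the guppy
package); the print at the end is the call that's been churning:

from guppy import hpy
import json

hp = hpy()
with open("test_file.json", 'r') as f:
    j = json.load(f)
print hp.heap()   # hasn't produced output yet on my 240MB file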

Does anyone have any ideas?  I've also tried the above using cjson rather
than the packaged json module.  cjson used about 30% less memory but
otherwise displayed exactly the same issues.

I'm running Python 2.7.2 on Ubuntu server 11.10.

I'm happy to load up any memory profiler and see if it does better than
heapy, and to provide any diagnostics you might think are necessary.  I'm
hunting around for a large test json file that I can share so anyone else
can give it a go.
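
In the meantime, something along these lines might do as a stand-in; the
record structure here is invented for illustration, so it may or may not
reproduce the same blow-up:

import json
import random
import string

def rand_str(n):
    return ''.join(random.choice(string.ascii_letters) for _ in xrange(n))

# write a roughly 240MB JSON list of small dicts
target = 240 * 1024 * 1024
with open("big_test.json", 'w') as out:
    out.write('[')
    written = 1
    first = True
    while written < target:
        record = {"id": random.randint(0, 10 ** 12),
                  "name": rand_str(16),
                  "path": rand_str(64),
                  "tags": [rand_str(8) for _ in xrange(4)]}
        chunk = ('' if first else ',') + json.dumps(record)
        out.write(chunk)
        written += len(chunk)
        first = False
    out.write(']')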