[Python-Dev] PEP 393 memory savings update

Wed Sep 28 01:17:15 CEST 2011

Great news, Martin!

On Tue, Sep 27, 2011 at 3:56 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> I have redone my memory benchmark, and added a few new
> counters.
>
> The application is a very small Django application. The same
> source code of the app and Django itself is used on all Python
> versions. The full list of results is at
>
> http://www.dcl.hpi.uni-potsdam.de/home/loewis/djmemprof/
>
> Here are some excerpts:
>
> A. 32-bit builds, storage for Unicode objects
> 3.x, 32-bit wchar_t: 6378540
> 3.x, 16-bit wchar_t: 3694694
> PEP 393:             2216807
>
> Compared to the previous results, there are now some
> significant savings even compared to a narrow unicode build.
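
To put numbers on "significant", the three totals in excerpt A can be turned into relative savings. This is only a quick sketch; the variable names and the saving() helper are made up here and are not part of the benchmark:

```python
# Storage for Unicode objects on 32-bit builds, in bytes (figures from excerpt A).
wide = 6378540     # 3.x, 32-bit wchar_t
narrow = 3694694   # 3.x, 16-bit wchar_t
pep393 = 2216807   # PEP 393

def saving(before, after):
    """Fraction of memory saved when going from `before` to `after`."""
    return (before - after) / before

print(f"vs. wide build:   {saving(wide, pep393):.1%}")
print(f"vs. narrow build: {saving(narrow, pep393):.1%}")
```

That works out to roughly 65% less than the wide build and roughly 40% less than the narrow build.
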
>
> B. 3.x, number of strings by maxchar:
> ASCII:   35713 (1,300,000 chars)
> Latin-1: 235   (11,000 chars)
> BMP:     260   (700 chars)
> other:   0
> total:   36,000 (1,310,000 chars)
>
> This explains why the savings for shortening ASCII objects
> are significant in this application. I have no good intuition
> how this effect would show for "real" applications. It may be
> that the percentage of ASCII strings (in number and chars) grows
> proportionally with the total number of strings; it may also
> be that most of these strings are a certain fixed overhead
> (resulting from Python identifiers and other interned strings).
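
For readers who want to try the breakdown in B on their own data, here is a rough sketch of bucketing strings by maxchar. The function names, the WIDTH table and the sample strings are illustrative assumptions, not the code used for the benchmark; the per-character widths (1, 2 or 4 bytes) follow PEP 393, and per-object header overhead is deliberately ignored:

```python
def kind(s):
    """Classify a string by its largest code point (its 'maxchar')."""
    maxchar = max(map(ord, s), default=0)
    if maxchar < 0x80:
        return "ASCII"
    if maxchar < 0x100:
        return "Latin-1"
    if maxchar < 0x10000:
        return "BMP"
    return "other"

# Bytes per character for each kind under PEP 393 (object header not counted).
WIDTH = {"ASCII": 1, "Latin-1": 1, "BMP": 2, "other": 4}

def payload_bytes(s):
    """Approximate character storage for s, excluding the object header."""
    return len(s) * WIDTH[kind(s)]

# Hypothetical samples: an identifier, a Latin-1 name, a BMP string, an astral char.
samples = ["def", "L\xf6wis", "\u03c0\u22483.14", "\U0001F40D"]
for s in samples:
    print(kind(s), len(s), payload_bytes(s))
```

With nearly all strings in the ASCII bucket, almost every character costs one byte instead of two or four, which is consistent with the savings reported in A.
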
>
> C. String-ish objects in 2.7 and 3.3-trunk:
>                   2.x         3.x
> #unicode           370      36,000
> #bytes          43,000      14,000
> #total          43,400      50,000
>
> len(unicode)     5,300   1,306,000
> len(bytes)   2,040,000     860,000
> len(total)   2,046,000   2,200,000
>
> (Note: the computations in the results are slightly messed up:
> the number of bytes for bytes objects is actually the sum
> of the lengths, not the sum of the sizeofs; this gets added
> in the "total" lines to the sum of sizeofs of unicode strings,
> which is nonsensical. The table above corrects this.)
>
> As you can see, Python 3 creates more string objects in total.
>
> D. Memory consumption for 2.x, 3.x, PEP 393, accounting for both
>   unicode and bytes objects, using 32-bit builds and 32-bit
>   wchar_t:
> 2.x:     3,620,000 bytes
> 3.x:     7,750,000 bytes
> PEP 393: 3,340,000 bytes
>
> This suggests that PEP 393 actually reduces memory consumption
> below what 2.7 uses. This is offset, though, by "other" (non-string)
> objects, which take 300KB more in 3.x.
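
A back-of-the-envelope check of that offset, using the rounded figures from D. Two assumptions are baked in: "300KB" is read as 300,000 bytes, and the extra non-string overhead is taken to apply unchanged to the PEP 393 build. The variable names are made up for this sketch:

```python
# String memory (unicode + bytes), 32-bit builds, in bytes (figures from excerpt D).
py2 = 3_620_000
pep393 = 3_340_000

# Extra memory for "other" (non-string) objects in 3.x.
# Assumption: "300KB" means 300 * 1000 bytes.
other_extra = 300 * 1000

string_saving = py2 - pep393        # what PEP 393 saves on strings relative to 2.x
net = other_extra - string_saving   # positive means 3.x with PEP 393 still uses more

print(f"string saving vs. 2.x: {string_saving:,} bytes")
print(f"net difference:        {net:+,} bytes")
```

On these rounded numbers the two effects nearly cancel: the string saving is 280,000 bytes against about 300,000 bytes of extra non-string overhead.
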
>
> Regards,
> Martin

-- 
--Guido van Rossum (python.org/~guido)