unicode memory usage

"Martin v. Löwis" martin at v.loewis.de
Thu Sep 4 15:44:13 EDT 2003


Gary Robinson wrote:

> But I don't know whether that's actually how Python strings work internally.

Python Unicode objects normally use 2 bytes per character, unless Python 
is built in UCS-4 mode, in which case they use 4 bytes per character.
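
As a rough illustration of those two sizes, the sketch below uses the
UTF-16 and UTF-32 codecs (little-endian, no byte-order mark) as a
stand-in for the 2-byte and 4-byte layouts. That stand-in is an
assumption made for the example, not a statement about the interpreter's
internals: it ignores per-object overhead, it only matches the 2-byte
case for characters below U+10000, and later CPython releases changed
the internal representation. The name char_storage_bytes is hypothetical.

```python
def char_storage_bytes(text, bytes_per_char):
    """Estimate raw character storage, ignoring per-object overhead.

    Assumption: a codec without a BOM approximates the internal layout
    (UTF-16-LE for 2-byte builds, UTF-32-LE for UCS-4 builds).
    """
    codecs_by_width = {2: "utf-16-le", 4: "utf-32-le"}
    if bytes_per_char not in codecs_by_width:
        raise ValueError("bytes_per_char must be 2 or 4")
    return len(text.encode(codecs_by_width[bytes_per_char]))


sample = u"L\u00f6wis"                    # 5 characters, one non-ASCII
narrow = char_storage_bytes(sample, 2)    # 2 bytes per character
wide = char_storage_bytes(sample, 4)      # 4 bytes per character
print(len(sample), narrow, wide)
```

On an actual interpreter, sys.maxunicode reports the largest code point
a single character can hold, which is one way to see which mode a given
build uses.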

> So, my question: Do unicode strings in Python take substantially more memory
> than classic python strings or not, assuming the strings are generally 99%
> ASCII characters (but not 100%)?

Yes; you can expect that, for 99% of the characters, the extra storage 
is null bytes: one of the two bytes per character, or three of the four 
in UCS-4 mode. Whether this is substantial depends on the total amount 
of storage that you need for string objects, compared to the storage 
needed for other things, or the storage available.
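
The share of null bytes can be checked roughly with a short sketch. It
assumes, purely for illustration, that the UTF-16-LE and UTF-32-LE
codecs approximate the 2-byte and 4-byte layouts; the function name
null_byte_fraction and the sample text are hypothetical.

```python
def null_byte_fraction(text, encoding):
    """Return the fraction of encoded bytes that are zero."""
    data = text.encode(encoding)
    if not data:
        return 0.0
    return data.count(b"\x00") / float(len(data))


# 100 characters: 99 ASCII letters plus one euro sign (U+20AC),
# chosen because neither of its two low bytes is zero.
mostly_ascii = u"a" * 99 + u"\u20ac"

frac_2_byte = null_byte_fraction(mostly_ascii, "utf-16-le")
frac_4_byte = null_byte_fraction(mostly_ascii, "utf-32-le")
print(frac_2_byte, frac_4_byte)
```

For text like this, close to half of the character storage is null
bytes in the 2-byte case and close to three quarters in the 4-byte case.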

Regards,
Martin




