Flexible string representation, unicode, typography, ...

Neil Hodgson nhodgson at iinet.net.au
Thu Aug 23 09:57:50 EDT 2012


wxjmfauth at gmail.com:

> Small illustration. Take an a4 page containing 50 lines of 80 ascii
> characters, add a single 'EM DASH' or a 'BULLET' (code points > 0x2000),
> and you will see all the optimization efforts destroyed.
>
> >>> sys.getsizeof('a' * 80 * 50)
> 4025
> >>> sys.getsizeof('a' * 80 * 50 + '•')
> 8040

    Even this example still benefits: the string takes half as many bytes
as it would at 32 bits per character, as was the case with Python 3.2:

 >>> sys.getsizeof('a' * 80 * 50)
16032
 >>> sys.getsizeof('a' * 80 * 50 + '•')
16036
 >>>
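
    The figures in both sessions are consistent with a small fixed header
plus roughly 1, 2 or 4 bytes per character. A minimal sketch for checking
this on your own interpreter follows. The helper name bytes_per_char is
made up for illustration, and the results noted in the comments assume
CPython 3.3 or later; other versions and implementations may differ.

```python
import sys

def bytes_per_char(text: str) -> float:
    """Estimate storage per character for a non-empty string.

    Doubling the string and subtracting cancels the fixed object
    header, so only the growth of the character data remains.
    """
    doubled = text + text
    return (sys.getsizeof(doubled) - sys.getsizeof(text)) / len(text)

page = 'a' * 80 * 50  # 50 lines of 80 ASCII characters

# Assumption: CPython 3.3+ picks the narrowest width that fits the
# largest code point in the whole string.
print(bytes_per_char(page))                 # 1.0, all code points < 0x80
print(bytes_per_char(page + '\u2022'))      # 2.0, BULLET needs 16 bits
print(bytes_per_char(page + '\U0001F600'))  # 4.0, beyond the BMP
```

    With a single BULLET appended, the whole string moves to 2 bytes per
character, which is still half of the 4 bytes per character shown in the
Python 3.2 session above.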

    Neil


