RE Module Performance
Chris Angelico
rosuav at gmail.com
Thu Jul 25 15:18:44 EDT 2013
On Fri, Jul 26, 2013 at 5:07 AM, <wxjmfauth at gmail.com> wrote:
> Let's start with a simple string: \textemdash or \textendash
>
> >>> sys.getsizeof('–')
> 40
> >>> sys.getsizeof('a')
> 26
Most of the cost is in those two apostrophes. Look:
>>> sys.getsizeof('a')
26
>>> sys.getsizeof(a)
8
Okay, that's slightly unfair (bonus points: figure out what I did to
make this work; there are at least two right answers) but still, look
at what an empty string costs:
>>> sys.getsizeof('')
25
Or look at the difference between one of these characters and two:
>>> sys.getsizeof('aa')-sys.getsizeof('a')
1
>>> sys.getsizeof('––')-sys.getsizeof('–')
2
That's what the characters really cost. The overhead is fixed. It is,
in fact, almost completely insignificant. The storage requirement for
a BMP-only string with characters beyond Latin-1 converges to two bytes
per character.
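
To check this on your own build, here is a minimal sketch, assuming
CPython 3.3 or later (the flexible string representation). The helper
names marginal_cost and bytes_per_char are made up for illustration.
Absolute sizes differ between 32-bit and 64-bit builds, but the
per-character increment should not.

```python
import sys

def marginal_cost(ch, n=1000):
    """Bytes added by appending one more copy of ch to a string of n copies."""
    return sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch * n)

def bytes_per_char(ch, n):
    """Total size divided by length, with the fixed overhead included."""
    return sys.getsizeof(ch * n) / n

if __name__ == "__main__":
    # 'a' is ASCII; '\u2013' is the en dash from the quoted example (BMP).
    for ch in ("a", "\u2013"):
        print(ascii(ch), "marginal cost:", marginal_cost(ch))
        for n in (1, 100, 10000):
            # The ratio approaches the marginal cost as n grows, because
            # the fixed overhead is spread over more characters.
            print("  n =", n, "bytes/char =", round(bytes_per_char(ch, n), 3))
```

The strings are built at runtime with ch * n rather than written as
literals, so the numbers reflect only the character storage and the
fixed header.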
ChrisA