RE Module Performance

Prasad, Ramit ramit.prasad at jpmorgan.com
Thu Jul 25 15:30:51 EDT 2013


Chris Angelico wrote:
> On Fri, Jul 26, 2013 at 5:07 AM,  <wxjmfauth at gmail.com> wrote:
> > Let's start with a simple string, \textemdash or \textendash
> >
> >>>> sys.getsizeof('–')
> > 40
> >>>> sys.getsizeof('a')
> > 26
> 
> Most of the cost is in those two apostrophes, look:
> 
> >>> sys.getsizeof('a')
> 26
> >>> sys.getsizeof(a)
> 8
> 
> Okay, that's slightly unfair (bonus points: figure out what I did to
> make this work; there are at least two right answers) but still, look
> at what an empty string costs:

I like bonus points. :)
>>> a = None
>>> sys.getsizeof(a)
8

I'm not sure what the other right answer is; booleans take 12 bytes (on 2.6), so they don't qualify.
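
One guess at the second answer, offered tentatively: a bare object() instance carries nothing beyond the basic object header, so on CPython it should report the same size as None. The helper name header_only_sizes below is made up for illustration, and the absolute numbers depend on the build (8 on a 32-bit build like the one quoted, larger on a 64-bit one).

```python
import sys


def header_only_sizes():
    """Return getsizeof() for None and for a bare object().

    Hypothetical helper, not from the thread. Both values should be
    just the object header, so they are expected to match on CPython.
    """
    return sys.getsizeof(None), sys.getsizeof(object())


none_size, object_size = header_only_sizes()
print(none_size, object_size)
```

If the two numbers match, then `a = object()` would reproduce the quoted session just as well as `a = None` does. Whether that is what Chris actually did is only a guess.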

> 
> >>> sys.getsizeof('')
> 25
> 
> Or look at the difference between one of these characters and two:
> 
> >>> sys.getsizeof('aa')-sys.getsizeof('a')
> 1
> >>> sys.getsizeof('––')-sys.getsizeof('–')
> 2
> 
> That's what the characters really cost. The overhead is fixed. It is,
> in fact, almost completely insignificant. The storage requirement for
> a non-ASCII, BMP-only string converges to two bytes per character.
> 
> ChrisA
> --
> http://mail.python.org/mailman/listinfo/python-list
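
The "fixed overhead, constant cost per character" point can be checked by differencing the sizes of two long strings, which cancels the header out. This is only a sketch, assuming CPython 3.3 or later (the flexible string representation); marginal_cost is a name made up for this example, and the third test character (outside the BMP) is an addition that does not appear in the thread.

```python
import sys


def marginal_cost(ch, n=1000):
    """Bytes added per extra character (hypothetical helper).

    Compares a string of 2*n copies of ch with one of n copies, so the
    fixed per-object overhead cancels and only the per-character
    storage remains.
    """
    return (sys.getsizeof(ch * 2 * n) - sys.getsizeof(ch * n)) / n


ascii_cost = marginal_cost('a')            # plain ASCII
bmp_cost = marginal_cost('\u2013')         # en dash: non-ASCII, inside the BMP
astral_cost = marginal_cost('\U0001F600')  # outside the BMP

print(ascii_cost, bmp_cost, astral_cost)
```

Under that assumption the three results should come out as 1, 2 and 4 bytes per character, matching the one-byte and two-byte differences shown in the session above.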


Ramit


