Py 3.3, unicode / upper()

wxjmfauth at gmail.com wxjmfauth at gmail.com
Thu Dec 20 14:40:21 EST 2012


On Wednesday, 19 December 2012 22:31:42 UTC+1, Ian wrote:
> On Wed, Dec 19, 2012 at 2:18 PM,  <wxjmfauth at gmail.com> wrote:
> > latin-1 (iso-8859-1) ? are you sure ?
>
> Yes.
>
> > >>> sys.getsizeof('a')
> > 26
> > >>> sys.getsizeof('ab')
> > 27
> > >>> sys.getsizeof('aé')
> > 39
>
> Compare to:
>
> >>> sys.getsizeof('a\u0100')
> 42
>
> The reason for the difference you posted is that pure ASCII strings
> have a further optimization, which I glossed over and which is purely
> a savings in overhead:
>
> >>> sys.getsizeof('abcde') - sys.getsizeof('a')
> 4
> >>> sys.getsizeof('ábçdê') - sys.getsizeof('á')
> 4
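
The per-character cost in the figures quoted above can be separated from the fixed object overhead by measuring only how much a string grows as characters are added. A minimal sketch, assuming CPython 3.3 or later (the flexible string representation of PEP 393); `bytes_per_char` is a hypothetical helper name, and the absolute values returned by `sys.getsizeof` differ between versions and platforms:

```python
import sys

def bytes_per_char(ch, n=10):
    """Estimate the storage per character for strings made only of `ch`.

    Two freshly built strings of length n and 2*n are compared, so the
    fixed object header cancels out and only character data remains.
    """
    return (sys.getsizeof(ch * 2 * n) - sys.getsizeof(ch * n)) // n

for label, ch in [("ASCII", "a"), ("Latin-1", "\xe9"),
                  ("BMP", "\u0100"), ("non-BMP", "\U0001F600")]:
    # On CPython 3.3+ this prints 1, 1, 2 and 4 respectively.
    print(label, bytes_per_char(ch))
```

Under these assumptions, ASCII and Latin-1 text both cost one byte per character; only the fixed header differs between them, and the width rises to 2 or 4 bytes once a string contains a code point above U+00FF or U+FFFF.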

-----

I know all of this, and it is exactly what I explained.
I do not care about this optimization. I'm not an ASCII user.
For a non-ASCII user, this optimization is simply irrelevant.

What should a Python user think when he sees his strings
consuming more memory just because he uses non-ASCII
characters, or sees his strings changing just because
he "uppercases" them?
Unicode is here to serve everybody.
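
One way a string can change under uppercasing, beyond the case of its letters, is through Unicode's full case mappings, which Python 3.3 and later apply in `str.upper()`. The sketch below is only an illustration under that assumption; `upper_report` is a hypothetical helper, and whether these are the cases meant above is itself an assumption:

```python
def upper_report(s):
    """Return (uppercased, old length, new length, round-trips?) for `s`."""
    up = s.upper()
    return up, len(s), len(up), up.lower() == s

# U+00DF (sharp s) has no single-character uppercase in the full mappings,
# and U+FB01 (the "fi" ligature) expands to two letters.
for s in ["stra\xdfe", "\ufb01n", "abc"]:
    print(ascii(s), upper_report(s))
```

With Python 3.3+, the first two strings grow by one character when uppercased and do not come back unchanged from `lower()`, while the pure ASCII string round-trips.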

jmf


