Micro Python -- a lean and efficient implementation of Python 3

wxjmfauth at gmail.com wxjmfauth at gmail.com
Tue Jun 10 15:27:26 EDT 2014


On Saturday, June 7, 2014 at 04:20:22 UTC+2, Tim Chase wrote:
> On 2014-06-06 09:59, Travis Griggs wrote:
> > On Jun 4, 2014, at 4:01 AM, Tim Chase wrote:
> > > If you use UTF-8 for everything
> >
> > It seems to me, that increasingly other libraries (C, etc), use
> > utf8 as the preferred string interchange format.
>
> I definitely advocate UTF-8 for any streaming scenario, as you're
> iterating unidirectionally over the data anyways, so why use/transmit
> more bytes than needed.  The only failing of UTF-8 that I've found in
> the real world(*) is when you have the requirement of constant-time
> indexing into strings.
>
> -tkc

And once again, just an illustration,

>>> import sys, timeit
>>> timeit.repeat("(x*1000 + y)", setup="x = 'abc'; y = 'z'")
[0.9457552436453511, 0.9190932610143818, 0.9322044912393039]
>>> timeit.repeat("(x*1000 + y)", setup="x = 'abc'; y = '\u0fce'")
[2.5541921791045183, 2.52434366066052, 2.5337417948967413]
>>> timeit.repeat("(x*1000 + y)", setup="x = 'abc'.encode('utf-8'); y = 'z'.encode('utf-8')")
[0.9168235779232532, 0.8989583403075017, 0.8964204541650247]
>>> timeit.repeat("(x*1000 + y)", setup="x = 'abc'.encode('utf-8'); y = '\u0fce'.encode('utf-8')")
[0.9320969737165115, 0.9086006535332558, 0.9051715140790861]
>>> 
>>> 
>>> sys.getsizeof('abc'*1000 + '\u0fce')
6040
>>> sys.getsizeof(('abc'*1000 + '\u0fce').encode('utf-8'))
3020
>>>
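
For anyone who wants to rerun the comparison above as a script, here is a minimal sketch. The helper names (build, time_concat, report) and the reduced default iteration count are my own assumptions, not part of the original session, and the absolute numbers will differ by machine and Python version.

```python
import sys
import timeit

ASCII_TAIL = "z"
NON_ASCII_TAIL = "\u0fce"  # same character as in the session above


def build(tail):
    """Return the str used in the timings: 'abc' * 1000 plus one tail character."""
    return "abc" * 1000 + tail


def time_concat(x, y, number=100000, repeat=3):
    """Time the expression (x*1000 + y); works for str and bytes operands."""
    return timeit.repeat("(x*1000 + y)", globals={"x": x, "y": y},
                         number=number, repeat=repeat)


def report():
    """Print object sizes and best-of-three timings for both tail characters."""
    for label, tail in (("ascii", ASCII_TAIL), ("non-ascii", NON_ASCII_TAIL)):
        text = build(tail)
        data = text.encode("utf-8")
        print(label,
              "| str size:", sys.getsizeof(text),
              "| utf-8 size:", sys.getsizeof(data),
              "| str time:", min(time_concat("abc", tail)),
              "| bytes time:", min(time_concat(b"abc", tail.encode("utf-8"))))


if __name__ == "__main__":
    report()
```

The size difference is presumably down to the CPython 3.3+ string representation (PEP 393), which stores every character of a str at the width of its widest code point, so a single U+0FCE makes the whole string two bytes per character, while in UTF-8 only that one character grows to three bytes.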


But you know, that's not the problem.

When I see a core developer discussing benchmarking,
while the same application using non-ASCII chars becomes
1, 2, 5, 10, 20 times, if not more, slower compared to pure
ASCII, I wonder whether there is not a serious problem
somewhere.

(and it also becomes slower than Py3.2)

BTW, very easy to explain.

I do not understand why the "free, open, what-you-wish-here, ... "
software is so often pushing people toward the adoption of serious
corporate products.

jmf


