Performance of int/long in Python 3

Tue Apr 2 14:50:10 EDT 2013

On Apr 2, 11:22 pm, jmfauth <wxjmfa... at gmail.com> wrote:
> On 2 avr, 18:57, rusi <rustompm... at gmail.com> wrote:
>
> > On Apr 2, 8:17 pm, Ethan Furman <et... at stoneleaf.us> wrote:
>
> > > Simmons (too many Steves!), I know you're new so don't have all the history with jmf that many
> > > of us do, but consider that the original post was about numbers, had nothing to do with
> > > characters or unicode *in any way*, and yet jmf still felt the need to bring unicode up.
>
> > Just for reference, here is the starting para of Chris' original mail
> > that started this thread.
>
> > > The Python 3 merge of int and long has effectively penalized
> > > small-number arithmetic by removing an optimization. As we've seen
> > > from PEP 393 strings (jmf aside), there can be huge benefits from
> > > having a single type with multiple representations internally. Is
> > > there value in making the int type have a machine-word optimization in
> > > the same way?
>
> > i.e. it mentions numbers, strings, PEP 393 *AND jmf.* So while it is
> > true that jmf has been butting in with trollish behavior into
> > completely unrelated threads with his unicode rants, that cannot be
> > said for this thread.
>
> -----
>
> That's because you did not understand the analogy, int/long <-> FSR.
>
> Another illustration:
>
> >>> def AddOne(i):
> ...     if 0 < i <= 100:
> ...         return i + 10 + 10 + 10 - 10 - 10 - 10 + 1
> ...     elif 100 < i <= 1000:
> ...         return i + 100 + 100 + 100 + 100 - 100 - 100 - 100 - 100 + 1
> ...     else:
> ...         return i + 1
> ...
>
> Does it work? Yes.
> Is it "correct"? That can be discussed.
>
> Now replace i by a char, a representative of each "subset"
> of the FSR, select a method where the FSR behaves badly,
> and take a look at what happens.
>
> >>> timeit.repeat("'a' * 1000 + 'z'")
> [0.6532032148133153, 0.6407248807756699, 0.6407264561239894]
> >>> timeit.repeat("'a' * 1000 + '9'")
> [0.6429508479509245, 0.6242782443215589, 0.6240490311410927]
>
> >>> timeit.repeat("'a' * 1000 + '€'")
> [1.095694927496563, 1.0696347279235603, 1.0687741939041082]
> >>> timeit.repeat("'a' * 1000 + 'ẞ'")
> [1.0796421281222877, 1.0348612767961853, 1.035325216876231]
> >>> timeit.repeat("'a' * 1000 + '\u2345'")
> [1.0855414137412112, 1.0694677410017164, 1.0688096392412945]
>
> >>> timeit.repeat("'œ' * 1000 + '\U00010001'")
> [1.237314015362017, 1.2226262553064657, 1.21994619397816]
> >>> timeit.repeat("'œ' * 1000 + '\U00010002'")
> [1.245773635836997, 1.2303978424029651, 1.2258257877430765]
>
> Where does it come from? Simple, the FSR breaks the
> simple rules used in all coding schemes (unicode or not).
> 1) a unique set of chars
> 2) the "same" algorithm for all chars.

Can you give me a source for this requirement?
Numbers are, after all, numbers. So should we use the same code,
algorithms, and machine instructions for floating-point and integers?

>
> And again that's why utf-8 is working very smoothly.

How wonderful. Here's a suggestion.
Code up UTF-8 and any of the Python string representations in C and
profile them.
Please come back and tell us if UTF-8 outperforms any of the Python
representations for strings on any operation (except straight copy).
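
To make the indexing part of that comparison concrete, here is a rough
pure-Python sketch (not C, so the absolute timings mean little; only the
shape of the algorithm matters). `utf8_index` is a hypothetical helper
written for this illustration and is not part of CPython. It assumes valid
UTF-8 input and shows that finding the i-th code point in UTF-8 needs a
scan from the start, whereas a fixed-width representation can jump
straight to it.

```python
import timeit


def utf8_index(data: bytes, i: int) -> str:
    """Return the i-th code point of valid UTF-8 `data` by linear scan.

    Hypothetical helper for illustration only.
    """
    if i < 0:
        raise IndexError("negative index not supported in this sketch")
    count = -1      # index of the code point currently being scanned
    start = None    # byte offset where code point i begins
    for pos, byte in enumerate(data):
        # Continuation bytes look like 0b10xxxxxx; anything else
        # starts a new code point.
        if byte & 0xC0 != 0x80:
            count += 1
            if count == i:
                start = pos
            elif count == i + 1:
                return data[start:pos].decode("utf-8")
    if start is None:
        raise IndexError("index out of range")
    # Code point i was the last one in the buffer.
    return data[start:].decode("utf-8")


if __name__ == "__main__":
    text = "a" * 1000 + "€"
    encoded = text.encode("utf-8")
    # Same answer either way; only the cost differs.
    assert utf8_index(encoded, 1000) == text[1000]
    print("str index :", timeit.timeit(lambda: text[1000], number=200))
    print("utf8 scan :", timeit.timeit(lambda: utf8_index(encoded, 1000), number=200))
```

The scan cost grows with the index, while `text[1000]` on a str does not
depend on it. A fair performance verdict would still need the C version
suggested above.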

>
> The "corporates", which understood this very well and
> wanted to incorporate, let's say, the characters used
> in the French language, had no choice but to
> create new coding schemes (e.g. mac-roman, cp1252).
>
> In unicode, the "latin-1" range is a real plague.
>
> After years of experience, I'm still fascinated to see that
> the corporates have solved this issue easily while "free
> software" is still relying on latin-1.
> I have never managed to find an explanation.
>
> Even the TeX folks, when they shifted to the Cork
> encoding in 199?, were aware of this and consequently
> provide special package(s).
>
> No offense, this is in my mind why "corporate software"
> will always be "corporate software" and "hobbyist software"
> will always stay at the level of "hobbyist software".
>
> A French Windows user who understands nothing about the
> coding of characters, assuming he is even aware of its
> existence (!), certainly has no problem.
>
> Fascinating how it is possible to use Python to teach,
> to illustrate, to explain the coding of the characters. No?
>
> jmf

You troll with eclat and elan!