[Patches] [ python-Patches-479615 ] Fast-path for interned string compares

noreply@sourceforge.net noreply@sourceforge.net
Tue, 12 Nov 2002 05:25:26 -0800


Patches item #479615, was opened at 2001-11-08 15:19
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=479615&group_id=5470

Category: Core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: M.-A. Lemburg (lemburg)
Assigned to: M.-A. Lemburg (lemburg)
Summary: Fast-path for interned string compares

Initial Comment:
This patch adds a fast-path for comparing equality of interned strings. 

The patch boosts performance for comparing identical string objects 
by some 20% on my machine while not causing any noticable slow-down 
for other operations (according to tests done with pybench).

More infos and benchmarks later...


----------------------------------------------------------------------

Comment By: Armin Rigo (arigo)
Date: 2002-11-12 13:25

Message:
Logged In: YES 
user_id=4771

It seems to me that the whole status of interned strings is
not clear from the user's perspective.  Maybe we should
avoid putting more emphasis on it.  Deprecating intern() in
favor of sys.intern() would even look like a good thing to
do.

Besides, in the use case you describe, you can compare
tokens with "is" instead of "==" as you know for sure that
you are comparing two explicitely interned strings.  That's
a hack, but calling intern() in the first place already
looks like a hack.

I'd vote against it, but if the patch is accepted don't
forget to change the constants EQ, LE,... into PyCmp_EQ,
PyCmp_LE,...

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2002-08-08 21:16

Message:
Logged In: YES 
user_id=38388

I still consider the patch worth adding. The application space
where it helps may be small, but also important: it can
massively
speed up parsers which use interned strings as tokens.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-08-08 21:07

Message:
Logged In: YES 
user_id=21627

Is there any progress on this patch, or should it be
considered withdrawn?

----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2002-03-23 23:35

Message:
Logged In: YES 
user_id=35752

Attached is an updated version of this patch.  I'm -0 on it
since it doesn't seem to help much except for artificial
benchmarks.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-11-08 15:26

Message:
Logged In: YES 
user_id=38388

Output from pybench comparing today's CVS Python with patch (eqpython) and without patch (stdpython):

PYBENCH 1.0

Benchmark: eqpython.bench (rounds=10, warp=20)

Tests:                              per run    per oper.    diff *)
------------------------------------------------------------------------
          BuiltinFunctionCalls:     125.55 ms    0.98 us    -1.68%
           BuiltinMethodLookup:     180.10 ms    0.34 us    +1.75%
                 CompareFloats:     107.30 ms    0.24 us    +2.04%
         CompareFloatsIntegers:     185.15 ms    0.41 us    -0.05%
               CompareIntegers:     163.50 ms    0.18 us    -1.77%

        CompareInternedStrings:      79.50 ms    0.16 us   -20.78%
^^^^^^^^^^^^^^^^^^^^ This is the interesting line :-) ^^^^^^^^^^^^^^^^^^^^^^^^^^

                  CompareLongs:     110.25 ms    0.24 us    +0.09%
                CompareStrings:     143.40 ms    0.29 us    +2.14%
                CompareUnicode:     118.00 ms    0.31 us    +1.68%
                 ConcatStrings:     189.55 ms    1.26 us    -1.61%
                 ConcatUnicode:     226.55 ms    1.51 us    +1.34%
               CreateInstances:     202.35 ms    4.82 us    -1.87%
       CreateStringsWithConcat:     221.00 ms    1.11 us    +0.45%
       CreateUnicodeWithConcat:     240.00 ms    1.20 us    +1.27%
                  DictCreation:     213.25 ms    1.42 us    +0.47%
             DictWithFloatKeys:     263.50 ms    0.44 us    +1.15%
           DictWithIntegerKeys:     158.50 ms    0.26 us    -1.86%
            DictWithStringKeys:     147.60 ms    0.25 us    +0.75%
                      ForLoops:     144.90 ms   14.49 us    -4.64%
                    IfThenElse:     174.15 ms    0.26 us    -0.00%
                   ListSlicing:      88.80 ms   25.37 us    -1.11%
                NestedForLoops:     136.95 ms    0.39 us    +3.01%
          NormalClassAttribute:     177.80 ms    0.30 us    -2.68%
       NormalInstanceAttribute:     166.85 ms    0.28 us    -0.54%
           PythonFunctionCalls:     152.20 ms    0.92 us    +1.40%
             PythonMethodCalls:     133.70 ms    1.78 us    +1.60%
                     Recursion:     119.45 ms    9.56 us    +0.04%
                  SecondImport:     124.65 ms    4.99 us    -6.03%
           SecondPackageImport:     130.70 ms    5.23 us    -5.73%
         SecondSubmoduleImport:     161.65 ms    6.47 us    -5.88%
       SimpleComplexArithmetic:     245.50 ms    1.12 us    +2.08%
        SimpleDictManipulation:     108.50 ms    0.36 us    +0.05%
         SimpleFloatArithmetic:     125.80 ms    0.23 us    +0.84%
      SimpleIntFloatArithmetic:     128.50 ms    0.19 us    -1.46%
       SimpleIntegerArithmetic:     128.45 ms    0.19 us    -0.77%
        SimpleListManipulation:     159.15 ms    0.59 us    -5.32%
          SimpleLongArithmetic:     189.55 ms    1.15 us    +2.65%
                    SmallLists:     293.70 ms    1.15 us    -5.26%
                   SmallTuples:     230.00 ms    0.96 us    +0.44%
         SpecialClassAttribute:     175.70 ms    0.29 us    -2.79%
      SpecialInstanceAttribute:     199.70 ms    0.33 us    -1.55%
                StringMappings:     196.85 ms    1.56 us    -2.48%
              StringPredicates:     133.00 ms    0.48 us    -8.28%
                 StringSlicing:     165.45 ms    0.95 us    -3.47%
                     TryExcept:     193.60 ms    0.13 us    +0.57%
                TryRaiseExcept:     175.40 ms   11.69 us    +0.69%
                  TupleSlicing:     156.85 ms    1.49 us    -0.00%
               UnicodeMappings:     175.90 ms    9.77 us    +1.76%
             UnicodePredicates:     141.35 ms    0.63 us    +0.78%
             UnicodeProperties:     184.35 ms    0.92 us    -2.10%
                UnicodeSlicing:     179.45 ms    1.03 us    -1.10%
------------------------------------------------------------------------
            Average round time:    9855.00 ms               -1.13%

*) measured against: stdpython.bench (rounds=10, warp=20)

As you can see, the rest of the results don't change much and the
ones that do indicate some additional benefit gained by the patch.
All slow-downs are way below the noise limit of around 5-10% (depending
the platforms/machine/compiler).



----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=479615&group_id=5470