It's ok to __slots__ for what they were intended

John Nagle nagle at animats.com
Sun Dec 23 00:27:12 EST 2007


Fredrik Lundh wrote:
> John Nagle wrote:
> 
>>      I'd like to hear more about what kind of performance gain can be
>> obtained from "__slots__".  I'm looking into ways of speeding up
>> HTML parsing via BeautifulSoup.  If a significant speedup can be
>> obtained when navigating large trees of small objects, that's worth
>> quite a bit to me.
> 
> The following micro-benchmarks are from Python 2.5 on a Core Duo 
> machine.  C0 is an old-style class, C1 is a new-style class, C2 is a 
> new-style class using __slots__:
> 
> # read access
> $ timeit -s "import q; o = q.C0(); o.attrib = 1" "o.attrib"
> 10000000 loops, best of 3: 0.133 usec per loop
> $ timeit -s "import q; o = q.C1(); o.attrib = 1" "o.attrib"
> 10000000 loops, best of 3: 0.184 usec per loop
> $ timeit -s "import q; o = q.C2(); o.attrib = 1" "o.attrib"
> 10000000 loops, best of 3: 0.161 usec per loop
> 
> # write access
> $ timeit -s "import q; o = q.C0(); o.attrib = 1" "o.attrib = 1"
> 10000000 loops, best of 3: 0.15 usec per loop
> $ timeit -s "import q; o = q.C1(); o.attrib = 1" "o.attrib = 1"
> 1000000 loops, best of 3: 0.217 usec per loop
> $ timeit -s "import q; o = q.C2(); o.attrib = 1" "o.attrib = 1"
> 1000000 loops, best of 3: 0.209 usec per loop

    Not much of a win there.  Thanks.
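
For anyone who wants to repeat the comparison: the module "q" was not posted, so the sketch below is only a guess at what it contained. The class names C1 and C2 follow the quoted message, while the helper best_time and its parameters are made up for illustration. Old-style classes (C0) no longer exist in Python 3, so that case is left out.

```python
import timeit


class C1(object):
    """Ordinary class: attributes live in a per-instance __dict__."""
    pass


class C2(object):
    """Same class with one fixed attribute slot and no __dict__."""
    __slots__ = ["attrib"]


def best_time(cls, stmt, number=200000, repeat=3):
    """Best per-loop time, in microseconds, for stmt run on a cls instance.

    Hypothetical helper, not part of the original benchmark.
    """
    o = cls()
    o.attrib = 1
    # The instance is exposed to the timed statement under the name "o".
    timer = timeit.Timer(stmt, globals={"o": o})
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1e6


if __name__ == "__main__":
    for cls in (C1, C2):
        print(cls.__name__, "read ", best_time(cls, "o.attrib"), "usec")
        print(cls.__name__, "write", best_time(cls, "o.attrib = 1"), "usec")
```

Absolute numbers will differ from the Python 2.5 figures quoted above, so only the relative gap between C1 and C2 on the same interpreter is meaningful.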
> 
>  > I'm looking into ways of speeding up HTML parsing via BeautifulSoup.
> 
> The solution to that is spelled "lxml".

    I may eventually have to go to a non-Python solution.
But I've finally made enough robustness fixes to BeautifulSoup that
it's usable on large numbers of real-world web sites.  (Only two
exceptions in the last 100,000 web sites processed.  If you want
to exercise your HTML parser on hard cases, run hostile-code web
sites through it.)
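
A rough sketch of that kind of robustness run, for illustration only: it is not the harness actually used. The names count_parse_failures and stdlib_parse are hypothetical, and the standard library's HTMLParser stands in for BeautifulSoup so the example has no outside dependencies.

```python
from html.parser import HTMLParser


def count_parse_failures(documents, parse):
    """Run parse() over (name, markup) pairs; return (processed, failures)."""
    processed = 0
    failures = []
    for name, markup in documents:
        processed += 1
        try:
            parse(markup)
        except Exception as exc:  # deliberately broad: any crash counts
            failures.append((name, repr(exc)))
    return processed, failures


def stdlib_parse(markup):
    """Stand-in parser: feed the markup through the stdlib HTMLParser."""
    parser = HTMLParser()
    parser.feed(markup)
    parser.close()


if __name__ == "__main__":
    # Tiny made-up corpus; a real run would read pages fetched from the web.
    samples = [
        ("well-formed", "<html><body><p>hello</p></body></html>"),
        ("unclosed tags", "<html><body><p>hello<b>world"),
        ("garbage", "<<<>>> <a href='x' <b> &#xZZ; </nothing>"),
    ]
    processed, failures = count_parse_failures(samples, stdlib_parse)
    print(processed, "documents,", len(failures), "exceptions")
    for name, error in failures:
        print("  ", name, error)
```

Keeping the failing document's name next to the exception makes it easy to replay just the hard cases after each fix.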

					John Nagle



More information about the Python-list mailing list