It's ok to use __slots__ for what they were intended
John Nagle
nagle at animats.com
Sun Dec 23 00:27:12 EST 2007
Fredrik Lundh wrote:
> John Nagle wrote:
>
>> I'd like to hear more about what kind of performance gain can be
>> obtained from "__slots__". I'm looking into ways of speeding up
>> HTML parsing via BeautifulSoup. If a significant speedup can be
>> obtained when navigating large trees of small objects, that's worth
>> quite a bit to me.
>
> The following micro-benchmarks are from Python 2.5 on a Core Duo
> machine. C0 is an old-style class, C1 is a new-style class, C2 is a
> new-style class using __slots__:
>
> # read access
> $ timeit -s "import q; o = q.C0(); o.attrib = 1" "o.attrib"
> 10000000 loops, best of 3: 0.133 usec per loop
> $ timeit -s "import q; o = q.C1(); o.attrib = 1" "o.attrib"
> 10000000 loops, best of 3: 0.184 usec per loop
> $ timeit -s "import q; o = q.C2(); o.attrib = 1" "o.attrib"
> 10000000 loops, best of 3: 0.161 usec per loop
>
> # write access
> $ timeit -s "import q; o = q.C0(); o.attrib = 1" "o.attrib = 1"
> 10000000 loops, best of 3: 0.15 usec per loop
> $ timeit -s "import q; o = q.C1(); o.attrib = 1" "o.attrib = 1"
> 1000000 loops, best of 3: 0.217 usec per loop
> $ timeit -s "import q; o = q.C2(); o.attrib = 1" "o.attrib = 1"
> 1000000 loops, best of 3: 0.209 usec per loop
Not much of a win there. Thanks.
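
For anyone who wants to rerun the comparison: the module q is not shown in
the quoted message, so the following is only a minimal sketch of an
equivalent test, written for Python 3. The names Plain, Slotted and bench
are made up for illustration. Python 3 has no old-style classes, so there
is no counterpart to C0 here; Plain corresponds roughly to C1 and Slotted
to C2. Absolute numbers will differ by interpreter version and machine.

```python
import timeit


class Plain:
    """Ordinary class: attributes are stored in a per-instance __dict__."""
    pass


class Slotted:
    """Class using __slots__: attributes are stored in fixed slots, no __dict__."""
    __slots__ = ("attrib",)


def bench(cls, stmt, number=1_000_000, repeat=3):
    """Return the best time per loop, in microseconds, for stmt run on an instance of cls."""
    o = cls()
    o.attrib = 1
    # Expose the instance to the timed statement under the name "o".
    timer = timeit.Timer(stmt, globals={"o": o})
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e6


if __name__ == "__main__":
    for cls in (Plain, Slotted):
        read = bench(cls, "o.attrib")
        write = bench(cls, "o.attrib = 1")
        print(f"{cls.__name__:8s} read {read:.3f} usec  write {write:.3f} usec")
```

The sketch only measures attribute access time, as the quoted figures do.
The other effect of __slots__, dropping the per-instance __dict__, is about
memory rather than speed and is not measured here.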
>
> > I'm looking into ways of speeding up HTML parsing via BeautifulSoup.
>
> The solution to that is spelled "lxml".
I may eventually have to go to a non-Python solution.
But I've finally made enough robustness fixes to BeautifulSoup that
it's usable on large numbers of real-world web sites. (Only two
exceptions in the last 100,000 web sites processed. If you want
to exercise your HTML parser on hard cases, run hostile-code web
sites through it.)
John Nagle