Recursion limit of pickle?

Sat Feb 9 23:09:12 EST 2008

On 2月10日, 上午11時42分, "Gabriel Genellina" <gagsl-... at yahoo.com.ar>
wrote:
> En Sat, 09 Feb 2008 09:49:46 -0200, Victor Lin <Borns... at gmail.com>
> escribi�:
>
> > I encounter a problem with pickle.
> > I download a html from:
>
> >http://www.amazon.com/Magellan-Maestro-4040-Widescreen-Navigator/dp/B...
>
> > and parse it with BeautifulSoup.
> > This page is very huge.
> > When I use pickle to dump it, a RuntimeError: maximum recursion depth
> > exceeded occur.
>
> BeautifulSoup objects usually aren't pickleable, independently of your
> recursion error.
But I pickle and unpickle other soup objects successfully.
Only this object seems too deep to pickle.
>
> py> import pickle
> py> import BeautifulSoup
> py> soup = BeautifulSoup.BeautifulSoup("<html><body>Hello, world!</html>")
> py> print pickle.dumps(soup)
> Traceback (most recent call last):
> ...
> TypeError: 'NoneType' object is not callable
> py>
>
> Why do you want to pickle it? Store the downloaded page instead, and
> rebuild the BeautifulSoup object later when needed.
>
> --
> Gabriel Genellina

Because parsing html cost a lots of cpu time. So I want to cache soup
object as file. If I have to get same page, I can get it from cache
file, even the parsed soup file. My program's bottleneck is on parsing
html, so if I can parse once and unpickle them later, it could save a
lots of time.