help!! *extra* tricky web page to extract data from...

Diez B. Roggisch deets at nospam.web.de
Tue Mar 13 18:26:29 EDT 2007


Paul Rubin wrote:
> "Diez B. Roggisch" <deets at nospam.web.de> writes:
>> Still, some pages are AJAX, you won't be able to scrape them easily
>> without analyzing the JS code.
> 
> Sooner or later it would be great to have a JS interpreter written in
> Python for this purpose.  It would do all the same operations on an
> HTML/XML DOM that a browser does, basically all the stuff of a browser
> except rendering into pixels.  JS semantics are similar enough to
> Python that maybe the JS could be compiled into Python byte code.

Nice idea, but I doubt it would help much in practice. Apart from the 
nastier parts of the browser DOMs, which are what make JS programming the 
PITA it is, I think the whole event-based model (handlers, timers, 
asynchronous requests) makes this basically impossible.
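
To illustrate the event-based part, here is a toy sketch in plain Python. 
Everything in it (ToyPage, the event names, the fake "xhr_done" step) is 
made up for illustration and is not any real browser or library API. It 
only shows that the data a scraper wants may not exist until a chain of 
events has been dispatched, so an interpreter would also need a faithful 
event loop, not just the language semantics.

```python
# Toy model (all names hypothetical): a page whose data only appears
# after a chain of events has fired.

class ToyPage:
    def __init__(self):
        self.dom = {"table": None}   # what a scraper would try to read
        self.handlers = {}           # event name -> list of callbacks
        self.queue = []              # events waiting to be dispatched

    def on(self, event, callback):
        # Register a callback for an event name.
        self.handlers.setdefault(event, []).append(callback)

    def fire(self, event):
        # Queue an event; nothing runs until run() is called.
        self.queue.append(event)

    def run(self):
        # Minimal event loop: dispatch until nothing is pending.
        while self.queue:
            event = self.queue.pop(0)
            for callback in self.handlers.get(event, []):
                callback(self)


def build_page():
    page = ToyPage()
    # "load" kicks off a fake request; its completion fills the table.
    page.on("load", lambda p: p.fire("xhr_done"))
    page.on("xhr_done", lambda p: p.dom.update(table=[1, 2, 3]))
    return page


page = build_page()
static_view = page.dom["table"]    # reading the page without running events
page.fire("load")
page.run()
dynamic_view = page.dom["table"]   # reading it after the events have run
```

Here static_view is still None while dynamic_view holds the data. A real 
page adds timers, user-triggered events and network latency on top of 
this, which is where I expect it to get really ugly.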

Diez


