What am I doing wrong with urllib.urlopen()?

Hernan M. Foffani hfoffani at yahoo.com
Sun Mar 24 08:18:26 EST 2002


"Jose Rivera" wrote:
> The URL calculated :
>
> http://avisos.elnorte.com/casa_venta_result.asp?fotos=0&Order=order+by+colonia&Precio_i=-1&Presentacion=Tabla&Precio_f=-1&PLANTAS=0&pagina=1&id_inmueble=3&dia=&RECAMARAS=0&constr_f=-1&COLONIA=0&BANOS=0&Tipo=CASAS&Constr_i=-1&Terreno_i=-1&Terreno_f=-1&zona=0
>
> works on my PC, try it, and let's see what's wrong.
That page relies heavily on DHTML. urlopen works as expected: you get
the same result if you open avisos.elnorte.com in IE with security set
to maximum, which turns scripting off. To fix this problem, first study
the HTML source of the page closely.
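
As a starting point for studying the source, here is a minimal sketch that
counts how many script tags a page has and how much inline script code they
carry. It assumes a current Python 3, where urllib.request.urlopen replaces
urllib.urlopen; the names ScriptCounter, summarize_scripts and fetch are
made up for this example, and the latin-1 decoding is only a guess at the
page's encoding.

```python
from html.parser import HTMLParser
from urllib.request import urlopen  # Python 3 successor to urllib.urlopen


class ScriptCounter(HTMLParser):
    """Count <script> tags and the amount of inline code inside them."""

    def __init__(self):
        super().__init__()
        self.script_count = 0   # number of <script> start tags seen
        self.script_chars = 0   # characters of inline script code
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.script_count += 1
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.script_chars += len(data)


def summarize_scripts(html):
    """Return (number of script tags, characters of inline script code)."""
    counter = ScriptCounter()
    counter.feed(html)
    counter.close()
    return counter.script_count, counter.script_chars


def fetch(url):
    """Download a page as text (hypothetical helper; encoding is assumed)."""
    with urlopen(url) as response:
        return response.read().decode("latin-1", errors="replace")
```

Calling summarize_scripts(fetch(url)) on the URL above would show how much of
the page is built by scripts. A large script share next to few table rows in
the raw HTML would suggest that the listing is generated in the browser, so
urlopen alone will not see it.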

> Does anyone have an idea of how I can save it to a DataBase?
Clean up the page with HTML Tidy first, then strip all the script tags
and the code inside them, parse the resulting HTML to pick out the rows
you're interested in, and save them into a DB.
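
The steps above could be sketched roughly like this in a current Python 3.
This is only an illustration under some assumptions: the HTML Tidy pass is
left out (the input is taken to be already cleaned up), SQLite stands in for
whatever DB you actually use, and the names RowExtractor, extract_rows,
save_rows and the cells table are invented for the example. Since I don't
know the real column layout, each cell is stored with its row and column
number instead of under named columns.

```python
import sqlite3
from html.parser import HTMLParser


class RowExtractor(HTMLParser):
    """Collect the text of each <td>/<th> cell per <tr>, ignoring scripts."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None        # cells of the current row, None outside <tr>
        self._cell = None       # text pieces of the current cell
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
        elif tag == "tr":
            self._row, self._cell = [], None
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False
        elif tag in ("td", "th") and self._cell is not None:
            # Join the pieces and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row, self._cell = None, None

    def handle_data(self, data):
        # Script code is dropped; only visible cell text is kept.
        if self._cell is not None and not self._in_script:
            self._cell.append(data)


def extract_rows(html):
    """Return the table rows as a list of lists of cell strings."""
    parser = RowExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows


def save_rows(conn, rows):
    """Store every cell as (row_no, col_no, content) in a 'cells' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cells "
        "(row_no INTEGER, col_no INTEGER, content TEXT)"
    )
    conn.executemany(
        "INSERT INTO cells VALUES (?, ?, ?)",
        [(r, c, text) for r, row in enumerate(rows) for c, text in enumerate(row)],
    )
    conn.commit()
```

With a connection from sqlite3.connect("listings.db") (the file name is just
an example), save_rows(conn, extract_rows(html)) would store the rows. Once
you know which columns the page really has, a table with named columns would
be a better fit than this generic layout.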

> Is there a better way?
A better way to do what? I missed the original post.

Regards,
-Hernan




