What am I doing wrong with urllib.urlopen() ?

Bengt Richter bokr at oz.net
Tue Mar 19 21:52:17 EST 2002


On 19 Mar 2002 16:17:15 -0800, jriveramerla at yahoo.com (Jose Rivera) wrote:

>Hi...
>
>Thanks in advance for any help...
>
>I want to retrieve data from the web for historic research 
>about real state prices, fetching the info from a newspaper page, 
>for personal use, loading this to MySQL DataBase for later study on
>trends, but I don't want to be doing this procedure manually every
>day.
>
>But this routine just gives me an error, please try it 
>and see whats wrong?, I have not found what am I missing.
>
>When I try the address genereated by the code directly on the
>iExplorer.EXE, it works... but not on Python...
>
>This is the routine:
>--------------------
>import os
>import urllib
>
>params={}
>params["pagina"] = 1
>params["Presentacion"] = "Tabla"
>params["Tipo"] = "CASAS"
>params["Order"] = "Order By Fecha Desc"
                   ^ use "order by colonia" here if you want to reproduce the address below [1]
>params["id_inmueble"] = "3"
>params["zona"] = "0"
>params["COLONIA"] = "0"
>params["RECAMARAS"] = "0"
>params["BANOS"] = "0"
>params["dia"] = 	''
>params["PLANTAS"] = "0"
>params["Constr_i"] = "-1"
>params["constr_f"] = "-1"
>params["Terreno_i"] = "-1"
>params["Terreno_f"] = "-1"
>params["Precio_i"] = "-1"
>params["Precio_f"] = "-1"
>params["fotos"] = "0"
>
># This is the address calculated
>#http://avisos.elnorte.com/casa_venta_result.asp?fotos=0&Order=order+by+colonia&Precio_i=-1&Presentacion=Tabla&Precio_f=-1&PLANTAS=0&pagina=1&id_inmueble=3&dia=&RECAMARAS=0&constr_f=-1&COLONIA=0&BANOS=0&Tipo=CASAS&Constr_i=-1&Terreno_i=-1&Terreno_f=-1&zona=0
  ^...[1]
>
That is not the address this code calculates. Apart from the upper-case
HTTP://AVISOS.ELNORTE.COM, the difference is in the Order parameter:
 Order=order+by+colonia      (in the comment above)
 Order=Order+By+Fecha+Desc   (what the code generates)
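
One way to find this kind of mismatch without reading the two addresses by eye is to parse both query strings and compare them key by key. The following is only a sketch: it uses the Python 3 module urllib.parse (the quoted script uses the older urllib.urlencode), and the helper names query_dict and query_diff are made up for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qsl

# Address copied from the comment in the quoted script.
commented = (
    "http://avisos.elnorte.com/casa_venta_result.asp?fotos=0"
    "&Order=order+by+colonia&Precio_i=-1&Presentacion=Tabla&Precio_f=-1"
    "&PLANTAS=0&pagina=1&id_inmueble=3&dia=&RECAMARAS=0&constr_f=-1"
    "&COLONIA=0&BANOS=0&Tipo=CASAS&Constr_i=-1&Terreno_i=-1"
    "&Terreno_f=-1&zona=0"
)

# Same parameters as the quoted script.
params = {
    "pagina": 1, "Presentacion": "Tabla", "Tipo": "CASAS",
    "Order": "Order By Fecha Desc", "id_inmueble": "3", "zona": "0",
    "COLONIA": "0", "RECAMARAS": "0", "BANOS": "0", "dia": "",
    "PLANTAS": "0", "Constr_i": "-1", "constr_f": "-1",
    "Terreno_i": "-1", "Terreno_f": "-1", "Precio_i": "-1",
    "Precio_f": "-1", "fotos": "0",
}
generated = "HTTP://AVISOS.ELNORTE.COM/casa_venta_result.asp?" + urlencode(params)


def query_dict(url):
    """Hypothetical helper: decode the query string of url into a dict."""
    return dict(parse_qsl(urlsplit(url).query, keep_blank_values=True))


def query_diff(url_a, url_b):
    """Hypothetical helper: map each differing key to (value_a, value_b)."""
    qa, qb = query_dict(url_a), query_dict(url_b)
    return {
        key: (qa.get(key), qb.get(key))
        for key in sorted(set(qa) | set(qb))
        if qa.get(key) != qb.get(key)
    }


print(query_diff(commented, generated))
```

keep_blank_values=True matters here because the empty dia= parameter would otherwise be dropped from the comparison.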


>pms=urllib.urlencode(params)
>direccion="HTTP://AVISOS.ELNORTE.COM/casa_venta_result.asp?%s"%pms
>#print direccion
>f=urllib.urlopen(direccion)
>
>ff=open('xc.htm','w')
>ff.write('<strong>'+direccion+'<\br>')
                               ^^^^^^^  '</strong>' ??

>ff.write(f.read())
>ff.close
>
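Note that ff.close without parentheses only names the method; it does
not call it, so the file is never explicitly closed. You need
 ff.close()
(a
 del ff
would also release it in CPython) to let go of xc.htm so start can use it,
though I think opening it with start is a bad idea.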

In general this is dangerous: the browser sees the page as coming from
your own disk, so it may treat it as safe.
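
On the ff.close point, a small sketch of the difference between naming the method and calling it. The temporary directory and the sample text written to the file are assumptions for illustration only; the with form at the end is the usual way to make sure the file is closed.

```python
import os
import tempfile

# Write into a scratch directory instead of the current one (assumption).
path = os.path.join(tempfile.mkdtemp(), "xc.htm")

ff = open(path, "w")
ff.write("<strong>example</strong><br>")

ff.close                       # only looks up the method; nothing is called
still_open = not ff.closed     # the file is still open at this point

ff.close()                     # the parentheses make the call
closed_after_call = ff.closed

# A with block closes the file when the block exits, even on an error.
with open(path, "a") as ff2:
    ff2.write("more text")
closed_by_with = ff2.closed

print(still_open, closed_after_call, closed_by_with)
```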

>os.system('start xc.htm')
Better to look at it with an editor that does not execute HTML, e.g.,
 os.system('start write xc.htm')

I haven't a clue what parameter formats that site expects, but it returns
------<snip>---------
 <font face="Arial" size=2>
<p>Microsoft VBScript runtime </font> <font face="Arial" size=2>error '800a000d'</font>
<p>
<font face="Arial" size=2>Type mismatch: '[string: ""]'</font>
<p>
<font face="Arial" size=2>/barra.asp</font><font face="Arial" size=2>, line 87</font> 
------<snip>---------

And when I paste your original address line into IE5, it also returns
------<snip>---------
Microsoft VBScript runtime error '800a000d' 

Type mismatch: '[string: ""]' 

/barra.asp, line 87 
------<snip>---------

So maybe the site has a problem, or some parameter is in the wrong format. I can't help you there
unless you have a definition of the format. An exact copy of a URL that works from the browser
would help. The one in the comment is not it.
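
Since the error page quoted above comes back as ordinary HTML, a script that runs every day might want to check the body before saving it. This is a rough sketch, not a definitive test: the function name asp_error_code is hypothetical, and it assumes the error page always contains the text "Microsoft VBScript runtime" followed by a quoted eight-digit hex code, as in the two samples above.

```python
import re


def asp_error_code(body):
    """Hypothetical helper: return the VBScript error code found in body,
    or None if body does not look like the error page quoted above."""
    if "Microsoft VBScript runtime" not in body:
        return None
    match = re.search(r"error '([0-9a-fA-F]{8})'", body)
    return match.group(1) if match else None


# The HTML form of the error, as returned to the script.
html_sample = (
    '<p>Microsoft VBScript runtime </font> '
    '<font face="Arial" size=2>error \'800a000d\'</font>'
)
# The text form, as shown by the browser.
text_sample = "Microsoft VBScript runtime error '800a000d'"

print(asp_error_code(html_sample))
print(asp_error_code(text_sample))
print(asp_error_code("<html><body>listing</body></html>"))
```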

Regards,
Bengt Richter



