urllib problem with "version" now using Python 2.0

Greg Jorgensen gregj at pobox.com
Wed Oct 25 03:03:39 EDT 2000


"Gordon Williams" <g_will at cyberus.ca> wrote in message
news:8t45ej$2o4l$1 at news2.ottawa.cyberus.ca...
> I have been using a script to extract some information from web pages
> using Python 1.5.2 for some time without problem.  Since I have changed
> to Python 2.0 (final) I have an .html error page sent back to me rather
> than the page that I wanted.  The error message is:
>
> Template: D:\Servers\The Fund Library 2K\tfl\FundCompany\CompanyDetail\p_CompanyDetail.cfm
> Browser: Python-urllib/1.13
> Query: Tab=price&SR=1&Sort=Fund_Name_TFL&Order=ASC&Company=40
> Details: Parameter 2 of function Left which is now "0" must be a positive
> integer
>
> The error occurred while evaluating the expression:
>
>  Version = Left(Version,Find(" ",Version))
>
> The error occurred while processing an element with a general identifier
> of (CFSET), occupying document position (29:1) to (29:49).
>
> I'm not sure what "Version" is referring to in the error message (maybe
> some header information???). Some changes have been made to the urllib
> module since Python 1.5.2 and I am wondering if a bug was introduced.
>
> Any ideas??


The error message is coming from the Cold Fusion code in the page you are
retrieving. There is a bug in the Cold Fusion code: it expects to find a
space in the variable named Version, but there isn't one, so it crashes.
I'll bet $1.00 that the CF code is trying to figure out the browser type
from the HTTP User-Agent header, and it's failing because urllib is sending
a value that doesn't have a space in it (Python-urllib/1.13, the value shown
on the "Browser:" line of the error page). I don't have 1.5.2 installed
anymore, but it probably sent a different value for User-Agent.
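
To make the failure concrete, here is a rough Python emulation of the expression from the error page. The helpers cf_find, cf_left and parse_version are hypothetical names invented for this sketch; they only mimic the documented behavior of the CFML Find function (1-based position, 0 when nothing is found) and Left function (count must be positive). The browser string in the usage note is an illustrative value, not one taken from the original page.

```python
def cf_find(substring, string):
    """Mimic CFML Find: 1-based position of the first match, 0 if absent."""
    return string.find(substring) + 1


def cf_left(string, count):
    """Mimic CFML Left: the leftmost `count` characters; count must be > 0."""
    if count <= 0:
        raise ValueError(
            'Parameter 2 of function Left which is now "%d" '
            "must be a positive integer" % count
        )
    return string[:count]


def parse_version(version):
    # Same shape as the failing line: Version = Left(Version, Find(" ", Version))
    return cf_left(version, cf_find(" ", version))
```

With a browser-style value such as "Mozilla/4.0 (compatible)" the call returns the text up to and including the first space. With "Python-urllib/1.13" there is no space, cf_find returns 0, and cf_left raises, which matches the message on the error page.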

Try this:

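import urllib

# URLopener keeps its default request headers in the addheaders list;
# the first entry is the User-agent header.
u = urllib.URLopener()
u.addheaders[0] = ("User-agent", "Python-urllib 1.13")
f = u.open(your_full_url)   # your_full_url: the URL you are fetching
page = f.read()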

If that works, you should tell the author of the offending .cfm page to make
the browser detection code more robust:

<cfset p = Find(" ", Version)>
<cfif p eq 0>
    <cfset p = Find("/", Version)>
</cfif>
<cfif p gt 0>
    <cfset Version = Left(Version, p)>
</cfif>

Until that happens, you can either use the code above instead of
urllib.urlopen(), or edit urllib.py in your Python Lib directory. In the
URLopener class, change:

    version = "Python-urllib/%s" % __version__

to

    version = "Python-urllib %s" % __version__
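
For anyone reading this on Python 3, where urllib has been reorganized into urllib.request, the same per-opener override can be sketched as below without editing the library. The function name make_opener is made up for this example, and the open() call is left commented out because the URL is whatever page you are fetching.

```python
import urllib.request


def make_opener(user_agent="Python-urllib 1.13"):
    """Return an opener whose default User-agent contains a space."""
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) pairs sent with every request;
    # replacing it overrides the default "Python-urllib/x.y" value.
    opener.addheaders = [("User-agent", user_agent)]
    return opener


opener = make_opener()
# page = opener.open(your_full_url).read()   # your_full_url: placeholder
```

This only changes the header for requests made through this opener object, so the rest of the program keeps the stock User-agent.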

--
Greg Jorgensen
Deschooling Society
Portland, Oregon, USA
gregj at pobox.com




