webchecker on Windows

Des Barry desb at desb.demon.co.uk
Mon May 3 10:54:09 EDT 1999


In article <5logkbrndp.fsf at eric.cnri.reston.va.us>, Guido van Rossum
<guido at eric.cnri.reston.va.us> writes
>Des Barry <desb at desb.demon.co.uk> writes:
>
>> I have just recently downloaded 1.5.2(final) and tried to use webchecker
>> on a local file tree.
>> 
>> I am unable to get it to work - as it did in 1.5.2b2 (with a patch
>> applied to urllib)
>> 
>
>Unfortunately, you're right.  I think that the change has to do with
>the changes in urllib.py regarding when to use url2pathname() and
>pathname2url() -- the new policy is much more useful, but webchecker
>was counting on the old policy.  (The policy change is that the url
>argument to open(), open_file(), open_local_file() and the like must
>always be in url format.)
>
>Below is a patch that I think makes it work, but it still requires
>that you use forward slashes in the file: URL you give it.  It
>supports drive letters but only if you use the form "file:/D|/path";
>the form "file:///D|/path" doesn't seem to work due to the way
>urlparse works.
>
>I hope someone else can continue the analysis from here...
>
>--Guido van Rossum (home page: http://www.python.org/~guido/)

On further investigation I find that urlparse.py and nturl2path.py have
also been changed.

As a simple test I ran file: URLs through urlparse.urlparse and
urlparse.urlunparse and found that the pair is not symmetric (what goes
in does not come back out unchanged). The same is true of
nturl2path.url2pathname and nturl2path.pathname2url.
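
For anyone who wants to repeat this kind of test, here is a minimal
sketch of a round-trip check. It assumes a current Python 3, where the
functions live in urllib.parse and urllib.request rather than in the
1.5.2 modules named above, so the results will not necessarily match
what is reported here. is_symmetric is a helper name made up for this
example.

```python
from urllib.parse import urlparse, urlunparse
from urllib.request import pathname2url, url2pathname


def is_symmetric(forward, backward, value):
    """Return True if backward(forward(value)) reproduces value exactly."""
    return backward(forward(value)) == value


# Round-trip whole file: URLs through urlparse/urlunparse.
for url in ("file:///C|/dir1/dir2/test.htm", "file:/C|/dir1/dir2/test.htm"):
    print(url, "->", urlunparse(urlparse(url)),
          "symmetric:", is_symmetric(urlparse, urlunparse, url))

# Round-trip the path part of a URL through url2pathname/pathname2url.
# These two functions are platform dependent, so the answer may differ
# between Windows and Unix.
url_path = "/C|/dir1/dir2/test.htm"
print(url_path, "->", pathname2url(url2pathname(url_path)),
      "symmetric:", is_symmetric(url2pathname, pathname2url, url_path))
```

The helper takes the two functions as arguments so that the same check
can be pointed at either pair.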

Before going any further: was breaking this symmetry intentional? If
so, what is the reasoning behind these changes?

According to RFC 1808 (in my interpretation), all references to local
files should be of the form:
  file:///user/etc/xxx.htm - for Unix and
  file:///C|/dir1/dir2/test.htm - for Windows

That is, I believe that
  file:/C|/dir1/dir2/test.htm
is illegal.
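
For reference, RFC 1738 writes the file scheme as file://<host>/<path>,
with the host allowed to be empty. The sketch below shows how a parser
splits the spellings discussed here into host and path. It assumes
Python 3's urllib.parse, not the 1.5.2 urlparse module, so it only
illustrates the idea; split_file_url is a hypothetical helper name.

```python
from urllib.parse import urlparse


def split_file_url(url):
    """Return (netloc, path) as urlparse sees them for the given URL."""
    parts = urlparse(url)
    return parts.netloc, parts.path


# Compare the three-slash spellings with the single-slash one.
for url in (
    "file:///user/etc/xxx.htm",
    "file:///C|/dir1/dir2/test.htm",
    "file:/C|/dir1/dir2/test.htm",
):
    print(url, "->", split_file_url(url))
```

If two spellings come back with the same host and path, the parsed
result alone cannot tell them apart, which is one way a parse/unparse
pair can fail to be symmetric.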

-- 
Des Barry



