[XML-SIG] problems with PyXML 0.6.3

Thomas B. Passin tpassin@home.com
Wed, 14 Feb 2001 20:02:49 -0500


This file: business is trickier than it seems, because the RFC is ambiguous
about file: URLs.  The pipe character isn't in the RFC at all, even though
some of the browsers use it.

I strongly suggest that one use the file: scheme whenever a local file is
intended.  That way the application doesn't have to guess, and it won't try a
spurious URL if the file isn't found.  The way it's done in this example is
just asking for continued trouble, as I guess we're seeing now.

I think we should come to an agreement with the maintainer of urllib about
the allowed forms for file: URLs.  It's mainly on Windows (and, perhaps,
Macs) that there would be a problem.  My preferred forms are these, for a file
at d:\temp\python\thefile.xml:

1) file:///d:/temp/python/thefile.xml

2) file:///d:\temp\python\thefile.xml

Both of these comply fully with the RFC.  2) is an "opaque" form: the URL
processor does no further parsing and just passes it to the OS.  1) is what
the RFC gives you when you want the URL processor to be able to parse out the
path parts.  The processor is supposed to know to replace slashes with
backslashes where the OS requires it.
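
As a rough sketch of how form 1) could be produced from the Windows path:
this assumes a current Python 3 standard library and its pathlib module, and
the function name is mine, for illustration only.

```python
from pathlib import PureWindowsPath


def windows_path_to_file_url(path: str) -> str:
    """Return form 1) for an absolute Windows path that has a drive letter."""
    # PureWindowsPath parses Windows syntax on any OS; as_uri() raises
    # ValueError if the path is not absolute.
    return PureWindowsPath(path).as_uri()


print(windows_path_to_file_url(r"d:\temp\python\thefile.xml"))
# prints file:///d:/temp/python/thefile.xml
```

Doing the conversion in one place means the caller never has to guess whether
a string is a path or a URL.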

Either 1) or 2) would also work for files on a network file system, if you put
the host name in there:

file://host/temp/python/thefile.xml
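
To illustrate where the host name ends up, here is a small sketch using
urllib.parse.urlsplit from a current Python 3 standard library (an assumption
on my part; the helper name is hypothetical).  It only splits the string and
does not touch the file system.

```python
from urllib.parse import urlsplit


def host_and_path(url: str) -> tuple:
    """Split a file: URL into its host part and its path part."""
    parts = urlsplit(url)
    return parts.netloc, parts.path


# Empty host: a local file, with the drive letter inside the path.
print(host_and_path("file:///d:/temp/python/thefile.xml"))
# Named host: a file on a network file system.
print(host_and_path("file://host/temp/python/thefile.xml"))
```

With the three-slash form the host comes back empty, so a processor can tell
a local file from a network one without guessing.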

1) would be more portable, and is my preference.  The processor should be able
to handle both, however.  For backwards compatibility, form 3) should also be
accepted, I suppose:

3) file:d:\temp\python\thefile.xml

This could be negotiated, though.
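
A processor that accepts all three forms could normalize them to form 1)
before doing anything else.  The following is only a sketch under my own
assumptions: the function name is hypothetical, it handles just the
drive-letter forms listed above (not the host form), and it ignores percent
escapes entirely.

```python
import re

# A drive letter followed by a colon, as in "d:".
_DRIVE = re.compile(r"^[A-Za-z]:")


def normalize_file_url(url: str) -> str:
    """Rewrite forms 1), 2) and 3) of a drive-letter file: URL as form 1)."""
    if not url.lower().startswith("file:"):
        raise ValueError("not a file: URL: %r" % url)
    rest = url[len("file:"):]
    # Forms 1) and 2) carry an empty host, i.e. "///" before the drive letter.
    if rest.startswith("///"):
        rest = rest[3:]
    if not _DRIVE.match(rest):
        raise ValueError("no drive letter found in %r" % url)
    # Form 1) uses forward slashes throughout.
    return "file:///" + rest.replace("\\", "/")


for form in (r"file:///d:/temp/python/thefile.xml",
             r"file:///d:\temp\python\thefile.xml",
             r"file:d:\temp\python\thefile.xml"):
    print(normalize_file_url(form))
```

Anything that doesn't match one of the agreed forms is rejected rather than
guessed at.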

Let's agree on this and get it working right!

Cheers,

Tom P


Alexandre Fayolle wrote -

> On Wed, 14 Feb 2001, Uche Ogbuji wrote:
>
> > So can we think of a better algorithm than the current "check for file,
> > and if it doesn't exist, just blindly toss it to urllib"?
>
> If running windows, and the second character of the 'url' is a colon,
> replace it with a pipe and prepend file: to the url?
>
> > This problem affects 4Suite as well.
>
> I had to use a similar hack when generating a CATALOG file for Narval, for
> use with xmlproc, since urllib would choke on C:\fooo\dtd_base\, and whine
> until it got C|\fooo\dtd_base\
>
> Maybe what we need is a new function in os.path or similar that would
> perform the file -> URL conversion described above. This would ease the
> work of application writers. I, for one, would be much more at ease if I
> knew that no implicit assumptions are made on what I pass. If the API
> requires a URI/URL, then this is what it should get.
>
> Opinions?
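
For comparison, the heuristic quoted above (second character is a colon, so
replace it with a pipe and prepend file:) might look roughly like this.  The
function name and the on_windows flag are hypothetical; real code would
presumably test the platform itself, for instance with os.name == "nt".

```python
def guess_file_url(name: str, on_windows: bool) -> str:
    """Apply the quoted heuristic to something that may be a path or a URL."""
    # A colon in second position is taken to mean a drive letter.
    if on_windows and len(name) > 1 and name[1] == ":":
        return "file:" + name[0] + "|" + name[2:]
    # Everything else is passed through untouched.
    return name


print(guess_file_url("C:\\fooo\\dtd_base\\", on_windows=True))
print(guess_file_url("http://example.com/x.dtd", on_windows=True))
```

Note that this produces the pipe form, which is exactly the form the RFC
doesn't mention, so it is a workaround rather than an agreed convention.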