[XML-SIG] Using PyExpat.py

Guido van Rossum guido@digicool.com
Tue, 20 Feb 2001 09:51:32 -0500


[Uche]
> Sorry.

Accepted. :-)

> Basically, it's what you do with the DOM, and especially how attributes, 
> system identifiers and other such creatures are interpreted.
> 
> Basically, parseFile or parseUri in a top-level URI is typically
> only a small cross-section of the usage pattern in any XML
> processor.  Other functions such as Stylesheet processing,
> XIncludes, xml:base, RDF, and pretty much anything else, gets these
> strings and are *required* to interpret these as URIs.
> 
> If they were originally interpreted purely as files, then all the
> points of confusion you pointed out are immediately compounded as
> the system tries to reconcile the relative URIs against the "base
> URI" which is actually a file system file.
> 
> This is actually a problem that I have seen people run into far more
> often than any worries about computers not having network
> connections.  I'be been sorely tempted to remove file support just
> because it eliminates confusion with the large body of XML
> processing that requires relative URI normalization and resolution.

OK, I think I understand the issues a bit better now.  When XML docs
contain references to other things, they typically use (absolute or
relative) URL references.  I'm guessing that this means that the
separator is always "/" and the parent directory is always represented
by "..".  Fine.

But I still maintain that the API used by the application should be
clear and explicit about whether it is naming a local file or a URI.
Then parseFile(f) can call parseURI("file:" + f) [1] internally and
parseURI can set the proper base URI.  [1]: don't take this literally;
reality is more complicated than tacking "file:" onto the front.  On
non-Unix platforms, use macurl2path.pathname2url(f) on the Mac, and
nturl2path on DOS/Windows.

--Guido van Rossum (home page: http://www.python.org/~guido/)