[XML-SIG] Using PyExpat.py

Guido van Rossum guido@digicool.com
Tue, 20 Feb 2001 09:41:57 -0500


> * Guido van Rossum
> | 
> | I'd like to drop support for URLs; I don't think the typical
> | computer is sufficiently networked to make this work well.

[Lars]
> Dropping support for URLs not really an option when dealing with XML.
> The XML recommendation states clearly that all system identifiers[1]
> are URIs in XML.
> 
> What this really means is that we have two cases to deal with:
> 
>   - XML software is provided a reference to an XML document
>   - XML document references as used internally by XML software and
>     also as passed back out to client software
> 
> In the second case the references must be URIs, since it is a
> deep-seated assumption in the entire XML family of specifications that
> all such references will be URIs. This is especially clear in the case
> of entity references (as Uche illustrated), but most other XML
> specifications are equally clear on this point, such as the infoset,
> XSLT, XBase and so on.

OK, I understand.

> Of course, in the first case there is no reason why it shouldn't be
> allowed to pass file names into the APIs to have them converted into
> URIs there. In fact, I think there is very good reason to do so, since
> my experience with the Java tools that require URIs have been fairly
> painful. (Who remembers the precise syntax for file URIs on all kinds
> of platforms anyway?)

OK.  That's useful information.

> Outlawing URIs, however is not really an option.

OK, I also understand that.

> [1] What most people would call 'references to external resources',
>     usually files.
> 
> | I would suggest to have separate APIs depending on the argument
> | type, e.g. p.parseFile(filename), p.parseURL(url),
> | p.parseStream(InputSource), p.parseString(text).
> 
> That may be a better option than to have a single function/method, but
> that is really separate from the issue of whether to allow URIs or
> not.

OK, so let's focus on this then: APIs must be clear in whether they
accept a URI or a filename, and not guess based on the form of the
string.

--Guido van Rossum (home page: http://www.python.org/~guido/)