[XML-SIG] Using PyExpat.py
Guido van Rossum
guido@digicool.com
Mon, 19 Feb 2001 17:34:16 -0500
> > I'd like to drop support for URLs; I don't think the typical computer
> > is sufficiently networked to make this work well.
>
> In this case, the typical computer user will have a great deal of trouble
> using any XML application in any language. Almost all of them use URIs as
> basis, and for good reason. Special support for local files are almost
> universally a mere convenience.
>
> Most XML processing specifications mandate that the URI of the XML
> entity that contains an infoset node is used as the basis for
> further processing. To me, this argues strongly for dropping local
> files rather than URIs if we must choose. Some XML specs would be
> very difficult to implement properly if the low-level tools became
> file-system-only readers.
Can you give more details of how this is used? I've got very limited
XML experience, and so far it all falls in the category of "here's a
file; give me a DOM tree for it" or "here's a DOM tree, write it to a
file". There are no URLs anywhere. Sometimes instead of a file it'll
be text data read from or written to a database. But no URLs.
> The Mac people should have spoken to the IETF a decade ago when URLs
> emerged, or a bit later when URIs came out. I suspect, again that
> if this is the case, they suffer much more pain in XML processing
> than is inflicted on them by PyXML.
That's a pretty intolerant attitude you're displaying there. They
need not suffer at all if at all times it is clear whether a name is a
URL or a filename. It's trying to fold the two namespaces into one
that I'm fighting here.
> > I would suggest to have separate APIs depending on the argument type,
> > e.g. p.parseFile(filename), p.parseURL(url),
> > p.parseStream(InputSource), p.parseString(text). (And no, Java
> > overloading wouldn't help much here, since three out of four APIs have
> > string arguments.)
>
> Sure, one can add a parseFile, but what do you do with
>
> <?xml version='1.0'?>
> <!DOCTYPE spam [
> <!ENTITY foo SYSTEM 'foo.bar'>
> ]>
> <spam>&foo;</spam>
>
> URI or file?
>
> Note that this is a trick question, and the "trick" is *exactly* my point.
So explain the trick. I don't know enough XML to understand what it
means. I don't even know which thing you are asking about! spam?
foo? foo.bar? &foo;?
--Guido van Rossum (home page: http://www.python.org/~guido/)