[XML-SIG] Using PyExpat.py

Guido van Rossum guido@digicool.com
Mon, 19 Feb 2001 17:34:16 -0500


> > I'd like to drop support for URLs; I don't think the typical computer
> > is sufficiently networked to make this work well.
> 
> In this case, the typical computer user will have a great deal of trouble 
> using any XML application in any language.  Almost all of them use URIs as 
> basis, and for good reason.  Special support for local files are almost 
> universally a mere convenience.
> 
> Most XML processing specifications mandate that the URI of the XML
> entity that contains an infoset node is used as the basis for
> further processing.  To me, this argues strongly for dropping local
> files rather than URIs if we must choose.  Some XML specs would be
> very difficult to implement properly if the low-level tools became
> file-system-only readers.

Can you give more details of how this is used?  I've got very limited
XML experience, and so far it all falls in the category of "here's a
file; give me a DOM tree for it" or "here's a DOM tree, write it to a
file".  There are no URLs anywhere.  Sometimes instead of a file it'll
be text data read from or written to a database.  But no URLs.

> The Mac people should have spoken to the IETF a decade ago when URLs
> emerged, or a bit later when URIs came out.  I suspect, again that
> if this is the case, they suffer much more pain in XML processing
> than is inflicted on them by PyXML.

That's a pretty intolerant attitude you're displaying there.  They
need not suffer at all if at all times it is clear whether a name is a
URL or a filename.  It's trying to fold the two namespaces into one
that I'm fighting here.

> > I would suggest to have separate APIs depending on the argument type,
> > e.g. p.parseFile(filename), p.parseURL(url),
> > p.parseStream(InputSource), p.parseString(text).  (And no, Java
> > overloading wouldn't help much here, since three out of four APIs have
> > string arguments.)
> 
> Sure, one can add a parseFile, but what do you do with
> 
> <?xml version='1.0'?>
> <!DOCTYPE spam [
>   <!ENTITY foo SYSTEM 'foo.bar'>
> ]>
> <spam>&foo;</spam>
> 
> URI or file?
> 
> Note that this is a trick question, and the "trick" is *exactly* my point.

So explain the trick.  I don't know enough XML to understand what it
means.  I don't even know which thing you are asking about!  spam?
foo?  foo.bar?  &foo;?

--Guido van Rossum (home page: http://www.python.org/~guido/)