[issue2174] xml.sax.xmlreader does not support the InputSource protocol

Yitz Gale report at bugs.python.org
Sun Feb 24 15:53:30 CET 2008


Yitz Gale added the comment:

So I think there are two possibilities:

1. Use a special value for getSourceEncoding(),
like "unicode", to indicate that this is a
unicode character stream and not a byte stream.

2. Provide yet another method in the XMLReader
interface: sourceIsCharacterStream(), returning
a bool.
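
A rough sketch of what option 2 could look like,
just to make it concrete. Only the method name
sourceIsCharacterStream() comes from the proposal
above. The SketchReader class and its behaviour
are hypothetical and not part of xml.sax.

```python
from io import StringIO
from xml.sax.xmlreader import InputSource, XMLReader


class SketchReader(XMLReader):
    """Hypothetical reader; it records the source but parses nothing."""

    def __init__(self):
        super().__init__()
        self._source = None

    def parse(self, source):
        # A real reader would start parsing here; this sketch only
        # remembers which InputSource it was handed.
        self._source = source

    def sourceIsCharacterStream(self):
        # True when the caller supplied already-decoded text.
        return (self._source is not None
                and self._source.getCharacterStream() is not None)


source = InputSource()
source.setCharacterStream(StringIO("<doc/>"))
reader = SketchReader()
reader.parse(source)
is_text = reader.sourceIsCharacterStream()
```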

There is a more drastic option:

3. Since expat doesn't support this stuff
anyway, and perhaps not too many people
have written parsers that do support it,
dumb down the InputSource interface.

Specifically, deprecate setCharacterStream(),
getCharacterStream(), setEncoding() and
getEncoding(), none of which are used by
expat. Parsers should read the XML from
the byte stream and use that to determine
the encoding.
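
To illustrate "read the XML from the byte stream
and use that to determine the encoding", here is
a simplified sketch. The function name
sniff_xml_encoding is made up for this example,
it only looks at a BOM and an ASCII-compatible
XML declaration, and it is not what expat does
internally.

```python
import codecs
import re

# UTF-32 must be tested before UTF-16: the UTF-32 LE BOM
# begins with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

_DECL = re.compile(
    rb'<\?xml[^>]*\bencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_xml_encoding(data):
    """Guess the encoding of an XML document from its first bytes."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii")
    # No BOM and no declaration: XML defaults to UTF-8.
    return "utf-8"


raw = b'<?xml version="1.0" encoding="iso-8859-1"?><a>\xe9</a>'
text = raw.decode(sniff_xml_encoding(raw))
```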

That may upset some implementors of XML
libraries though. They would each have to go
to some trouble to provide their own
proprietary and possibly incompatible
mechanisms for this, if they need it.

Perhaps a compromise fourth path would
be to have subclasses of InputSource for
the two cases of character stream and
byte stream.
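
Something like the following is what I have in
mind for the subclass idea. The class names
ByteStreamSource and CharacterStreamSource are
hypothetical, and raising TypeError is just one
possible way to reject the wrong kind of stream.

```python
from io import BytesIO, StringIO
from xml.sax.xmlreader import InputSource


class ByteStreamSource(InputSource):
    """Hypothetical: carries raw bytes; the parser finds the encoding."""

    def setCharacterStream(self, charfile):
        raise TypeError("ByteStreamSource accepts only a byte stream")


class CharacterStreamSource(InputSource):
    """Hypothetical: carries text that has already been decoded."""

    def setByteStream(self, bytefile):
        raise TypeError("CharacterStreamSource accepts only a character stream")


def describe(source):
    # A parser could dispatch on the type instead of probing both
    # getByteStream() and getCharacterStream().
    if isinstance(source, CharacterStreamSource):
        return "characters"
    if isinstance(source, ByteStreamSource):
        return "bytes"
    return "unspecified"


byte_source = ByteStreamSource()
byte_source.setByteStream(BytesIO(b"<doc/>"))

char_source = CharacterStreamSource()
char_source.setCharacterStream(StringIO("<doc/>"))

kinds = [describe(byte_source), describe(char_source),
         describe(InputSource())]
```

Existing code that passes a plain InputSource
would keep working, since the base class is
left untouched in this sketch.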

__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue2174>
__________________________________

