[New-bugs-announce] [issue26838] sax.xmlreader.InputSource.setCharacterStream() does not work?

Alan Jenkins report at bugs.python.org
Sun Apr 24 11:58:51 EDT 2016


New submission from Alan Jenkins:

python3-3.4.3-5.fc23-x86_64

This is as far as I have dug, starting from <https://github.com/kurtmckee/feedparser/issues/30>.  I experimented with using setCharacterStream() instead of setByteStream().

setCharacterStream() appears in the documentation, but actually using it fails.  From the docstring:

>>> help(InputSource)
 |  setCharacterStream(self, charfile)
 |      Set the character stream for this input source. (The stream
 |      must be a Python 2.0 Unicode-wrapped file-like that performs
 |      conversion to Unicode strings.)
 |      
 |      If there is a character stream specified, the SAX parser will
 |      ignore any byte stream and will not attempt to open a URI
 |      connection to the system identifier.

Actually using an InputSource set up this way errors out as follows:

  File "/home/alan/.local/lib/python3.4/site-packages/feedparser-5.2.1-py3.4.egg/feedparser/api.py", line 236, in parse
  File "/usr/lib64/python3.4/site-packages/drv_libxml2.py", line 146, in parse
    source = saxutils.prepare_input_source(source)
  File "/usr/lib64/python3.4/xml/sax/saxutils.py", line 355, in prepare_input_source
    sysidfilename = os.path.join(basehead, sysid)
  File "/usr/lib64/python3.4/posixpath.py", line 79, in join
    if b.startswith(sep):
AttributeError: 'NoneType' object has no attribute 'startswith'

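This happens because prepare_input_source() never consults the character stream.  It only checks the byte stream, then falls back to the system ID, which is None here: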

def prepare_input_source(source, base=""):
    """This function takes an InputSource and an optional base URL and
    returns a fully resolved InputSource object ready for reading."""

    if isinstance(source, str):
        source = xmlreader.InputSource(source)
    elif hasattr(source, "read"):
        f = source
        source = xmlreader.InputSource()
        source.setByteStream(f)
        if hasattr(f, "name") and isinstance(f.name, str):
            source.setSystemId(f.name)

    if source.getByteStream() is None:
        sysid = source.getSystemId()
        basehead = os.path.dirname(os.path.normpath(base))
        sysidfilename = os.path.join(basehead, sysid)
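The failing branch can be replayed by hand. This is a minimal sketch, not the library's code path: it builds an InputSource with only a character stream, then repeats the three statements from the excerpt above. It deliberately does not call prepare_input_source(), since other Python releases may treat character streams differently. The variable name `error` is introduced here for illustration only.

```python
import io
import os
from xml.sax import xmlreader

# An InputSource configured only through setCharacterStream().
source = xmlreader.InputSource()
source.setCharacterStream(io.StringIO("<root/>"))

# State seen by the excerpted branch: no byte stream, no system ID.
byte_stream = source.getByteStream()   # None, so the branch is taken
sysid = source.getSystemId()           # None, nothing ever set it

# Same statements as in the excerpt, with the default base="".
basehead = os.path.dirname(os.path.normpath(""))
error = None
try:
    os.path.join(basehead, sysid)
except (AttributeError, TypeError) as exc:
    # Python 3.4 raises AttributeError, as in the traceback above;
    # the exact exception type may differ on other versions.
    error = exc

print(type(error).__name__, error)
```

The character stream is still available through source.getCharacterStream() at this point; the branch simply never looks at it.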

----------
components: XML
messages: 264111
nosy: sourcejedi
priority: normal
severity: normal
status: open
title: sax.xmlreader.InputSource.setCharacterStream() does not work?
versions: Python 3.4

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue26838>
_______________________________________

