[issue18850] xml.etree.ElementTree accepts control chars.

Michele Orrù report at bugs.python.org
Tue Aug 27 17:04:22 CEST 2013


Michele Orrù added the comment:

> Michele, could you elaborate how you would exploit this issue as a security risk?
Sure. What I meant in my message is: assume you have a script that simply stores each message it receives (from stdin, from a TCP stream, whatever) inside an XML tree like
'<text>{message1}</text><text>{message2}</text>',
and prints the tree on SIGINT.

What I would expect is for the XML document not to allow control chars, since they are "restricted and discouraged", and for the behaviour to be consistent with lxml.
What happens instead is that control chars are passed through unhandled, and thus anybody can send control chars to my terminal. Changing the terminal title is a trivial example.
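
To make the scenario concrete, here is a minimal sketch, assuming Python 3. The message content and the helper strip_invalid_xml_chars are made up for illustration and are not part of ElementTree; the helper simply drops everything outside the XML 1.0 "Char" production, which is one possible workaround and not necessarily what a fix in the library would do.

```python
import re
import xml.etree.ElementTree as ET

# Everything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text):
    """Hypothetical helper: drop characters that XML 1.0 does not allow."""
    return _INVALID_XML_CHARS.sub("", text)


# Untrusted input carrying an xterm-style "set window title" sequence
# (an assumed example payload).
message = "hello\x1b]0;owned\x07"

# What the script described above does: store the message as-is.
root = ET.Element("log")
ET.SubElement(root, "text").text = message
raw = ET.tostring(root, encoding="unicode")

# Same tree, but with the text filtered before it is stored.
safe_root = ET.Element("log")
ET.SubElement(safe_root, "text").text = strip_invalid_xml_chars(message)
safe = ET.tostring(safe_root, encoding="unicode")

# repr() is used so that this demo itself never writes raw escape
# bytes to the terminal.
print(repr(raw))
print(repr(safe))
```

Comparing the two printed representations shows whether the serializer let the escape sequence through; printing raw directly, without repr(), is what would hand those bytes to the terminal.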

For sure an echo server may have the same issue, but the premises are different, because there I expect to print just a byte stream.
Mentioning this fact in the documentation may be a possible solution, but I believe that keeping consistency with lxml is the better way.

> I mean, I can easily create a (non-)XML-document with control characters manually, and the parser would reject it.
False? The parser is *not* rejecting control chars.
 
> What part of the create-to-serialise process exactly is a problem here?
ElementTree.tostring().

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue18850>
_______________________________________

