[issue38941] xml.etree.ElementTree.Element inconsistent warning for bool

Stefan Behnel report at bugs.python.org
Sun Dec 1 09:15:33 EST 2019


Stefan Behnel <stefan_ml at behnel.de> added the comment:

Not so quick. :) You're probably aware of the details, but still, let me state clearly what this is about, to make sure that we're all on the same page here.
(I'm also asking in Serhiy, because he did quite some work on ET in the past and has experience with it.)

The Element class in ElementTree represents an XML tag, but it also mimics a sequence container of child elements. Its behaviour is quite list-like, in that it supports indexing and slicing of its children "list", and also has an "append()" method to add one at the end. All of this is a nice and simple API, so what's the catch?

The gotcha is the behaviour of "__bool__". Following the sequence protocol, the truth value of an Element depends on its children. No children – false. Has children, true. While this makes sense from the sequence protocol side, I don't think it is what users expect. In Python, it is very common to say "if something: …" in order to test if "something" is "a thing". When doing this with Elements, users usually don't think of the element's children but the Element itself, often because they got it back from from kind of tree search, as in the OP's example using ".find()".

I would argue that this is brittle, because it works when testing with Elements that happen to have children, but it fails when the Element that was found has no children. The intention of the "FutureWarning" was to allow changing the behaviour at some point to always return true as the truth value, i.e. to remove the "__bool__" implementation all together.

This
1) fixes the behaviour for broken code that intended to test if it "is a thing"
2) does not change the behaviour when testing Elements (for whatever reason) that have children
3) returns true instead of false when truth-testing Elements that have no children, whether the code originally expected this or not.

The main question now is: what is more common? Code that only accidentally works because it didn't get in touch with child-free (leaf-)elements yet? Or code that would start failing with this change because it really wants to find out if an Element has children by truth-testing it.

I'm sure that both exist, because I have seen both, and my very limited *feeling* from what I've seen is that there's more brittle code out there than intentional code that truth-tests Elements, and that future users are also more likely to get it wrong until they get bitten by it and learn the actual meaning of the truth-test.

But, we don't know, and this isn't something to easily find out via a code search. I'm personally more leaning towards changing the behaviour to help new users, but lacking enough evidence, not changing something long-standing is always the safer choice. I'm aware that I will have to take the "final" decision in the end, but would be happy to hear some more intuitions, ideas or even data points first.

----------
nosy: +serhiy.storchaka

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue38941>
_______________________________________


More information about the Python-bugs-list mailing list