weirdness with list()

Cameron Simpson cs at cskk.id.au
Sun Feb 28 18:50:50 EST 2021


On 28Feb2021 10:51, Peter Otten <__peter__ at web.de> wrote:
>On 28/02/2021 01:17, Cameron Simpson wrote:
>>I noticed that it was stalling, and investigation revealed it was
>>stalling at this line:
>>
>>     subboxes = list(self)
>>
>>when doing the MDAT box. That box (a) has no subboxes at all and (b) has
>>a very large __len__ value.
[...]
>
>list(iter(self))
>
>should work, too. It may be faster than the explicit loop, but also
>defeats the list allocation optimization.

Yes, very neat. I went with [subbox for subbox in self] last night, but 
the above is better.
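
For the archive, a small sketch of why the iter() version sidesteps the
problem. It assumes CPython, where list() asks its argument for a size
hint and so ends up calling __len__. The Box class, nbytes and len_calls
are made-up stand-ins for illustration, not my real code.

```python
class Box:
    """Hypothetical box whose __len__ is a byte count, not an item count."""

    def __init__(self, nbytes, subboxes=()):
        self.nbytes = nbytes
        self.subboxes = list(subboxes)
        self.len_calls = 0  # how often __len__ has been consulted

    def __len__(self):
        self.len_calls += 1
        return self.nbytes

    def __iter__(self):
        # A generator iterator has neither __len__ nor __length_hint__.
        yield from self.subboxes


box = Box(1000, ["a", "b"])

# list(box) sees an object with __len__ and uses it as a size hint.
via_list = list(box)
calls_direct = box.len_calls

# list(iter(box)) only sees the generator, so Box.__len__ is never asked.
via_iter = list(iter(box))
calls_iter = box.len_calls - calls_direct
```

Both calls build the same list; only the first one consults __len__,
which is where a huge byte count turns into a huge preallocation.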

[...]
>>Still, thoughts? I'm interested in any approaches that would have let
>>me make list() fast while keeping __len__==binary_length.
>>
>>I'm accepting that __len__ != len(__iter__) is a bad idea now, though.
>
>Indeed. I see how that train wreck happened -- but the weirdness is not
>the list behavior.

I agree. The only weirdness is that list(empty-iterable) took a very 
long time. Weirdness in the eye of the beholder I guess.

>Maybe you can capture the intended behavior of your class with two
>classes, a MyIterable without length that can be converted into MyList
>as needed.

Hmm. Maybe.
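
One possible reading of that suggestion, purely as a sketch. The names
BoxIterable, BoxList, binary_length and as_list are invented here, and
this is not something I have actually written:

```python
class BoxList(list):
    """An ordinary list of subboxes; len() is the item count."""


class BoxIterable:
    """Iterable only: no __len__, so a byte count can't pose as an item count."""

    def __init__(self, subboxes, binary_length):
        self._subboxes = tuple(subboxes)
        # The serialised size lives in a plain attribute instead of __len__.
        self.binary_length = binary_length

    def __iter__(self):
        return iter(self._subboxes)

    def as_list(self):
        # Convert to the list flavour only when list behaviour is wanted.
        return BoxList(self)


mdat = BoxIterable([], binary_length=10**12)
subboxes = list(mdat)        # no __len__ on mdat to mislead list()
as_list = mdat.as_list()
```

The cost is that len(box) no longer means "bytes to serialise", which is
the bit I was reluctant to give up.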

What I've done so far is:

The aforementioned [subbox for subbox in self], which I'll replace with
your nicer one today.

Given my BinaryMixin a transcribed_length method which measures the 
length of the binary transcription. For small things that's actually 
fairly cheap, and totally general. By default it is aliased to __len__, 
which still seems a natural thing - the length of the binary object is 
the number of bytes required to serialise it.

The alias lets me override transcribed_length() for bulky things like 
MDAT where (a) transcription _is_ expensive and (b) the source data may 
not be present anyway ("skip" mode), but the measurement of the data 
from the parse is recorded.

And I can disassociate __len__ from transcribed_length() if need be in 
subclasses. I've not done that, given the iter() shuffle above.
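
Roughly, the arrangement looks like the sketch below. Only the names
BinaryMixin and transcribed_length come from my code; transcribe(),
SmallBox, BulkBox and parsed_length are hypothetical, and I've written
__len__ as a delegating method here (an assumption made for the sketch)
so that a subclass override of transcribed_length() is picked up.

```python
class BinaryMixin:
    def transcribe(self):
        """Yield the bytes chunks of the serialised form (hypothetical hook)."""
        raise NotImplementedError

    def transcribed_length(self):
        # General but potentially expensive: serialise and count the bytes.
        return sum(len(chunk) for chunk in self.transcribe())

    def __len__(self):
        # Default: the length of the object is its serialised size.
        return self.transcribed_length()


class SmallBox(BinaryMixin):
    def __init__(self, payload):
        self.payload = payload

    def transcribe(self):
        yield len(self.payload).to_bytes(4, "big")  # 4-byte length prefix
        yield self.payload


class BulkBox(BinaryMixin):
    """Stand-in for something like MDAT parsed in "skip" mode."""

    def __init__(self, parsed_length):
        self.parsed_length = parsed_length  # recorded during the parse

    def transcribe(self):
        raise RuntimeError("source data was skipped, nothing to transcribe")

    def transcribed_length(self):
        # Cheap override: report the measured size without transcribing.
        return self.parsed_length


small_len = len(SmallBox(b"abc"))
bulk_len = len(BulkBox(10**9))
```

A subclass that wants len() to mean something else can define its own
__len__ and leave transcribed_length() alone.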

Cheers,
Cameron Simpson <cs at cskk.id.au>

