weirdness with list()

MRAB python at mrabarnett.plus.com
Sun Feb 28 19:06:31 EST 2021


On 2021-02-28 23:28, Peter Otten wrote:
> On 28/02/2021 23:33, Marco Sulla wrote:
>> On Sun, 28 Feb 2021 at 01:19, Cameron Simpson <cs at cskk.id.au> wrote:
>>> My object represents an MDAT box in an MP4 file: it is the ludicrously
>>> large data box containing the raw audiovideo data; for a TV episode it
>>> is often about 2GB and a movie is often 4GB to 6GB.
>>> [...]
>>> That length is presented via the object's __len__ method
>>> [...]
>>>
>>> I noticed that it was stalling, and investigation revealed it was
>>> stalling at this line:
>>>
>>>      subboxes = list(self)
>>>
>>> when doing the MDAT box. That box (a) has no subboxes at all and (b) has
>>> a very large __len__ value.
>>>
>>> BUT... It also has an __iter__ method, which like any Box iterates over
>>> the subboxes. For MDAT that is implemented like this:
>>>
>>>      def __iter__(self):
>>>          yield from ()
>>>
>>> What I was expecting was pretty much instant construction of an empty
>>> list. What I was getting was a very time consuming (10 seconds or more)
>>> construction of an empty list.
>>
>> I can't reproduce this. Am I missing something?
>>
>> marco at buzz:~$ python3
>> Python 3.6.9 (default, Jan 26 2021, 15:33:00)
>> [GCC 8.4.0] on linux
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> class A:
>> ...     def __len__(self):
>> ...             return 1024**3
>> ...     def __iter__(self):
>> ...             yield from ()
>> ...
>> >>> a = A()
>> >>> len(a)
>> 1073741824
>> >>> list(a)
>> []
>> >>>
>>
>> It takes milliseconds to run list(a)
> 
> Looks like you need at least Python 3.8 to see this. Quoting
> https://docs.python.org/3/whatsnew/3.8.html:
> 
> """
> The list constructor does not overallocate the internal item buffer if
> the input iterable has a known length (the input implements __len__).
> This makes the created list 12% smaller on average. (Contributed by
> Raymond Hettinger and Pablo Galindo in bpo-33234.)
> """
> 
I'm not seeing a huge problem here:

Python 3.9.2 (tags/v3.9.2:1a79785, Feb 19 2021, 13:44:55) [MSC v.1928 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import time
>>> class A:
...     def __len__(self):
...         return 1024**3
...     def __iter__(self):
...         yield from ()
...
>>> a = A()
>>> len(a)
1073741824
>>> s = time.time()
>>> list(a)
[]
>>> print(time.time() - s)
0.16294455528259277
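
For anyone who wants to compare versions or platforms, here is a minimal sketch of a probe, assuming CPython. It counts how often list() asks the object for its length and times the call with time.perf_counter() rather than time.time(). The names Probe and timed_list are made up for illustration and are not part of Cameron's code. The claimed lengths are kept small so the script is safe to run on a machine with little memory.

```python
import time


class Probe:
    """Hypothetical stand-in for the MDAT box: claims a length, yields nothing."""

    def __init__(self, claimed_len):
        self.claimed_len = claimed_len
        self.len_calls = 0  # how many times __len__ was invoked

    def __len__(self):
        self.len_calls += 1
        return self.claimed_len

    def __iter__(self):
        # Same empty iterator as in the original report.
        yield from ()


def timed_list(obj):
    """Return (result, seconds) for list(obj), measured with perf_counter()."""
    start = time.perf_counter()
    result = list(obj)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    for claimed_len in (0, 10**3, 10**6):
        probe = Probe(claimed_len)
        result, elapsed = timed_list(probe)
        print(f"claimed_len={claimed_len} result={result} "
              f"len_calls={probe.len_calls} elapsed={elapsed:.6f}s")
```

If len_calls is nonzero, list() consulted __len__ before iterating, which is what you would expect if the claimed length is being used to size the internal buffer. How long that takes for a very large claimed length probably depends on the Python version and on how the platform's allocator handles a large request, so the numbers are only comparable on the same machine.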