generator expressions: performance anomaly?

Antoon Pardon apardon at forel.vub.ac.be
Tue Jan 18 07:42:39 EST 2005


Op 2005-01-18, Steve Holden schreef <steve at holdenweb.com>:
> Antoon Pardon wrote:
>
>> Op 2005-01-18, Nick Coghlan schreef <ncoghlan at iinet.net.au>:
>> 
>>>Raymond Hettinger wrote:
>>>
>>>>[Delaney, Timothy C]
>>>>
>>>>
>>>>>Nick's other suggestion - that genexps propagate __len__ - might
>>>>>still be interesting. Of course, it would only be applicable for
>>>>>unconditional genexps(i.e. no if clause).
>>>>
>>>>Length transparency for iterators is not as general as one would expect.  I once
>>>>spent a good deal of effort exploring where it made sense, and I was surprised
>>>>to find that it only rarely works out.  Length transparency is an unexpectedly
>>>>thorny subject with many dead-ends which precludes a fully general solution such
>>>>as that proposed by Nick.
>>>>
>>>>For a recap of my research, see the docstring for Lib/test/test_iterlen.py .
>>>
>>>"""The situation slightly more involved whenever an object allows length 
>>>mutation during iteration. """
>>>
>>>Ouch. Nice understatement.
>>>
>>>It's rather unfortunate that we can't make use of the length information even 
>>>when the source *doesn't* mutate, though. I'll have to think some more to see if 
>>>I can come up with any concrete ideas for you to shoot down :)
>> 
>> 
>> Something else I was thinking about. I think it would be nice if the
>> python compilor could figure out whether a genexp in a list or tuple
>> expression always generates the same list or tuple and then instead
>> of generating code would generate the list or tuple in place.
>> 
> Since it doesn't yet optimize 2+5 to a constant-folded 7 you should 
> realize that you are suggesting a large increase in the compiler's 
> analytical powers.

Well I can dream, cant I?

> I agree it would be nice under certain circumstances, but don't forget 
> that unlike list comprehensions (for which it would be even nicer) the 
> whole point of generator expressions is often to defer the generation of 
> the individual items until they are required and thereby relieve stress 
> on memory.
>
> As an edge case to demonstrate the point, what about a constant but 
> infinite sequence?

Maybe I was not clear enough. I meant to limit it to cases such as

   lst = list(genexp)
   tpl = tuple(genexp)


Since in such cases the object is build in memory any way, I don't
think it would be a problem of having them prebuilt in memory, or am
I missing something?

-- 
Antoon Pardon



More information about the Python-list mailing list