152 is faster than 221? I think not...

Tue Aug 28 10:28:35 EDT 2001

On Tue, 28 Aug 2001, Alex Martelli wrote:

> "Ignacio Vazquez-Abrams" <ignacio at openservices.net> wrote in message
> news:mailman.998996203.25913.python-list at python.org...
>     ...
> > It's quite simple. file.readlines() generates a true list in one go, whereas
> > xreadlines.xreadlines() creates a generator that has to be called each time
> > you want a line. Generators will never be faster than data.
>
> ...except when the data takes up enough physical memory to cause
> page faults, in which case it doesn't take much for the memory-lean
> generator to be faster.  In theory a similar effect could show up
> on a much smaller scale with cache-faults, but I've never observed
> that myself, to my knowledge.
>
>
> > If you test range() versus xrange() for very large values you'll find the same
> > thing, even in the same version of Python.
>
> Exactly -- the same thing: as long as the space consumed by range
> is not enough to overfill your physical memory and thus cause
> page faults, range will be faster -- over that, xrange will.  To
> test this, better use a machine with not too much physical RAM,
> or a platform which lets you hard-constrain the amount of
> physical pages devoted to a given process.
>
>
> Alex

Very true in both cases. For a graphic demonstration of this, instantiate both
xrange(1e8) and range(1e8). (Hint: don't ACTUALLY try range(1e8) unless you
feel like hosing your computer. At 4 bytes apiece it takes around 400MB just to
hold the raw integers generated; you don't want to know how much it takes to
hold the whole list...)
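
A rough sketch of the arithmetic above, written for a current Python 3
interpreter rather than the 2001-era one discussed here. In Python 3, range()
is lazy like the old xrange(), and list(range(n)) stands in for the old
range(). The helper names and the 4-bytes-per-integer default (a 32-bit build)
are assumptions made for illustration only.

```python
import sys


def raw_int_bytes(n, bytes_per_int=4):
    """Bytes needed to hold n raw machine integers.

    bytes_per_int=4 is an assumption (32-bit integers).
    """
    return n * bytes_per_int


def lazy_vs_list_sizes(n):
    """Return (size of lazy range object, size of materialised list) in bytes."""
    lazy = range(n)      # lazy in Python 3, like the old xrange()
    eager = list(lazy)   # a real list, like the old range()
    return sys.getsizeof(lazy), sys.getsizeof(eager)


if __name__ == "__main__":
    # 1e8 integers at 4 bytes each
    print(raw_int_bytes(10**8) / 1e6, "MB for the raw integers alone")
    # Use a much smaller n so that building the list is harmless
    lazy_size, list_size = lazy_vs_list_sizes(10**5)
    print("lazy range object:", lazy_size, "bytes")
    print("materialised list:", list_size, "bytes")
```

Note that sys.getsizeof() on a list counts only the list's own array of
references, not the integer objects it points to, so the true footprint of the
whole list is larger still.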

-- 
Ignacio Vazquez-Abrams  <ignacio at openservices.net>