[Tutor] __iter__ loops, partitioning list among children

Eric Abrahamsen eric at ericabrahamsen.net
Tue Aug 26 07:36:40 CEST 2008


I do apologize for the large quantities of confusing description –  
articulating the problem here has helped me understand exactly what it  
is I'm after (though it hasn't improved my code!), and I've got a  
better grasp of the problem now than I did when I first asked.

It isn't so much that I need to protect against crazy users, but that  
the pattern I'm making is very flexible, and could be used in vastly  
different ways. I want to make sure that it operates on efficient  
principles, so that people will get the best performance out of it no  
matter how they use it.

So my test case: a Month has a 'child' attribute pointing at Week,  
which has a 'child' attribute pointing at Day, so they all know what  
kind of child instances iteration should produce. With nested loops, a  
Month produces one Week, that Week produces seven Days, then the next  
Week is produced, it makes seven more Days, etc. That much is easy.

Then there's self.events. My original code looped over all of  
self.events for each child produced. A Month loops over its events  
four times, a Week seven times. This was the straightforward  
implementation, but it seemed inefficient. (I also, as you point out,  
might have been wrong about the way Django QuerySets work.) My thought  
was that only one loop over self.events should be necessary, in  
theory, since they're sorted by date.

A for loop creates an iterator from a sequence and calls next() on it,  
and it creates an entirely new iterator each time you start a new for  
loop: each for loop starts from the beginning of the sequence. But if  
you create your own iterator from the sequence and run a for loop on  
it, then using break to jump out of the for loop should leave the  
iterator just where you left it, since it maintains state. Doing  
another for loop on it (the next time through the date-based while  
loop), should pick up where it left off. That's why the line  
"iterevents = iter(self.events)" is outside of the while loop: it is  
only created once, and the loops later in the function all make use of  
the same iterator, instead of creating a new one every time.
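
A minimal sketch of that behaviour, assuming a plain list of integers as a stand-in for the date-sorted events (the list contents and the names first and second are illustrative only, not taken from the real code):

```python
events = [1, 2, 3, 4, 5, 6]    # stand-in for events sorted by date
iterevents = iter(events)      # created once, outside any loop

first = []
for e in iterevents:
    if e > 3:
        break                  # e == 4 has already been pulled from the iterator
    first.append(e)

# A second loop over the same iterator resumes after the last item consumed.
second = list(iterevents)

print(first)    # [1, 2, 3]
print(second)   # [5, 6]
```

The second loop does pick up where the first stopped, but the item that triggered the break (4 here) was already consumed and appears in neither list. Whether that explains the missing events is a guess, but it is one thing worth checking.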

I'm pretty sure this works in theory, because calling event_count() on  
the Weeks as they come out returns the correct number of events. But,  
for some reason, those events are not making it into the Day children.

I had originally assumed that a QuerySet pulled objects out of the  
database in a rolling fashion – i.e. iterating on a Month would first  
draw a Week's worth of events from the database, then another Week,  
then two more. But if it loads them all at first access, then I might  
as well just call list() on the QuerySet at object instantiation, and  
save myself some trouble.

I hope that's a little clearer. My central issue is maintaining my  
place in the self.events loop, and only advancing it as far as the  
date-based while loop advances. Whether that's possible or not...
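
One possible way to do this, sketched under assumptions: the events are already sorted, each bucket covers a half-open range [current, current + step), and the function name partition_sorted, its parameters, and the _DONE sentinel are all hypothetical. It keeps a single iterator and a one-item lookahead, so an event that belongs to the next bucket is held over instead of being dropped.

```python
from datetime import date, timedelta

_DONE = object()  # sentinel marking an exhausted iterator (hypothetical helper)


def partition_sorted(events, start, stop, step, key=lambda e: e):
    """Yield (bucket_start, bucket_events) for each step-sized bucket
    in [start, stop), making a single pass over the sorted events."""
    it = iter(events)
    pending = next(it, _DONE)          # one-item lookahead
    current = start
    while current < stop:
        upper = current + step
        bucket = []
        # Consume events only while they fall before this bucket's upper bound.
        while pending is not _DONE and key(pending) < upper:
            if key(pending) >= current:  # events before `start` are skipped
                bucket.append(pending)
            pending = next(it, _DONE)
        yield current, bucket
        current = upper


# Usage: plain dates stand in for event objects.
sample = [date(2008, 8, 1), date(2008, 8, 1), date(2008, 8, 3), date(2008, 8, 9)]
weeks = list(partition_sorted(sample, date(2008, 8, 1), date(2008, 8, 15),
                              timedelta(days=7)))
```

Because the lookahead item stays in pending until a bucket claims it, the iterator only advances as far as the date-based loop does. Nesting this (a Week partitioning its own bucket into Days) would need each level to work from the list it was handed, which is an assumption about the design rather than something the original code is known to do.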


Thanks again,
Eric

On Aug 26, 2008, at 1:12 AM, Kent Johnson wrote:

> I'm not following your code very well. I don't understand the
> relationship between the first loop and the iter_children() function.
>
> A couple of things that might help:
> - Django QuerySets can be qualified with additional tests, so you
> could have each of your month/week/etc classes have its own correctly
> qualified QuerySet. This will result in one database query for each
> event class, and multiple copies of the actual events.
> - When you start iterating a QuerySet, it fetches all the model
> instances into a list. I think you are trying to use iterators to
> prevent this fetch but Django doesn't work that way. You might as well
> just use the list.
> - Iterators can't be restarted; I think that is why your latest
> iter_children() doesn't work as you expect.
>
> Can you say some more about why you are doing this? Where do all the
> initial constraints come from? Do you really have to be so careful to
> protect against a 'madman' user? Perhaps you could set a limit on the
> number of events you will retrieve and use a count() on the QuerySet
> to ensure that not too many records have been fetched. Maybe you
> should try a straightforward implementation and when you have
> something working you can worry about making it better.
>
> Kent


