A gnarly little python loop
rusi
rustompmody at gmail.com
Mon Nov 12 02:09:31 EST 2012
On Nov 11, 3:58 am, Roy Smith <r... at panix.com> wrote:
> I'm trying to pull down tweets with one of the many twitter APIs. The
> particular one I'm using (python-twitter), has a call:
>
> data = api.GetSearch(term="foo", page=page)
>
> The way it works, you start with page=1. It returns a list of tweets.
> If the list is empty, there are no more tweets. If the list is not
> empty, you can try to get more tweets by asking for page=2, page=3, etc.
> I've got:
>
> page = 1
> while 1:
>     r = api.GetSearch(term="foo", page=page)
>     if not r:
>         break
>     for tweet in r:
>         process(tweet)
>     page += 1
>
> It works, but it seems excessively fidgety. Is there some cleaner way
> to refactor this?
This is a classic problem -- structure clash of parallel loops -- and
Steve Howell has given the classic solution using the fact that
generators in Python simulate/implement lazy lists.
As David Beazley explains (http://www.dabeaz.com/coroutines/),
coroutines are more general than generators and you can use those if
you prefer.
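A minimal sketch of the generator approach applied to the paging loop
quoted above (not necessarily the code Steve posted). The name
iter_tweets is made up here, and fake_search is a hypothetical stand-in
for api.GetSearch so that the example runs without a network:

```python
from itertools import count

def iter_tweets(search, term):
    """Yield tweets one at a time, asking for page 1, 2, 3, ... until a
    page comes back empty. `search` is any callable taking the same
    keyword arguments as the api.GetSearch call quoted above."""
    for page in count(1):
        batch = search(term=term, page=page)
        if not batch:
            return
        for tweet in batch:
            yield tweet

# Hypothetical stand-in for api.GetSearch, so the sketch runs offline.
def fake_search(term, page):
    pages = {1: ["t1", "t2"], 2: ["t3"]}
    return pages.get(page, [])

for tweet in iter_tweets(fake_search, "foo"):
    print(tweet)
```

The paging logic now lives in one place and the caller is left with a
single flat for-loop over tweets.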
The classic problem used to be stated like this:
The input is on cards of 80 columns.
It needs to be copied onto a printer with lines of 132 columns.
The structure clash arises because after reading 80 chars a new card
has to be read; after printing 132 chars a linefeed has to be given.
To pythonize the problem, let's replace the 80, 132 by 3, 4, i.e. take the
char-square
abc
def
ghi
and produce
abcd
efgh
i
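For comparison, here is a sketch of the 3-to-4 regrouping done the
generator (pull) way. The function name regroup is made up for this
illustration; chain.from_iterable and islice are from the standard
itertools module:

```python
from itertools import chain, islice

def regroup(lines, width):
    """Yield strings of `width` characters taken from the concatenated
    input lines; the last one may be shorter."""
    chars = chain.from_iterable(lines)   # lazy stream of single characters
    while True:
        chunk = "".join(islice(chars, width))
        if not chunk:                    # input exhausted
            return
        yield chunk

for out_line in regroup(["abc", "def", "ghi"], 4):
    print(out_line)
```

Neither loop needs to know the other's width: the input side flattens
the cards into characters and the output side cuts that stream into
lines.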
The important difference (explained nicely by Beazley) is that with
generators the for-loop pulls values out of the generator, whereas with
coroutines the 'generator' pushes values into the consuming coroutines.
---------------
from __future__ import print_function

s = ["abc", "def", "ghi"]

# Coroutine-infrastructure from pep 342
def consumer(func):
    def wrapper(*args, **kw):
        gen = func(*args, **kw)
        next(gen)    # advance to the first yield; works in Python 2.6+ and 3
        return gen
    return wrapper

@consumer
def endStage():
    # Output side: after every 4 characters received, emit a linefeed.
    while True:
        for i in range(0, 4):
            print((yield), sep='', end='')
        print("\n", sep='', end='')

def genStage(s, target):
    # Input side: push each 3-character "card" one character at a time.
    for line in s:
        for i in range(0, 3):
            target.send(line[i])

if __name__ == '__main__':
    genStage(s, endStage())
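
One thing the code above leaves out is end of input: the trailing 'i' is
printed without a final linefeed. A sketch of one way to handle that,
assuming Python 3 and collecting the output lines in a list instead of
printing them. The names collect_stage, gen_stage and out are made up
for this illustration; close() and GeneratorExit are the standard
PEP 342 machinery:

```python
def consumer(func):
    def wrapper(*args, **kw):
        gen = func(*args, **kw)
        next(gen)            # advance to the first yield
        return gen
    return wrapper

@consumer
def collect_stage(width, out):
    buf = []
    try:
        while True:
            buf.append((yield))
            if len(buf) == width:
                out.append("".join(buf))
                buf = []
    except GeneratorExit:
        # Raised at the yield when the producer calls close():
        # flush the partial last line, then finish normally.
        if buf:
            out.append("".join(buf))

def gen_stage(lines, target):
    for line in lines:
        for ch in line:
            target.send(ch)
    target.close()           # signal end of input to the consumer

out = []
gen_stage(["abc", "def", "ghi"], collect_stage(4, out))
print(out)
```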