Good use for itertools.dropwhile and itertools.takewhile

Neil Cerutti neilc at norwich.edu
Wed Dec 5 09:34:37 EST 2012


On 2012-12-05, Chris Angelico <rosuav at gmail.com> wrote:
> On Wed, Dec 5, 2012 at 12:17 PM, Nick Mellor <thebalancepro at gmail.com> wrote:
>>
>> takewhile mines for gold at the start of a sequence, dropwhile
>> drops the dross at the start of a sequence.
>
> When you're using both over the same sequence and with the same
> condition, it seems odd that you need to iterate over it twice.
> Perhaps a partitioning iterator would be cleaner - something
> like this:
>
> def partitionwhile(predicate, iterable):
>     iterable = iter(iterable)
>     while True:
>         val = next(iterable)
>         if not predicate(val): break
>         yield val
>     raise StopIteration # Signal the end of Phase 1
>     for val in iterable: yield val # or just "yield from iterable", I think
>
> Only the cold hard boot of reality just stomped out the spark
> of an idea. Once StopIteration has been raised, that's it,
> there's no "resuming" the iterator. Is there a way around that?
> Is there a clean way to say "Done for now, but next time you
> ask, there'll be more"?
>
> I tested it on Python 3.2 (yeah, time I upgraded, I know).

Well, shoot! Then this is a job for groupby, not takewhile.

import itertools

def prod_desc(s):
    """split s into product name and product description.

    >>> prod_desc("CAR FIFTY TWO Chrysler LeBaron.")
    ['CAR FIFTY TWO', 'Chrysler LeBaron.']

    >>> prod_desc("MR. JONESEY Saskatchewan's finest")
    ['MR. JONESEY', "Saskatchewan's finest"]

    >>> prod_desc("no product name?")
    ['', 'no product name?']

    >>> prod_desc("NO DESCRIPTION")
    ['NO DESCRIPTION', '']
    """
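    prod = ''
    desc = ''
    # Group consecutive words by whether they contain a lowercase letter.
    # Runs without one are the product name; runs with one are the
    # description. If the two kinds alternate, later runs overwrite
    # earlier ones.
    for k, g in itertools.groupby(s.split(),
            key=lambda w: any(c.islower() for c in w)):
        a = ' '.join(g)
        if k:
            desc = a
        else:
            prod = a
    return [prod, desc]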

Because it splits on whitespace and rejoins with single spaces, this
has no way to preserve odd white space, which could break things if
evil product names differ only in their spacing.
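
If the white space does matter, one option is to find where the first
word containing a lowercase letter starts and slice the original
string there, instead of splitting and rejoining. A minimal sketch,
assuming the description always begins at that first word
(prod_desc_ws is just a name made up for this example, and unlike the
groupby version it never looks at later all-caps runs):

```python
import re

def prod_desc_ws(s):
    """Split s into [product name, description], keeping inner whitespace."""
    # Walk the words in place so the original spacing is untouched.
    for m in re.finditer(r'\S+', s):
        if any(c.islower() for c in m.group()):
            # Everything before this word is the product name; only the
            # separator between name and description is trimmed.
            return [s[:m.start()].rstrip(), s[m.start():]]
    # No word with a lowercase letter: the whole string is the name.
    return [s, '']
```

With this, "CAR  FIFTY TWO Chrysler LeBaron." keeps its double space in
the product name.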

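Coming back to the partitioning iterator question: a single generator
can't resume once it has signalled StopIteration, but one possible
workaround is to hand back two iterators that share the underlying
one. A rough sketch (the two-iterator return shape and the pending
list are assumptions for this example, not anything in itertools; it
only behaves correctly if the first iterator is exhausted before the
second is touched, and "yield from" needs Python 3.3 or later):

```python
def partitionwhile(predicate, iterable):
    """Return (head, tail) iterators; exhaust head before using tail."""
    it = iter(iterable)
    pending = []  # holds the first item that failed the predicate

    def head():
        for val in it:
            if not predicate(val):
                # Save the item so the tail iterator can still yield it.
                pending.append(val)
                return
            yield val

    def tail():
        yield from pending
        yield from it

    return head(), tail()
```

For example, calling it with str.isupper on
"CAR FIFTY TWO Chrysler LeBaron.".split() gives the three upper-case
words from the first iterator and the remaining two from the second.
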
-- 
Neil Cerutti


