Good use for itertools.dropwhile and itertools.takewhile

Tue Dec 4 09:47:21 EST 2012

Hi Neil,

Nice! But it fails if the first word of the description starts with a capital letter: lstrip() strips that capital along with the product name.
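
To illustrate, here is a minimal sketch that wraps the two lines from Neil's reply (quoted below) in a function and runs it next to the itertools version on a description that starts with a capital. The name split_product_lstrip and the sample strings are made up for this example, not taken from the thread.

```python
import string
from itertools import takewhile, dropwhile


def split_product_lstrip(s):
    # Neil's two lines, wrapped in a function (hypothetical name).
    # lstrip() removes every leading uppercase letter or space, so it
    # does not stop at a word boundary.
    description = s.lstrip(string.ascii_uppercase + ' ')
    product = s[:-len(description) - 1]
    return product, description


def split_product_itertools(s):
    # The word-based version from the original post.
    words = s.split()
    allcaps = lambda word: word == word.upper()
    product, description = takewhile(allcaps, words), dropwhile(allcaps, words)
    return " ".join(product), " ".join(description)


ok = "CAPSICUM RED fresh from QLD"
bad = "CAPSICUM RED Fresh from QLD"  # description starts with a capital

print(split_product_lstrip(ok))      # ('CAPSICUM RED', 'fresh from QLD')
print(split_product_lstrip(bad)[1])  # 'resh from QLD' (the 'F' is lost)
print(split_product_itertools(bad))  # ('CAPSICUM RED', 'Fresh from QLD')
```

The itertools version tests whole words, so a capitalised first word of the description is not mistaken for part of the product name.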

Nick

On Wednesday, 5 December 2012 01:23:34 UTC+11, Neil Cerutti  wrote:
> On 2012-12-04, Nick Mellor <thebalancepro at gmail.com> wrote:
> > I have a file full of things like this:
> >
> > "CAPSICUM RED fresh from Queensland"
> >
> > Product names (all caps, at start of string) and descriptions
> > (mixed case, to end of string) all muddled up in the same
> > field. And I need to split them into two fields. Note that if
> > the text had said:
> >
> > "CAPSICUM RED fresh from QLD"
> >
> > I would want QLD in the description, not shunted forwards and
> > put in the product name. So (uncontrived) list comprehensions
> > and regex's are out.
> >
> > I want to split the above into:
> >
> > ("CAPSICUM RED", "fresh from QLD")
> >
> > Enter dropwhile and takewhile. 6 lines later:
> >
> > from itertools import takewhile, dropwhile
> > def split_product_itertools(s):
> >     words = s.split()
> >     allcaps = lambda word: word == word.upper()
> >     product, description = takewhile(allcaps, words), dropwhile(allcaps, words)
> >     return " ".join(product), " ".join(description)
> >
> > When I tried to refactor this code to use while or for loops, I
> > couldn't find any way that felt shorter or more pythonic:
>
> I'm really tempted to import re, and that means takewhile and
> dropwhile need to stay. ;)
>
> But seriously, this is a quick implementation of my first thought.
>
> description = s.lstrip(string.ascii_uppercase + ' ')
> product = s[:-len(description)-1]
>
> -- 
> Neil Cerutti