Good use for itertools.dropwhile and itertools.takewhile
Nick Mellor
thebalancepro at gmail.com
Tue Dec 4 09:47:21 EST 2012
Hi Neil,
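Nice! But it fails if the first word of the description starts with a capital letter: lstrip() removes characters, not whole words, so that capital gets stripped from the description too.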
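Here is a rough sketch of what I mean, not a definitive test. The wrapper name split_product_lstrip and the sample string are made up for illustration; the two function bodies are the snippets quoted below, otherwise unchanged.

```python
import string
from itertools import dropwhile, takewhile


def split_product_lstrip(s):
    # Neil's two lines, wrapped in a function (hypothetical name).
    description = s.lstrip(string.ascii_uppercase + ' ')
    product = s[:-len(description) - 1]
    return product, description


def split_product_itertools(s):
    # The takewhile/dropwhile version from the original post.
    words = s.split()
    allcaps = lambda word: word == word.upper()
    product, description = takewhile(allcaps, words), dropwhile(allcaps, words)
    return " ".join(product), " ".join(description)


# Made-up input whose description starts with a capital letter.
sample = "CAPSICUM RED Fresh from QLD"

lstrip_result = split_product_lstrip(sample)
itertools_result = split_product_itertools(sample)

print(lstrip_result)     # ('CAPSICUM RED ', 'resh from QLD')
print(itertools_result)  # ('CAPSICUM RED', 'Fresh from QLD')
```

The lstrip version eats the "F" of "Fresh" and leaves a trailing space on the product name, while the word-by-word version stops at the first word that is not all caps.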
Nick
On Wednesday, 5 December 2012 01:23:34 UTC+11, Neil Cerutti wrote:
> On 2012-12-04, Nick Mellor <thebalancepro at gmail.com> wrote:
> > I have a file full of things like this:
> >
> > "CAPSICUM RED fresh from Queensland"
> >
> > Product names (all caps, at start of string) and descriptions
> > (mixed case, to end of string) all muddled up in the same
> > field. And I need to split them into two fields. Note that if
> > the text had said:
> >
> > "CAPSICUM RED fresh from QLD"
> >
> > I would want QLD in the description, not shunted forwards and
> > put in the product name. So (uncontrived) list comprehensions
> > and regex's are out.
> >
> > I want to split the above into:
> >
> > ("CAPSICUM RED", "fresh from QLD")
> >
> > Enter dropwhile and takewhile. 6 lines later:
> >
> > from itertools import takewhile, dropwhile
> > def split_product_itertools(s):
> >     words = s.split()
> >     allcaps = lambda word: word == word.upper()
> >     product, description = takewhile(allcaps, words), dropwhile(allcaps, words)
> >     return " ".join(product), " ".join(description)
> >
> > When I tried to refactor this code to use while or for loops, I
> > couldn't find any way that felt shorter or more pythonic:
>
> I'm really tempted to import re, and that means takewhile and
> dropwhile need to stay. ;)
>
> But seriously, this is a quick implementation of my first thought.
>
> description = s.lstrip(string.ascii_uppercase + ' ')
> product = s[:-len(description)-1]
>
> --
> Neil Cerutti