Help with splitting

Jeremy Bowers jerf at jerf.org
Fri Apr 1 18:20:59 EST 2005


On Fri, 01 Apr 2005 18:01:49 -0500, Brian Beck wrote:
> py> from itertools import groupby
> py> [''.join(g) for k, g in groupby('  test ing ', lambda x: x.isspace())]
> ['  ', 'test', ' ', 'ing', ' ']
> 
> I tried replacing the lambda thing with an attrgetter, but apparently my 
> understanding of that isn't perfect... it groups by the identity of the 
> bound method instead of calling it...
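
On the attrgetter point: attrgetter('isspace') only looks the attribute up,
it never calls it, so groupby ends up comparing bound method objects rather
than True/False. A minimal sketch of one way around the lambda, assuming
plain str input (the names "text", "key" and "pieces" are just for
illustration), is to pass the method off the class itself:

```python
from itertools import groupby
from operator import attrgetter

text = '  test ing '

# attrgetter('isspace') fetches the attribute but does not call it,
# so the "key" is a bound method object, not a boolean.
key = attrgetter('isspace')(' ')
print(callable(key))  # True

# str.isspace taken from the class is called with each character
# as its argument, which is what the lambda was doing.
pieces = [''.join(g) for k, g in groupby(text, str.isspace)]
print(pieces)  # ['  ', 'test', ' ', 'ing', ' ']
```

This only works when every element really is a str; the lambda is the more
general spelling.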

Unfortunately, as you pointed out, the groupby version is slower than the
regex:

python timeit.py -s \
  "import re; x = 'a ab c' * 1000; whitespaceSplitter = re.compile('(\w+)')" \
  "whitespaceSplitter.split(x)"

100 loops, best of 3: 9.47 msec per loop

python timeit.py -s \
  "from itertools import groupby; x = 'a ab c' * 1000;" \
  "[''.join(g) for k, g in groupby(x, lambda y: y.isspace())]"

10 loops, best of 3: 65.8 msec per loop

(tried to break it up to be easier to read)
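
If you would rather rerun the comparison from a script, something along
these lines should do, using the timeit module directly. This is only a
rough sketch: the function names and the loop count are my own choices, and
the absolute numbers will vary with machine and Python version. Note that
the regex result carries an empty string at each end, so the two outputs
only match once those are dropped.

```python
import re
import timeit
from itertools import groupby

x = 'a ab c' * 1000
splitter = re.compile(r'(\w+)')  # same pattern as in the command above

def with_re():
    # Split on runs of word characters, keeping them via the capture group.
    return splitter.split(x)

def with_groupby():
    # Group consecutive characters by whether they are whitespace.
    return [''.join(g) for k, g in groupby(x, lambda y: y.isspace())]

# The regex version yields '' at the start and end; drop those to compare.
assert [p for p in with_re() if p] == with_groupby()

loops = 100  # arbitrary loop count, chosen for illustration
for func in (with_re, with_groupby):
    total = timeit.timeit(func, number=loops)
    print('%s: %.3f msec per loop' % (func.__name__, total / loops * 1000))
```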

But I like yours much better theoretically. It's also a pretty good demo
of "groupby".


