re.sub
Massimo Di Pierro
mdipierro at cti.depaul.edu
Tue Oct 16 02:54:36 EDT 2007
Shouldn't this
>>> print re.sub('a','\\n','bab')
b
b
output
b\nb
instead?
Massimo
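
For reference, a minimal sketch of what seems to be going on here, assuming the behaviour described in the re module documentation: when the replacement argument is a string, re.sub() processes backslash escapes in it, so the two characters backslash + n become a real newline. The variable names below are only for illustration. To get a literal backslash followed by n, either double the backslash in the replacement or pass a function as the replacement.

```python
import re

# The replacement '\\n' is two characters: a backslash and 'n'.
# re.sub() processes backslash escapes in a string replacement,
# so this pair is turned into a real newline character.
with_newline = re.sub('a', '\\n', 'bab')

# Doubling the backslash: r'\\n' is three characters (backslash,
# backslash, 'n'), which re.sub() reduces to a literal backslash + 'n'.
literal_escaped = re.sub('a', r'\\n', 'bab')

# The return value of a function replacement is inserted as-is,
# without any escape processing.
literal_func = re.sub('a', lambda match: '\\n', 'bab')

print(repr(with_newline))     # 'b\nb'
print(repr(literal_escaped))  # 'b\\nb'
print(repr(literal_func))     # 'b\\nb'
```

So the output in the question matches the documented behaviour; the function form is probably the safer choice when the replacement text comes from somewhere else and may contain backslashes.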
On Oct 16, 2007, at 1:34 AM, George Sakkis wrote:
> On Oct 15, 11:02 pm, 7stud <bbxx789_0... at yahoo.com> wrote:
>> I'm applying groupby() in a very simplistic way to split up some data,
>> but when I timeit against another method, it takes twice as long. The
>> following groupby() code groups the data between the "</tr>" strings:
>>
>> data = [
>>     "1.5","</tr>","2.5","3.5","4.5","</tr>","</tr>","5.5","6.5","</tr>",
>>     "1.5","</tr>","2.5","3.5","4.5","</tr>","</tr>","5.5","6.5","</tr>",
>>     "1.5","</tr>","2.5","3.5","4.5","</tr>","</tr>","5.5","6.5","</tr>",
>> ]
>>
>> import itertools
>>
>> def key(s):
>>     if s[0] == "<":
>>         return 'a'
>>     else:
>>         return 'b'
>>
>> def test3():
>>     master_list = []
>>     for group_key, group in itertools.groupby(data, key):
>>         if group_key == "b":
>>             master_list.append(list(group) )
>>
>> def test1():
>>     master_list = []
>>     row = []
>>
>>     for elmt in data:
>>         if elmt[0] != "<":
>>             row.append(elmt)
>>         else:
>>             if row:
>>                 master_list.append(" ".join(row) )
>>                 row = []
>>
>> import timeit
>>
>> t = timeit.Timer("test3()", "from __main__ import test3, key, data")
>> print t.timeit()
>> t = timeit.Timer("test1()", "from __main__ import test1, data")
>> print t.timeit()
>>
>> --output:---
>> 42.791079998
>> 19.0128788948
>>
>> I thought groupby() would be faster. Am I doing something wrong?
>
> Yes and no. Yes, the groupby version can be improved a little by
> calling a builtin method instead of a Python function. No, test1 still
> beats it hands down (and with Psyco even further); it is almost as
> good as it gets in pure Python.
>
> FWIW, here's a faster and more compact version with groupby:
>
> def test3b(data):
>     join = ' '.join
>     return [join(group) for key,group in
>             itertools.groupby(data, "</tr>".__eq__)
>             if not key]
>
>
> George
>
> --
> http://mail.python.org/mailman/listinfo/python-list
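
As a side note on the quoted groupby() thread, here is a small sketch that checks the two approaches against each other on one row of the sample data. The function names rows_loop and rows_groupby are made up for this illustration; they follow the quoted test1 and test3b, except that both return their result so the outputs can be compared.

```python
import itertools

data = ["1.5", "</tr>", "2.5", "3.5", "4.5", "</tr>", "</tr>", "5.5", "6.5", "</tr>"]

def rows_loop(items):
    # Explicit loop, after the quoted test1: collect items until a
    # separator is seen, then flush the row if it is not empty.
    master_list = []
    row = []
    for elmt in items:
        if elmt[0] != "<":
            row.append(elmt)
        elif row:
            master_list.append(" ".join(row))
            row = []
    return master_list

def rows_groupby(items):
    # groupby version, after the quoted test3b: the key is True for the
    # separator and False for everything else; keep the False groups.
    join = " ".join
    return [join(group)
            for is_separator, group in itertools.groupby(items, "</tr>".__eq__)
            if not is_separator]

print(rows_loop(data))     # ['1.5', '2.5 3.5 4.5', '5.5 6.5']
print(rows_groupby(data))  # ['1.5', '2.5 3.5 4.5', '5.5 6.5']
```

One difference to keep in mind: the loop only flushes a row when it meets a separator, so if the data did not end with "</tr>" the last row would be dropped by the loop but kept by the groupby version.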