re.sub

Massimo Di Pierro mdipierro at cti.depaul.edu
Tue Oct 16 02:54:36 EDT 2007


Shouldn't this

 >>> print re.sub('a','\\n','bab')
b
b

output

b\nb

instead?

Massimo
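
A short sketch of what is going on in the example above, as far as the documented behaviour of re.sub goes: the Python literal '\\n' is the two characters backslash and n, and re.sub processes backslash escapes in a string replacement, so that pair becomes a newline. The variable names below are illustrative only. Doubling the backslash in the replacement, or passing a callable, is one way to keep the backslash literal.

```python
import re

# The replacement '\\n' is two characters: a backslash and the letter n.
# re.sub interprets backslash escapes in a string replacement, so this
# inserts a real newline character.
newline_result = re.sub('a', '\\n', 'bab')

# An escaped backslash in the replacement (r'\\n' is backslash, backslash, n)
# yields a literal backslash followed by n.
literal_result = re.sub('a', r'\\n', 'bab')

# A callable replacement is inserted as returned, with no escape processing.
callable_result = re.sub('a', lambda match: '\\n', 'bab')

print(repr(newline_result))
print(repr(literal_result))
print(repr(callable_result))
```

The first call reproduces the two-line output shown above; the other two give the four-character string b, backslash, n, b.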

On Oct 16, 2007, at 1:34 AM, George Sakkis wrote:

> On Oct 15, 11:02 pm, 7stud <bbxx789_0... at yahoo.com> wrote:
>> I'm applying groupby() in a very simplistic way to split up some data,
>> but when I timeit against another method, it takes twice as long. The
>> following groupby() code groups the data between the "</tr>" strings:
>>
>> data = [
>> "1.5","</tr>","2.5","3.5","4.5","</tr>","</tr>","5.5","6.5","</tr>",
>> "1.5","</tr>","2.5","3.5","4.5","</tr>","</tr>","5.5","6.5","</tr>",
>> "1.5","</tr>","2.5","3.5","4.5","</tr>","</tr>","5.5","6.5","</tr>",
>> ]
>>
>> import itertools
>>
>> def key(s):
>>     if s[0] == "<":
>>         return 'a'
>>     else:
>>         return 'b'
>>
>> def test3():
>>
>>     master_list = []
>>     for group_key, group in itertools.groupby(data, key):
>>         if group_key == "b":
>>             master_list.append(list(group) )
>>
>> def test1():
>>     master_list = []
>>     row = []
>>
>>     for elmt in data:
>>         if elmt[0] != "<":
>>             row.append(elmt)
>>         else:
>>             if row:
>>                 master_list.append(" ".join(row) )
>>                 row = []
>>
>> import timeit
>>
>> t = timeit.Timer("test3()", "from __main__ import test3, key, data")
>> print t.timeit()
>> t = timeit.Timer("test1()", "from __main__ import test1, data")
>> print t.timeit()
>>
>> --output:---
>> 42.791079998
>> 19.0128788948
>>
>> I thought groupby() would be faster.  Am I doing something wrong?
>
> Yes and no. Yes, the groupby version can be improved a little by
> calling a builtin method instead of a Python function. No, test1 still
> beats it hands down (and with Psyco even further); it is almost as good
> as it gets in pure Python.
>
> FWIW, here's a faster and more compact version with groupby:
>
> def test3b(data):
>     join = ' '.join
>     return [join(group) for key,group in
>             itertools.groupby(data, "</tr>".__eq__)
>             if not key]
>
>
> George
>
> --
> http://mail.python.org/mailman/listinfo/python-list
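
For readers comparing the two approaches quoted above, here is a minimal sketch that checks an explicit loop in the style of test1 against a groupby version in the style of test3b on a shortened copy of the data. The function names rows_loop and rows_groupby are made up for this illustration, the loop compares against "</tr>" directly rather than testing the first character, and it also keeps a trailing row that is not followed by a separator. It checks that the results agree and says nothing about timing.

```python
import itertools

# Shortened copy of the sample data from the quoted message.
data = ["1.5", "</tr>", "2.5", "3.5", "4.5", "</tr>", "</tr>", "5.5", "6.5", "</tr>"]

def rows_loop(items):
    """Explicit loop in the style of test1, returning the joined rows."""
    rows = []
    row = []
    for item in items:
        if item != "</tr>":
            row.append(item)
        elif row:
            rows.append(" ".join(row))
            row = []
    if row:
        # Keep a trailing row that is not followed by "</tr>".
        rows.append(" ".join(row))
    return rows

def rows_groupby(items):
    """groupby in the style of test3b; the key is True for separators."""
    join = " ".join
    return [join(group)
            for is_separator, group in itertools.groupby(items, "</tr>".__eq__)
            if not is_separator]

loop_rows = rows_loop(data)
groupby_rows = rows_groupby(data)
print(loop_rows)
print(groupby_rows)
```

Using the bound method "</tr>".__eq__ as the key means groupby calls a builtin for every element instead of a Python-level function, which is the improvement the reply describes.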



