re.sub

Massimo Di Pierro mdipierro at cti.depaul.edu
Tue Oct 16 14:02:46 EDT 2007


Even stranger

 >>> re.sub('a', '\\n','bab')
'b\nb'
 >>> print re.sub('a', '\\n','bab')
b
b
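
For what it's worth, this looks like documented behaviour rather than a bug: when the replacement passed to re.sub is a string, backslash escapes in it are processed, so the two characters backslash + n become one newline. The sketch below shows that, plus two ways to get a literal backslash + n. The variable names are only illustrative.

```python
import re

# The replacement '\\n' is two characters: a backslash and the letter n.
# re.sub processes backslash escapes in a string replacement, so that
# pair is turned into a single newline character.
as_newline = re.sub('a', '\\n', 'bab')

# Doubling the backslash in the replacement yields a literal backslash + n.
as_literal = re.sub('a', r'\\n', 'bab')

# A callable replacement is inserted as-is, with no escape processing.
via_function = re.sub('a', lambda match: '\\n', 'bab')

print(repr(as_newline))   # 'b\nb'   (3 characters: b, newline, b)
print(repr(as_literal))   # 'b\\nb'  (4 characters: b, backslash, n, b)
print(as_literal)         # b\nb
```

The interactive prompt shows repr() of the result, which is why the newline appears there as \n while print emits an actual line break.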

Massimo


On Oct 16, 2007, at 1:54 AM, DiPierro, Massimo wrote:

> Shouldn't this
>
> >>> print re.sub('a','\\n','bab')
> b
> b
>
> output
>
> b\nb
>
> instead?
>
> Massimo
>
> On Oct 16, 2007, at 1:34 AM, George Sakkis wrote:
>
>> On Oct 15, 11:02 pm, 7stud <bbxx789_0... at yahoo.com> wrote:
>>> I'm applying groupby() in a very simplistic way to split up some data,
>>> but when I timeit against another method, it takes twice as long. The
>>> following groupby() code groups the data between the "</tr>" strings:
>>>
>>> data = [
>>> "1.5","</tr>","2.5","3.5","4.5","</tr>","</tr>","5.5","6.5","</tr>",
>>> "1.5","</tr>","2.5","3.5","4.5","</tr>","</tr>","5.5","6.5","</tr>",
>>> "1.5","</tr>","2.5","3.5","4.5","</tr>","</tr>","5.5","6.5","</tr>",
>>> ]
>>>
>>> import itertools
>>>
>>> def key(s):
>>>     if s[0] == "<":
>>>         return 'a'
>>>     else:
>>>         return 'b'
>>>
>>> def test3():
>>>
>>>     master_list = []
>>>     for group_key, group in itertools.groupby(data, key):
>>>         if group_key == "b":
>>>             master_list.append(list(group) )
>>>
>>> def test1():
>>>     master_list = []
>>>     row = []
>>>
>>>     for elmt in data:
>>>         if elmt[0] != "<":
>>>             row.append(elmt)
>>>         else:
>>>             if row:
>>>                 master_list.append(" ".join(row) )
>>>                 row = []
>>>
>>> import timeit
>>>
>>> t = timeit.Timer("test3()", "from __main__ import test3, key, data")
>>> print t.timeit()
>>> t = timeit.Timer("test1()", "from __main__ import test1, data")
>>> print t.timeit()
>>>
>>> --output:---
>>> 42.791079998
>>> 19.0128788948
>>>
>>> I thought groupby() would be faster.  Am I doing something wrong?
>>
>> Yes and no. Yes, the groupby version can be improved a little by
>> calling a builtin method instead of a Python function. No, test1 still
>> beats it hands down (and with Psyco even further); it is almost as good
>> as it gets in pure Python.
>>
>> FWIW, here's a faster and more compact version with groupby:
>>
>> def test3b(data):
>>     join = ' '.join
>>     return [join(group) for key,group in
>>             itertools.groupby(data, "</tr>".__eq__)
>>             if not key]
>>
>>
>> George
>>
>> --
>> http://mail.python.org/mailman/listinfo/python-list



