[Tutor] Help with re.sub()

Fri Mar 17 06:18:16 CET 2006

Hi John,

I would just like to suggest a different approach. Like the old saying goes:

    Some people, when confronted with a problem, think “I know, I’ll
    use regular expressions.” Now they have two problems.
        -- Jamie Zawinski, in comp.lang.emacs

If the delimiter is always the same ('@'), you can use split() to separate 
the id from the data on each line. Then you can collect the data in a 
dictionary of lists, keyed by id, like this:

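collapsed_data = {}

# mydata is assumed to be an iterable of lines, e.g. an open file
for line in mydata:
    # strip the trailing newline (if any), then split on the first '@' only
    id_part, data_part = line.rstrip('\n').split('@', 1)

    try:
        collapsed_data[id_part].append(data_part)
    except KeyError:
        # first time insert for that key
        collapsed_data[id_part] = [data_part]

# 'key' rather than 'id', so the builtin id() is not shadowed
for key, data in collapsed_data.items():
    print('@'.join([key] + data))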

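That should be it. Python's data types are very powerful. One caveat: a 
plain dictionary in older Pythons does not keep its keys in any particular 
order, so sort the keys before printing if the order matters. Of course you 
could just build a huge list comprehension that does it...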
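
For what it's worth, here is a self-contained sketch of the same idea 
wrapped in a function, so it can be tried without the original file. The 
function name collapse_lines and the sample lines are made up for 
illustration (they are not John's data), and it uses dict.setdefault() in 
place of the try/except. It assumes Python 3.7 or later, where dictionaries 
keep insertion order, and that every non-blank line contains the delimiter 
at least once (otherwise the unpacking raises ValueError).

```python
def collapse_lines(lines, delimiter="@"):
    """Group the data parts of 'id<delimiter>data' lines by their id.

    Hypothetical helper: returns one joined string per id, in the order
    each id was first seen.
    """
    collapsed = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Split on the first delimiter only, so the data part may itself
        # contain the delimiter.
        id_part, data_part = line.split(delimiter, 1)
        # setdefault() creates the list the first time an id is seen.
        collapsed.setdefault(id_part, []).append(data_part)
    return [delimiter.join([key] + parts) for key, parts in collapsed.items()]


# Made-up sample input; the last line deliberately has no trailing newline.
sample = ["a@1\n", "b@2\n", "a@3\n", "b@4"]
for row in collapse_lines(sample):
    print(row)
```

With the sample above this prints a@1@3 and then b@2@4, one per line.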

Hope that helps,

Hugo