Set like feature

Alex Martelli aleaxit at yahoo.com
Mon Nov 15 17:14:29 EST 2004


Hari Pulapaka <hari04 at gmail.com> wrote:

> I have a list of space delimited strings ending in a newline.
> Eg: a = ['a sfds sdf s df 34 ew\n', 'df sdf s f s ssf\n']
> 
> Now inside each row, I have a space delimited list of fields.
> 
> Now I want to compare the fields in each row of the array and see which
> fields do not match.
> 
> Think of it as a 2 dimensional array of size mn, and comparing each
> element on a column by column basis.
> 
> I am using python2.2 so no sets. Can anyone think of an efficient way
> to do this? 

Do you want to compare corresponding fields?  That's the only way I can
read that 'column by column basis', and thus I don't see what sets could
possibly have to do with it.
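
To make that concrete, here is a small sketch (written for a current Python 3
rather than 2.2, and the two rows are made up purely for illustration).  A set
forgets position, which is exactly the information a column-wise comparison
needs:

```python
# Two hypothetical rows holding the same fields in a different order.
row1 = 'a b c\n'.split()
row2 = 'c b a\n'.split()

# As sets the rows look identical, because order is discarded.
same_as_sets = set(row1) == set(row2)

# Compared position by position, the first and last columns differ.
differing_columns = [k for k, (f1, f2) in enumerate(zip(row1, row2))
                     if f1 != f2]
```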

Do you want to compare each row with every other row?  I also note in
your example that the number of fields in each row appears to be
variable, so how do you want to deal with 'missing' fields?

Too many unanswered questions, I guess.  But for some specified set of
answers to those questions, you might do...:

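import sys

def compare_fields(i, j, base, other):
    # zip stops at the shorter row, so 'missing' fields are simply skipped;
    # xrange(sys.maxint) supplies the column index k (2.2 has no enumerate)
    for k, f1, f2 in zip(xrange(sys.maxint), base, other):
        if f1 != f2:
            print 'DIFF', i, j, k, repr(f1), repr(f2)

def lots_of_compares(list_of_strings):
    # split() with no arguments also discards the trailing newline
    list_of_lists_of_fields = [row.split() for row in list_of_strings]
    num_rows = len(list_of_lists_of_fields)
    # compare each row with every later row, so each pair is seen once
    for i in xrange(num_rows):
        base_row = list_of_lists_of_fields[i]
        for j in xrange(i+1, num_rows):
            compare_fields(i, j, base_row, list_of_lists_of_fields[j])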

You can do better with enumerate, itertools and other things which 2.2
didn't have, but sets wouldn't help.  Now, I hope this clarifies the
many unanswered questions which your 'specs' leave open, so you can work
out exactly what you want.
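
For what it's worth, here is a rough sketch of how that might look with
enumerate and itertools.  It assumes a current Python 3 (combinations and
zip_longest arrived well after 2.4), the names diff_fields and diff_all_rows
are just illustrative, and it makes two choices the code above does not: it
returns tuples instead of printing, and it reports a 'missing' field as None
rather than skipping it.

```python
from itertools import combinations, zip_longest

def diff_fields(base, other):
    """Yield (column, field1, field2) for each column where the rows differ.

    A field absent from the shorter row shows up as None (an assumption
    about how 'missing' fields should be treated).
    """
    for k, (f1, f2) in enumerate(zip_longest(base, other)):
        if f1 != f2:
            yield k, f1, f2

def diff_all_rows(list_of_strings):
    """Compare every pair of rows once; return a list of (i, j, k, f1, f2)."""
    rows = [s.split() for s in list_of_strings]
    return [(i, j, k, f1, f2)
            for (i, r1), (j, r2) in combinations(enumerate(rows), 2)
            for k, f1, f2 in diff_fields(r1, r2)]
```

Called as diff_all_rows(['a b c\n', 'a x c\n', 'a b\n']), the first tuple it
returns is (0, 1, 1, 'b', 'x'): rows 0 and 1 differ in column 1.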

And, btw: upgrade to 2.4.  Sets or no sets, the performance enhancement
by itself will be more than sufficient to repay whatever inconvenience you
think the upgrade might cause.


Alex


