Merging lists has made my brain hurt.
Thorsten Kampe
thorsten at thorstenkampe.de
Sun Oct 6 12:03:11 EDT 2002
* Alex Martelli
> The intersection of two "sets" (Sets per se are only added in Python
> 2.3, but you can use either lists OR dictionaries in Python 2.2 as
> sets -- all you need is iterability and ability to use 'in' to test
> membership) is pretty easy:
> [...]
> dictionaries are far faster than lists for membership-tests and
> for deletion of items. Whether it's worth building the dicts
> corresponding to your lists depends on how long are your lists:
> try both ways, pick the faster one if it matters.
Treating lists a priori as sets for reasons of speed is definitely not
the way to go because you're "losing information". I think it's better
to treat them as tuples (in a mathematical sense) whereby you can
always delete 'duplicates' afterwards if you don't care about them.
Example: What's the intersection between [2, 1, 2, 3, 2] and
[0, 1, 2, 3, 2, 4]? An intersection (union, symmetric difference) of
tuples can only be meaningful for the /corresponding/ items of the
tuples/lists, so it's neither [2, 1, 3] nor [2, 1, 2, 3, 2] but
[2, 1, 2, 3].
I tried to write a reliable (though maybe slow) program that takes
care of this issue (especially that intersection is _commutative_):
#v+
def boolean(seq0, seq1, boolean_modus):
"""" perform 'and', 'not', 'or', or 'xor' operation on sequences """
if boolean_modus == 'and': # intersection
seq1 = seq1[:]
intersection = []
for item in seq0:
if item in seq1:
intersection.append(item)
seq1.remove(item)
return intersection
elif boolean_modus == 'not': # set difference
setdiff = seq0[:]
for item in seq1:
if item in setdiff:
setdiff.remove(item)
return setdiff
elif boolean_modus == 'or': # union
return seq0 + boolean(seq1, seq0, 'not')
elif boolean_modus == 'xor': # symmetric difference
return boolean(boolean(seq0, seq1, 'or'), boolean(seq0, seq1, 'and'), 'not')
#v-
Thorsten
More information about the Python-list
mailing list