sets and subsets

Dave K dk123456789 at REMOVEhotmail.com
Thu Feb 12 18:49:10 EST 2004


On Wed, 11 Feb 2004 18:57:15 -0500 in comp.lang.python, François
Pinard <pinard at iro.umontreal.ca> wrote:

>[Peter Otten]
>
>> (untested)
>> from sets import Set
>
>> inputFile = file('ips.txt', 'r') #Super-set
>> include = Set(inputFile.readlines())
>> inputFile.close()
>
>> readFile = file('excluded_ips.txt', 'r') #Sub-set to exclude
>> exclude = Set(readFile.readlines())
>> readFile.close()
>
>> # No Magic of Elaine
>
>> outputFile = file('pruned_ips.txt' , 'w')
>> for i in include - exclude:
>>     print >> outputFile, i,
>> outputFile.close()
>
>Here is an equivalent, shorter algorithm (tested):
>
>from sets import Set
>file('pruned_ips.txt', 'w').writelines(
>        Set(file('ips.txt')) - Set(file('excluded_ips.txt')))
>
>This code relies on `writelines' accepting an iterable, sets returning
>their members whenever iterated, Set constructors accepting an iterable,
>and files returning their lines whenever iterated.  And of course, on
>`close' rarely being needed in Python! :-)
>
>The order of lines in the produced file is kind of random, however.

That's very compact and neat, but for completeness I'd like to point
out that it could also be written (more clumsily) in one statement
with a list comprehension, which keeps the lines in the same order as
in the original file:

file('pruned_ips.txt', 'w').writelines([ip for ip in file('ips.txt')
                              if ip not in file('excluded_ips.txt')])

Of course, your example using sets is much clearer, so I prefer that.
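
The order-preserving version can also be combined with a set, so that
the exclusion file is read once instead of being reopened for every
line of ips.txt. What follows is only a minimal sketch, assuming a
modern Python (the built-in set and open rather than sets.Set and
file); the helper name prune_lines and the sample addresses are made
up for illustration:

```python
import os
import tempfile


def prune_lines(all_path, excluded_path, out_path):
    """Copy lines from all_path to out_path, skipping any line that
    also appears in excluded_path. Input order is preserved."""
    # Read the exclusion file once; set membership tests are fast.
    with open(excluded_path) as f:
        excluded = set(f)
    # Stream the super-set file and keep only the lines not excluded.
    with open(all_path) as src, open(out_path, "w") as dst:
        dst.writelines(line for line in src if line not in excluded)


# Small demonstration with temporary files (hypothetical sample data).
workdir = tempfile.mkdtemp()
ips = os.path.join(workdir, "ips.txt")
excluded_ips = os.path.join(workdir, "excluded_ips.txt")
pruned_ips = os.path.join(workdir, "pruned_ips.txt")

with open(ips, "w") as f:
    f.write("10.0.0.3\n10.0.0.1\n10.0.0.2\n")
with open(excluded_ips, "w") as f:
    f.write("10.0.0.1\n")

prune_lines(ips, excluded_ips, pruned_ips)

with open(pruned_ips) as f:
    result = f.read()
```

Lines are compared with their trailing newline included, as in the
versions above, so a final line without a newline will not match the
same address followed by one.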

Dave




