Query regarding set([])?

Peter Otten __peter__ at web.de
Fri Jul 10 10:59:53 EDT 2009


vox wrote:

> On Jul 10, 4:17 pm, Dave Angel <da... at ieee.org> wrote:
>> vox wrote:
>> > On Jul 10, 2:04 pm, Peter Otten <__pete... at web.de> wrote:
>>
>> >> You are probably misinterpreting len(s3). s3 contains lines occurring
>> >> in "file1" but not in "file2". Duplicate lines are only counted once,
>> >> and the order doesn't matter.
>>
>> >> So there are 119 lines that occur at least once in "file1", but not in
>> >> "file2".
>>
>> >> If that is not what you want you have to tell us what exactly you are
>> >> looking for.
>>
>> >> Peter
>>
>> > Hi,
>> > Thanks for the answer.
>>
>> > I am looking for a script that compares file1 and file2, for each line
>> > in file1, check if line is present in file2. If the line from file1 is
>> > not present in file2, print that line/write it to file3, because I
>> > have to know what lines to add to file2.
>>
>> > BR,
>> > Andy
>>
>> There's no more detail in that response.  To the level of detail you
>> provide, the program works perfectly.  Just loop through the set and
>> write the members to the file.
>>
>> But you have some unspecified assumptions:
>> 1) order doesn't matter
>> 2) duplicates are impossible in the input file, or at least not
>> meaningful.  So the correct output file could very well be smaller than
>> either of the input files.
>>
>> And a few others that might matter:
>> 3) the two files are both text files, with identical line endings
>> matching your OS default
>> 4) the two files are ASCII, or at least 8 bit encoded, using the
>> same encoding  (such as both UTF-8)
>> 5) the last line of each file DOES have a trailing newline sequence
> 
> Thanks all for the input!
> I guess I have to think it through a couple of times more. :)

Indeed. Note that others thinking through related problems have come up with

http://docs.python.org/library/difflib.html
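
For illustration, here is a minimal sketch of both approaches, assuming Python 3 and
lines already read into lists. The name missing_lines and the sample data are made
up for this example; they are not from the original script. The first part keeps
the order of "file1" and reports each missing line once, which the plain set
difference does not guarantee. The second part hands the same data to
difflib.unified_diff, which compares the files as sequences, so its output also
reflects position and duplicates.

```python
import difflib


def missing_lines(lines1, lines2):
    """Return the lines of lines1 that never occur in lines2.

    Order follows lines1, and a missing line that is repeated in
    lines1 is reported only once. Trailing newlines are ignored.
    """
    seen = set(line.rstrip("\n") for line in lines2)
    result = []
    for line in lines1:
        key = line.rstrip("\n")
        if key not in seen:
            seen.add(key)  # avoid reporting the same missing line twice
            result.append(key)
    return result


# Hypothetical stand-ins for the contents of "file1" and "file2".
file1 = ["alpha", "beta", "gamma", "beta", "delta"]
file2 = ["gamma", "alpha"]

# Lines that would have to be added to file2.
to_add = missing_lines(file1, file2)
print(to_add)

# Sequence-based comparison: what turns file2 into file1.
diff = list(difflib.unified_diff(file2, file1,
                                 fromfile="file2", tofile="file1",
                                 lineterm=""))
for line in diff:
    print(line)
```

With real files, the lists could come from open("file1").readlines() and
open("file2").readlines(), and the result of missing_lines could be written to
"file3" one line at a time.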

Peter



