Delete duplicate rows in textfile - except it contains a "{" or "}"
Mark Lawrence
breamoreboy at yahoo.co.uk
Wed Oct 10 05:28:24 EDT 2012
On 10/10/2012 09:51, Joon Ki Choi wrote:
>
> Hello Pythonistas,
>
> i have a very large textfile with contents like:
>
> @INBOOK{Ackermann1999-b,
> author = {Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann,
> K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F.
> and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and
> Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann,
> K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F.
> and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and
> Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann,
> K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F.
> and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and
> Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann},
> year = {1980},
> timestamp = {1995-12-02}
> }
>
> And i want to delete the duplicate rows except these rows containing the brackets { or }.
> The result should look like:
>
> @INBOOK{Ackermann1999-b,
> author = {Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann,
> Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann},
> year = {1980},
> timestamp = {1995-12-02}
> }
>
> I come across with this Python-Skript:
>
> lines_seen = set() # holds lines already seen
> outfile = open("literatur_clean.txt", "w")
Slight aside, you could use this so there's no need to explicitly close
the file.
with open("literatur_dupl.txt", "r") as infile:
> for line in infile:
>     if line not in lines_seen: # not a duplicate
>         outfile.write(line)
>         lines_seen.add(line)
Replace the test above with something like:-
if "{" in line or "}" in line or line not in lines_seen:
> outfile.close()
>
> But it deletes also the lines with a closing bracket } and the lines with the same authordata.
> Therefor i need the condition of the brackets.
>
> Could someone point me out to adding this condition?
>
> Thanks in advance,
> Joon
>
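Putting the two suggestions together, a minimal sketch might look like the
following. The function names dedupe_keep_braces and clean_file are made up
for illustration; the file names are the ones from your script. It assumes
that "duplicate" means an exact repeat of a whole line, as in your original
code.

```python
def dedupe_keep_braces(lines):
    """Yield each line once, except that lines containing '{' or '}'
    are always kept. (Hypothetical helper name, for illustration only.)"""
    lines_seen = set()  # holds lines already seen
    for line in lines:
        if "{" in line or "}" in line or line not in lines_seen:
            yield line
            lines_seen.add(line)


def clean_file(src, dst):
    """Copy src to dst, filtering through dedupe_keep_braces.
    (Hypothetical helper name.) Both files are closed automatically
    when the with block ends."""
    with open(src, "r") as infile, open(dst, "w") as outfile:
        outfile.writelines(dedupe_keep_braces(infile))
```

You would call it as clean_file("literatur_dupl.txt", "literatur_clean.txt").
Note that lines_seen covers the whole file, so a brace-free line that repeats
in a later entry is dropped there too, and each distinct wrapped author line
survives once, which may leave more lines than in your hand-made example.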
--
Cheers.
Mark Lawrence.