removing duplicates, or, converting Set() to string

John Machin sjmachin at lexicon.net
Wed Jul 26 20:22:17 EDT 2006


maphew at gmail.com wrote:
> Hello,
>
> I have some lists for which I need to remove duplicates. I found the
> sets.Sets() module which does exactly this

I think you mean that you found the sets.Set() constructor in the sets
module.
If you are using Python 2.4 or later, use the built-in set() instead.
If you are using Python 2.3, consider upgrading if you can.

> but how do I get the set
> back out again?


>
> # existing input: A,B,B,C,D
> # desired result: A,B,C,D
>
> import sets
> dupes = ['A','B','B','C','D']
> clean = sets.Set(dupes)
>
> out = open('clean-list.txt','w')
> out.write(clean)
> out.close
>
> ---
> out.write(clean) fails with "TypeError: argument 1 must be string or
> read-only character buffer, not Set"

as expected

> and out.write( str(clean) )
> creates "Set(['A', 'C', 'B', 'D'])" instead of just A,B,C,D.

again as expected.

BTW, in practice you'd probably want to append '\n' to the string that
you're writing.

You should be able to get a (possibly unsorted) list of the contents of
*any* container like this (but note that dictionaries divulge only
their keys):

>>> dupes = ['A','B','B','C','D']
>>> clean = set(dupes) # the Python 2.4+ way
>>> clean
set(['A', 'C', 'B', 'D'])
>>> [x for x in clean]
['A', 'C', 'B', 'D']
>>>

list(clean) would work as well.
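
For instance, here is a small sketch of the list() route, including the
dictionary case mentioned above. The dictionary and its contents are
made up purely for illustration:

```python
dupes = ['A', 'B', 'B', 'C', 'D']
clean = set(dupes)

# Same elements as [x for x in clean]; the order is not guaranteed.
as_list = list(clean)

# A dictionary divulges only its keys when you iterate over it.
counts = {'A': 1, 'B': 2}   # hypothetical example data
keys = list(counts)
```

In both cases you get a plain list back, which you can then sort, join
or write out as you please.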

If you want the output sorted, then use the list.sort() method, which
sorts the list in place and returns None. Details in the manual.
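
Putting the pieces together, something along these lines should produce
the A,B,C,D line you were after. Treat it as a minimal sketch: it
assumes every item is already a string (str.join() requires that) and
it reuses the file name from your post.

```python
dupes = ['A', 'B', 'B', 'C', 'D']

clean = list(set(dupes))   # drop duplicates; order is arbitrary here
clean.sort()               # sorts in place and returns None

line = ','.join(clean)     # one comma-separated string

out = open('clean-list.txt', 'w')
out.write(line + '\n')     # write() wants a string, so join first
out.close()                # needs the parentheses; a bare out.close
                           # only names the method, it does not call it
```

If the items were not strings, you would need to convert each one with
str() before joining.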

HTH,
John



