Is there a unique method in python to unique a list?

John H. Li typetoken at gmail.com
Sun Sep 9 03:18:54 EDT 2012


Thanks again. What you explain is reasonable. I tried the second method to
unique the list. It turns out that Python just runs and runs without returning
a result, probably because it iterates over a long list in my example and is slow.

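>>> from nltk.corpus import wordnet as wn
>>> def average_polysemy(pos):
        synset_list = list(wn.all_synsets(pos))
        sense_number = 0
        lemma_list = []
        for synset in synset_list:
            # lemma_names is an attribute in NLTK 2.x; NLTK 3 makes it a method
            lemma_list.extend(synset.lemma_names)
        # Second method: each membership test scans the growing list,
        # so this loop is slow when lemma_list is long.
        unique_lemma_list = []
        for w in lemma_list:
            if w not in unique_lemma_list:
                unique_lemma_list.append(w)
        for lemma in unique_lemma_list:
            sense_number_new = len(wn.synsets(lemma, pos))
            sense_number = sense_number + sense_number_new
        # float() avoids integer division under Python 2
        return sense_number / float(len(unique_lemma_list))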

>>> average_polysemy('n')
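
For comparison, here is a minimal sketch of the set-based approach from the quoted reply below, wrapped as a helper. The name `unique_in_order` is hypothetical (it is not part of the standard library or NLTK), and the sketch assumes the items are hashable, which strings such as lemma names are.

```python
def unique_in_order(items):
    """Return the distinct items of `items`, keeping first-seen order."""
    seen = set()    # hash-based membership test instead of a list scan
    uniqued = []
    for x in items:
        if x not in seen:
            seen.add(x)
            uniqued.append(x)
    return uniqued


print(unique_in_order([1, 1, 2, 3, 4, 2]))             # [1, 2, 3, 4]
print(unique_in_order(["dog", "cat", "dog", "bird"]))  # ['dog', 'cat', 'bird']
```

In average_polysemy, building unique_lemma_list with a helper like this instead of the inner "if w not in unique_lemma_list" loop should avoid rescanning the growing list for every lemma, at the cost of the extra set in memory.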

On Sun, Sep 9, 2012 at 2:36 PM, Donald Stufft <donald.stufft at gmail.com> wrote:

> For a short list the difference is going to be negligible.
>
> For a long list the difference is that checking whether an item is in a list
> requires iterating over the list internally to find it, whereas checking
> whether an item is in a set uses a hash lookup that doesn't require iterating
> over the elements. This doesn't matter if you have 20 or 30 items, but imagine
> if instead you have 50 million items. You're going to be iterating over the
> list a lot, and that can introduce a significant slowdown.
>
> On the other hand, using a set is faster in that case, but because you are
> keeping an additional set that refers to every item, you are using more
> memory.
>
> On Sunday, September 9, 2012 at 2:31 AM, John H. Li wrote:
>
> Thanks first. I could understand the second approach easily. The first
> approach is a bit puzzling. Why are seen = set() and seen.add(x) still
> necessary there if we can use uniqued.append(x) alone? Thanks for your
> enlightenment.
>
> On Sun, Sep 9, 2012 at 1:59 PM, Donald Stufft <donald.stufft at gmail.com> wrote:
>
> seen = set()
> uniqued = []
> for x in original:
>     if x not in seen:
>         seen.add(x)
>         uniqued.append(x)
>
> or
>
> uniqued = []
> for x in original:
>     if x not in uniqued:
>         uniqued.append(x)
>
> The difference between the two is that option #1 is more efficient
> speed-wise but uses more memory (an extra set hanging around), whereas the
> second is slower (``in`` is slower on lists than on sets) but uses less memory.
>
> On Sunday, September 9, 2012 at 1:56 AM, John H. Li wrote:
>
> Many thanks. If I want to keep the order, how can I deal with it?
> Or can we use list(set([1, 1, 2, 3, 4])) to get [1, 2, 3, 4]?
>
>
> On Sun, Sep 9, 2012 at 1:47 PM, Donald Stufft <donald.stufft at gmail.com> wrote:
>
> If you don't need to retain order you can just use a set:
>
> set([1, 1, 2, 3, 4]) == set([1, 2, 3, 4])
>
> But sets don't retain order.
>
> On Sunday, September 9, 2012 at 1:43 AM, Token Type wrote:
>
> Is there a unique method in python to unique a list? thanks