Is there a unique method in python to unique a list?

John H. Li typetoken at gmail.com
Sun Sep 9 03:29:45 EDT 2012


One more test result to add, if I use your first method to unique:

seen = set()
uniqued = []
for x in original:
    if not x in seen:
        seen.add(x)
        uniqued.append(x)

The result pops up in a few seconds. It makes a dramatic difference.
Thanks. See the following faster code:
>>> import nltk
>>> from nltk.corpus import wordnet as wn
>>> def average_polysemy(pos):
...     synset_list = list(wn.all_synsets(pos))
...     sense_number = 0
...     lemma_list = []
...     for synset in synset_list:
...         lemma_list.extend(synset.lemma_names)
...     unique_lemma_list = []
...     seen = set()
...     for w in lemma_list:
...         if not w in seen:
...             seen.add(w)
...             unique_lemma_list.append(w)
...     for lemma in unique_lemma_list:
...         sense_number_new = len(wn.synsets(lemma, pos))
...         sense_number = sense_number + sense_number_new
...     return sense_number/len(unique_lemma_list)

>>> average_polysemy('n')
1
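
A side note on the 1 printed above: if this session ran under Python 2,
`/` between two integers floors the result, so any average between 1 and 2
prints as 1. The sketch below is only an illustration under stated
assumptions. It swaps WordNet for a toy dictionary of made-up sense counts,
and the names `unique_in_order`, `average_senses`, `toy_senses` and
`toy_lemmas` are hypothetical, not part of NLTK. It shows the same
set-backed, order-preserving dedup with the division forced to floating
point.

```python
from collections import OrderedDict


def unique_in_order(items):
    """Return the distinct items of `items`, keeping first-seen order."""
    seen = set()          # fast membership tests
    uniqued = []          # preserves the original order
    for x in items:
        if x not in seen:
            seen.add(x)
            uniqued.append(x)
    return uniqued


def average_senses(lemma_names, count_senses):
    """Average count_senses(lemma) over the distinct lemmas."""
    unique_lemmas = unique_in_order(lemma_names)
    total = sum(count_senses(lemma) for lemma in unique_lemmas)
    # float() avoids integer floor division on Python 2.
    return total / float(len(unique_lemmas))


# Toy stand-in for WordNet: hypothetical sense counts, not real data.
toy_senses = {"dog": 1, "cat": 2, "bank": 2}
toy_lemmas = ["dog", "cat", "dog", "bank", "cat"]

unique_lemmas = unique_in_order(toy_lemmas)
average = average_senses(toy_lemmas, toy_senses.get)

# OrderedDict.fromkeys keeps the first occurrence of each key, so it gives
# the same order-preserving result in one expression.
via_ordered_dict = list(OrderedDict.fromkeys(toy_lemmas))
```

In the real function, `count_senses` would correspond to something like
`lambda lemma: len(wn.synsets(lemma, pos))`, as in the code above.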

On Sun, Sep 9, 2012 at 3:18 PM, John H. Li <typetoken at gmail.com> wrote:

> Thanks again. What you explain is reasonable. I tried the second method
> to unique the list. It turns out that Python just runs and runs
> without producing a result, probably because it has to iterate over a
> long list in my example and is slow.
>
> >>> def average_polysemy(pos):
> ...     synset_list = list(wn.all_synsets(pos))
> ...     sense_number = 0
> ...     lemma_list = []
> ...     for synset in synset_list:
> ...         lemma_list.extend(synset.lemma_names)
> ...     unique_lemma_list = []
> ...     for w in lemma_list:
> ...         if not w in unique_lemma_list:
> ...             unique_lemma_list.append(w)
> ...     for lemma in unique_lemma_list:
> ...         sense_number_new = len(wn.synsets(lemma, pos))
> ...         sense_number = sense_number + sense_number_new
> ...     return sense_number/len(unique_lemma_list)
>
> >>> average_polysemy('n')
>
> On Sun, Sep 9, 2012 at 2:36 PM, Donald Stufft <donald.stufft at gmail.com> wrote:
>
>> For a short list the difference is going to be negligible.
>>
>> For a long list the difference is that checking whether an item is in a
>> list requires iterating over the list internally to find it, but checking
>> whether an item is in a set uses a hash lookup, which doesn't require
>> iterating over the elements. This doesn't matter if you have 20 or 30
>> items, but imagine if instead you have 50 million items. You're going to
>> be iterating over the list a lot, and that can introduce a significant
>> slowdown.
>>
>> Using a set is faster in that case, but because the set keeps an extra
>> reference to every unique item alongside the list, it uses more memory.
>>
>> On Sunday, September 9, 2012 at 2:31 AM, John H. Li wrote:
>>
>> Thanks first. I can understand the second approach easily. The first
>> approach is a bit puzzling. Why are seen = set() and seen.add(x) still
>> necessary there if we can use uniqued.append(x) alone? Thanks for your
>> enlightenment.
>>
>> On Sun, Sep 9, 2012 at 1:59 PM, Donald Stufft <donald.stufft at gmail.com> wrote:
>>
>> seen = set()
>> uniqued = []
>> for x in original:
>>     if not x in seen:
>>         seen.add(x)
>>         uniqued.append(x)
>>
>> or
>>
>> uniqued = []
>> for x in original:
>>     if not x in uniqued:
>>         uniqued.append(x)
>>
>> The difference between them is that option #1 is faster but uses more
>> memory (an extra set hanging around), whereas the second is slower
>> (``in`` is slower on lists than on sets) but uses less memory.
>>
>> On Sunday, September 9, 2012 at 1:56 AM, John H. Li wrote:
>>
>> Many thanks. If I want to keep the order, how can I deal with it?
>> Or can we use list(set([1, 1, 2, 3, 4])) to get [1, 2, 3, 4]?
>>
>>
>> On Sun, Sep 9, 2012 at 1:47 PM, Donald Stufft <donald.stufft at gmail.com> wrote:
>>
>> If you don't need to retain order you can just use a set:
>>
>> set([1, 1, 2, 3, 4]) == set([1, 2, 3, 4])
>>
>> But sets don't retain order.
>>
>> On Sunday, September 9, 2012 at 1:43 AM, Token Type wrote:
>>
>> Is there a unique method in python to unique a list? thanks