Is there a unique method in python to unique a list?

Donald Stufft donald.stufft at gmail.com
Sun Sep 9 02:36:24 EDT 2012


For a short list the difference is going to be negligible. 

For a long list, the difference is that checking whether an item is in a list requires iterating over the list internally to find it, whereas checking whether an item is in a set uses a hash lookup that doesn't require scanning every element. This doesn't matter if you have 20 or 30 items, but imagine you have 50 million instead. You're going to be iterating over the list a lot, and that can introduce a significant slowdown.
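
As a rough illustration of that difference, here is a minimal timing sketch. The names (items_list, items_set, needle) and the size of 100,000 are made up for the example, and the actual numbers will vary by machine:

```python
import timeit

# Hypothetical data: 100,000 integers. We look up the last one,
# which is the worst case for a linear scan of the list.
items_list = list(range(100_000))
items_set = set(items_list)
needle = 99_999

# `in` on a list compares elements one by one;
# `in` on a set does a hash lookup instead.
list_time = timeit.timeit(lambda: needle in items_list, number=100)
set_time = timeit.timeit(lambda: needle in items_set, number=100)

print(f"list: {list_time:.6f}s  set: {set_time:.6f}s")
```

On most machines the list lookup should come out noticeably slower, and the gap grows with the size of the list.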

Using a set is faster in that case, but because the set holds an additional reference to every unique item alongside the list, it uses more memory.
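
To make the trade-off concrete, the two approaches quoted below can be wrapped as functions. This is only a sketch: the function names and the sample data are hypothetical, and sys.getsizeof reports the size of the set container itself, not of the items it refers to:

```python
import sys


def unique_with_set(original):
    """Order-preserving dedup; fast lookups, extra memory for `seen`."""
    seen = set()
    uniqued = []
    for x in original:
        if x not in seen:  # hash lookup
            seen.add(x)
            uniqued.append(x)
    return uniqued


def unique_list_only(original):
    """Order-preserving dedup; no extra set, but each lookup scans the list."""
    uniqued = []
    for x in original:
        if x not in uniqued:  # linear scan
            uniqued.append(x)
    return uniqued


data = [3, 1, 3, 2, 1]
print(unique_with_set(data))   # [3, 1, 2]
print(unique_list_only(data))  # [3, 1, 2]

# The extra cost of option #1: the set that lives alongside the result list.
extra_bytes = sys.getsizeof(set(data))
print(extra_bytes)
```

Both functions require the items to be comparable with ==, and the set version also requires them to be hashable. On Python 3.7 and later, list(dict.fromkeys(original)) gives the same order-preserving result.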


On Sunday, September 9, 2012 at 2:31 AM, John H. Li wrote:

> Thanks. I can understand the second approach easily, but the first one is a bit puzzling. Why are seen = set() and seen.add(x) still necessary there if we can use uniqued.append(x) alone? Thanks for the explanation.
> 
> On Sun, Sep 9, 2012 at 1:59 PM, Donald Stufft <donald.stufft at gmail.com> wrote:
> > seen = set()
> > uniqued = []
> > for x in original:
> >     if x not in seen:
> >         seen.add(x)
> >         uniqued.append(x)
> > 
> > or
> > 
> > uniqued = []
> > for x in original:
> >     if x not in uniqued:
> >         uniqued.append(x)
> > 
> > The difference is that option #1 is more efficient speed-wise but uses more memory (an extra set hanging around), whereas option #2 is slower (``in`` is slower on lists than on sets) but uses less memory.
> > 
> > On Sunday, September 9, 2012 at 1:56 AM, John H. Li wrote:
> > 
> > > Many thanks. If I want to keep the order, how can I deal with it?
> > > Or we can use list(set([1, 1, 2, 3, 4])) to get [1, 2, 3, 4].
> > > 
> > > 
> > > On Sun, Sep 9, 2012 at 1:47 PM, Donald Stufft <donald.stufft at gmail.com> wrote:
> > > > If you don't need to retain order, you can just use a set:
> > > > 
> > > > set([1, 1, 2, 3, 4]) == set([1, 2, 3, 4])
> > > > 
> > > > But sets don't retain order.
> > > > 
> > > > On Sunday, September 9, 2012 at 1:43 AM, Token Type wrote:
> > > > 
> > > > > Is there a unique method in Python to unique a list? Thanks.
> > > > > -- 
> > > > > http://mail.python.org/mailman/listinfo/python-list
