[python-win32] Identify unique data from sequence array
Mike Diehn
mike.diehn at ansys.com
Wed Dec 22 17:01:52 CET 2010
I'm a Unix guy. We call that a sort-uniq operation, after the
pipeline we'd use: sort datafile | uniq > uniq-lines.txt. So I googled that
along with "python" and ....
As Jason Petrone wrote when he withdrew PEP 270 in
http://www.python.org/dev/peps/pep-0270/:
"creating a sequence without duplicates is just a matter of
choosing a different data structure: a set instead of a list."
At the time, sets.py was a nifty new thing. Since then, the set datatype
has been added to Python's built-in types.
set() can consume a list of tuples, but not a list of lists like the X you
showed us, because lists are mutable and therefore not hashable. Your job
will be getting your massive list of lists into a list of tuples.
This works, but for your very large arrays it may take a long time:
X = [[1,2], [1,2], [3,4], [3,4]]
Y = set( [tuple(x) for x in X] )
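If you want to see why the tuple conversion is needed, here is a minimal sketch. The names list_of_lists_ok and unique_rows are mine, just for illustration. It also sorts the result to get back the "sort | uniq" ordering as a list of lists:

```python
X = [[1, 2], [1, 2], [3, 4], [3, 4]]

# Lists are mutable, hence unhashable, so set() rejects them.
try:
    set(X)
    list_of_lists_ok = True
except TypeError:
    list_of_lists_ok = False

# Tuples are hashable, so the converted rows go into the set fine.
Y = set(tuple(x) for x in X)

# sorted() restores a "sort | uniq" style ordering; convert each
# tuple back to a list if the rest of your code expects lists.
unique_rows = sorted(list(y) for y in Y)

print(list_of_lists_ok)
print(unique_rows)
```

Keep in mind that a set has no defined order, so the sorted() step is only needed if you care about the order of the output.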
There may be faster methods. The map() function might help, but I really
don't know. Here's something to try:
Y = set( map(tuple, X) )
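Since I don't know which one is faster, the honest answer is to measure. Here's a rough sketch using the standard timeit module. The test data and the function names with_comprehension and with_map are made up for the example, so substitute your real array before trusting the numbers:

```python
import timeit

# Hypothetical test data: 10,000 rows with many repeats.
X = [[i % 100, i % 7] for i in range(10000)]

def with_comprehension():
    # Build a list of tuples first, then hand it to set().
    return set([tuple(x) for x in X])

def with_map():
    # Let map() apply tuple() to each row.
    return set(map(tuple, X))

# Both approaches must produce the same set of unique rows.
same = with_comprehension() == with_map()

# Run each version 20 times and report the total elapsed seconds.
t_comp = timeit.timeit(with_comprehension, number=20)
t_map = timeit.timeit(with_map, number=20)

print("same result: %s" % same)
print("comprehension: %.4f s, map: %.4f s" % (t_comp, t_map))
```

The timings will differ between machines and Python versions, so treat them as a comparison on your own setup, not as a general result.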
Or you can go the old-school route, from before the days of set():
http://code.activestate.com/recipes/52560-remove-duplicates-from-a-sequence/
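One thing a set won't give you is the original order of the rows. If that matters, a dictionary of already-seen rows does the job. To be clear, this is my own sketch of that style of approach, not the code from the recipe, and the name unique_rows_in_order is made up:

```python
def unique_rows_in_order(rows):
    """Return rows with duplicates dropped, keeping first-seen order."""
    seen = {}      # keys are rows already emitted, as hashable tuples
    result = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen[key] = True
            result.append(row)
    return result

X = [[3, 4], [1, 2], [3, 4], [1, 2]]
print(unique_rows_in_order(X))
```

It makes a single pass over the data, so it should scale about as well as the set() versions, at the cost of a little more code.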
Best,
Mike
On Wed, Dec 22, 2010 at 10:28 AM, Aahz <aahz at pythoncraft.com> wrote:
> On Wed, Dec 22, 2010, otrov wrote:
> >
> > I failed in my first idea to solve this problem with matlab/octave,
> > as I just started using this tools for data manipulation, and then
> > thought to try python as more feature rich descriptive language and
> > post this problem to python group I'm subscribed already
>
> You may get better answers posting to a general Python group (e.g.
> comp.lang.python).
> --
> Aahz (aahz at pythoncraft.com) <*>
> http://www.pythoncraft.com/
>
> "Think of it as evolution in action." --Tony Rand
> _______________________________________________
> python-win32 mailing list
> python-win32 at python.org
> http://mail.python.org/mailman/listinfo/python-win32
>
--
Mike Diehn
Senior Systems Administrator
ANSYS, Inc - Lebanon, NH Office
mike.diehn at ansys.com, (603) 727-5492