[Numpy-discussion] Does a `mergesorted` function make sense?

Jaime Fernández del Río jaime.frio at gmail.com
Wed Aug 27 13:02:52 EDT 2014


A request was opened on GitHub to add a `merge` function to numpy that would
merge two sorted 1d arrays into a single sorted 1d array. I have been
playing around with that idea for a while, and have a branch in my numpy
fork that adds a `mergesorted` function to `numpy.lib`:

https://github.com/jaimefrio/numpy/commit/ce5d480afecc989a36e5d2bf4ea1d1ba58a83b0a

I drew inspiration from C++ STL algorithms, and merged into a single
function what merge, set_union, set_intersection, set_difference and
set_symmetric_difference do there.
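
To illustrate why these operations can share one code path, here is a rough
sketch (not the code in the branch above) of how union, intersection and
symmetric difference fall out of an already merged array. The helper name
`set_ops_from_merged` is made up for this example, and it assumes each input
array had no repeated values and that the merged array is non-empty.

```python
import numpy as np

def set_ops_from_merged(merged):
    """Derive set results from a merged sorted array (hypothetical helper).

    Assumes `merged` is non-empty and is the sorted merge of two arrays
    that each contain unique values, so any value appears at most twice.
    """
    merged = np.asarray(merged)
    # True where an item equals the one right after it, i.e. it was in both inputs.
    dup_next = merged[1:] == merged[:-1]
    intersection = merged[:-1][dup_next]
    # Union: drop the second copy of every duplicated value.
    union = merged[np.concatenate(([True], ~dup_next))]
    # Symmetric difference: keep items that differ from both neighbours.
    flag = np.concatenate(([True], ~dup_next, [True]))
    sym_diff = merged[flag[1:] & flag[:-1]]
    return union, intersection, sym_diff

union, intersection, sym_diff = set_ops_from_merged([1, 2, 3, 3, 5, 6])
```

Set difference is left out because it needs to know which input each item came
from, which this sketch does not track.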

My first thought when implementing this was to not make it a public
function, but use it under the hood to speed up some of the functions of
`arraysetops.py`, which are now merging two already sorted arrays by
doing `np.sort(np.concatenate((a, b)))`. I would need to revisit my
testing, but the speed-ups weren't that great.
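
For reference, a minimal pure-NumPy sketch of merging two sorted arrays
without a full sort, next to the `np.sort(np.concatenate((a, b)))` approach.
This is not the implementation in the branch; `merge_sorted_sketch` is a
hypothetical name, and it relies on `np.searchsorted` rather than a dedicated
merge loop.

```python
import numpy as np

def merge_sorted_sketch(a, b):
    """Merge two sorted 1-D arrays (hypothetical helper, not the branch code)."""
    a = np.asarray(a)
    b = np.asarray(b)
    # Final position of each item of b: its insertion point in a, shifted by
    # the number of items of b placed before it. side='right' puts ties from a first.
    idx_b = np.searchsorted(a, b, side='right') + np.arange(len(b))
    out = np.empty(len(a) + len(b), dtype=np.result_type(a, b))
    from_b = np.zeros(len(out), dtype=bool)
    from_b[idx_b] = True
    out[from_b] = b      # items of b go to their computed slots
    out[~from_b] = a     # items of a fill the remaining slots in order
    return out

a = np.array([1, 3, 5])
b = np.array([2, 3, 6])
merged = merge_sorted_sketch(a, b)
baseline = np.sort(np.concatenate((a, b)))  # the current arraysetops.py approach
```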

One other thing I saw value in for some of the `arraysetops.py` functions,
but couldn't fully figure out, was in providing extra output aside from the
merged arrays, either in the form of indices, or of boolean masks,
indicating which items of the original arrays made it into the merged one,
and/or where they ended up in it.
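
One possible shape for that extra output, sketched with a stable argsort
instead of a true merge. The function name and the particular return values
(an origin mask plus destination indices for each input) are only assumptions
for discussion, not a proposed interface.

```python
import numpy as np

def merge_with_indices_sketch(a, b):
    """Merge sorted a and b and report where items went (hypothetical helper)."""
    a = np.asarray(a)
    b = np.asarray(b)
    both = np.concatenate((a, b))
    # A stable sort keeps items of a ahead of equal items of b.
    order = np.argsort(both, kind='mergesort')
    merged = both[order]
    # Boolean mask over the merged array: True where the item came from b.
    from_b = order >= len(a)
    # Invert the permutation to get each input item's position in merged.
    dest = np.empty(len(both), dtype=np.intp)
    dest[order] = np.arange(len(both))
    return merged, from_b, dest[:len(a)], dest[len(a):]

merged, from_b, idx_a, idx_b = merge_with_indices_sketch([1, 3, 5], [2, 3, 6])
```

With these outputs, `merged[idx_a]` gives back `a` and `merged[idx_b]` gives
back `b`.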

Since there is at least one other person out there who likes it, is there
any more interest in such a function? If yes, any comments on what the
proper interface for extra output should be? Although perhaps it is best
to leave that out for starters and see what use people make of it, if any.

Jaime

-- 
(\__/)
( O.o)
( > <) This is Conejo. Copy Conejo into your signature and help him with his
plans for world domination.