[SciPy-User] get best few of many: argsort( few= ) using std::partial_sort ?

Keith Goodman kwgoodman at gmail.com
Thu Jul 8 13:30:28 EDT 2010


On Thu, Jul 8, 2010 at 10:17 AM, denis <denis-bz-gg at t-online.de> wrote:
> Folks,
>  to get the best few of a large number of objects,
> e.g. vectors near a given one, or small distances in
> spatial.distance.cdist or .pdist,
> argsort( bigArray )[: a few ] is not so hot.  It would be nice if
>    argsort( bigArray, few= )
> did this -- faster, save mem too. Would anyone else find this useful ?
>
> I recently stumbled across partial_sort in stl; fwiw,
> std:partial_sort( A, A + sqrt(N), A + N ) is ~ 10 times faster than
> std:sort
> on my old mac ppc, even for N 100.
> Also fwiw, nth_element alone is ~ twice as slow as partial_sort --
> odd.

I think a lot of people would like a partial sort. It comes up on the
list now and then. There's a ticket with a cython partial sort:

http://projects.scipy.org/numpy/ticket/1213

Here's the docstring:

def select(a, k, inplace=False):
    '''
    Wirth's version of Hoare's quick select

    Parameters
    ----------
    a : array_like
    k : integer
    inplace : boolean
        The partial sort is done inplace if a is a
        contiguous ndarray and inplace=True.
        Default: False.

    Returns
    -------
    out : ndarray
        Partially sorted a such that out[k] is
        the k largest element. Elements smaller than
        out[k] are unsorted in out[:k]. Elements larger
        than out[k] are unsorted in out[k:].

    '''



More information about the SciPy-User mailing list