[Scipy-svn] r6884 - trunk/scipy/cluster

Sun Nov 14 04:58:41 EST 2010

Author: rgommers
Date: 2010-11-14 03:58:41 -0600 (Sun, 14 Nov 2010)
New Revision: 6884

Modified:
   trunk/scipy/cluster/__init__.py
   trunk/scipy/cluster/hierarchy.py
   trunk/scipy/cluster/vq.py
Log:
DOC: merge wiki edits for cluster module.

Modified: trunk/scipy/cluster/__init__.py
===================================================================

--- trunk/scipy/cluster/__init__.py	2010-11-14 09:58:13 UTC (rev 6883)
+++ trunk/scipy/cluster/__init__.py	2010-11-14 09:58:41 UTC (rev 6884)
@@ -1,3 +1,20 @@
+"""
+Vector Quantization / Kmeans
+============================
+Clustering algorithms are useful in information theory, target detection,
+communications, compression, and other areas.  The `vq` module only
+supports vector quantization and the k-means algorithms.  Development of
+self-organizing maps (SOM) and other approaches is underway.
+
+Hierarchical Clustering
+=======================
+The `hierarchy` module provides functions for hierarchical and
+agglomerative clustering.  Its features include generating hierarchical
+clusters from distance matrices, computing distance matrices from
+observation vectors, calculating statistics on clusters, cutting linkages
+to generate flat clusters, and visualizing clusters with dendrograms.
+
+"""
 #
 # spatial - Distances
 #

Modified: trunk/scipy/cluster/hierarchy.py
===================================================================
--- trunk/scipy/cluster/hierarchy.py	2010-11-14 09:58:13 UTC (rev 6883)
+++ trunk/scipy/cluster/hierarchy.py	2010-11-14 09:58:41 UTC (rev 6884)
@@ -6,109 +6,75 @@
 or find the roots of the forest formed by a cut by providing the flat
 cluster ids of each observation.
 
-+------------------+-------------------------------------------------+
-|*Function*        | *Description*                                   |
-+------------------+-------------------------------------------------+
-|fcluster          |forms flat clusters from hierarchical clusters.  |
-+------------------+-------------------------------------------------+
-|fclusterdata      |forms flat clusters directly from data.          |
-+------------------+-------------------------------------------------+
-|leaders           |singleton root nodes for flat cluster.           |
-+------------------+-------------------------------------------------+
+.. autosummary::
+   :toctree: generated/
 
+   fcluster
+   fclusterdata
+   leaders
+
 These are routines for agglomerative clustering.
 
-+------------------+-------------------------------------------------+
-|*Function*        | *Description*                                   |
-+------------------+-------------------------------------------------+
-|linkage           |agglomeratively clusters original observations.  |
-+------------------+-------------------------------------------------+
-|single            |the single/min/nearest algorithm. (alias)        |
-+------------------+-------------------------------------------------+
-|complete          |the complete/max/farthest algorithm. (alias)     |
-+------------------+-------------------------------------------------+
-|average           |the average/UPGMA algorithm. (alias)             |
-+------------------+-------------------------------------------------+
-|weighted          |the weighted/WPGMA algorithm. (alias)            |
-+------------------+-------------------------------------------------+
-|centroid          |the centroid/UPGMC algorithm. (alias)            |
-+------------------+-------------------------------------------------+
-|median            |the median/WPGMC algorithm. (alias)              |
-+------------------+-------------------------------------------------+
-|ward              |the Ward/incremental algorithm. (alias)          |
-+------------------+-------------------------------------------------+
+.. autosummary::
+   :toctree: generated/
 
+   linkage
+   single
+   complete
+   average
+   weighted
+   centroid
+   median
+   ward
+
 These routines compute statistics on hierarchies.
 
-+------------------+-------------------------------------------------+
-|*Function*        | *Description*                                   |
-+------------------+-------------------------------------------------+
-|cophenet          |computes the cophenetic distance between leaves. |
-+------------------+-------------------------------------------------+
-|from_mlab_linkage |converts a linkage produced by MATLAB(TM).       |
-+------------------+-------------------------------------------------+
-|inconsistent      |the inconsistency coefficients for cluster.      |
-+------------------+-------------------------------------------------+
-|maxinconsts       |the maximum inconsistency coefficient for each   |
-|                  |cluster.                                         |
-+------------------+-------------------------------------------------+
-|maxdists          |the maximum distance for each cluster.           |
-+------------------+-------------------------------------------------+
-|maxRstat          |the maximum specific statistic for each cluster. |
-+------------------+-------------------------------------------------+
-|to_mlab_linkage   |converts a linkage to one MATLAB(TM) can         |
-|                  |understand.                                      |
-+------------------+-------------------------------------------------+
+.. autosummary::
+   :toctree: generated/
 
+   cophenet
+   from_mlab_linkage
+   inconsistent
+   maxinconsts
+   maxdists
+   maxRstat
+   to_mlab_linkage
+
 Routines for visualizing flat clusters.
 
-+------------------+-------------------------------------------------+
-|*Function*        | *Description*                                   |
-+------------------+-------------------------------------------------+
-|dendrogram        |visualizes linkages (requires matplotlib).       |
-+------------------+-------------------------------------------------+
+.. autosummary::
+   :toctree: generated/
 
+   dendrogram
+
 These are data structures and routines for representing hierarchies as
 tree objects.
 
-+------------------+-------------------------------------------------+
-|*Function*        | *Description*                                   |
-+------------------+-------------------------------------------------+
-|ClusterNode       |represents cluster nodes in a cluster hierarchy. |
-+------------------+-------------------------------------------------+
-|leaves_list       |a left-to-right traversal of the leaves.         |
-+------------------+-------------------------------------------------+
-|to_tree           |represents a linkage matrix as a tree object.    |
-+------------------+-------------------------------------------------+
+.. autosummary::
+   :toctree: generated/
 
+   ClusterNode
+   leaves_list
+   to_tree
+
 These are predicates for checking the validity of linkage and
 inconsistency matrices as well as for checking isomorphism of two
 flat cluster assignments.
 
-+------------------+-------------------------------------------------+
-|*Function*        | *Description*                                   |
-+------------------+-------------------------------------------------+
-|is_valid_im       |checks for a valid inconsistency matrix.         |
-+------------------+-------------------------------------------------+
-|is_valid_linkage  |checks for a valid hierarchical clustering.      |
-+------------------+-------------------------------------------------+
-|is_isomorphic     |checks if two flat clusterings are isomorphic.   |
-+------------------+-------------------------------------------------+
-|is_monotonic      |checks if a linkage is monotonic.                |
-+------------------+-------------------------------------------------+
-|correspond        |checks whether a condensed distance matrix       |
-|                  |corresponds with a linkage                       |
-+------------------+-------------------------------------------------+
-|num_obs_linkage   |the number of observations corresponding to a    |
-|                  |linkage matrix.                                  |
-+------------------+-------------------------------------------------+
+.. autosummary::
+   :toctree: generated/
 
+   is_valid_im
+   is_valid_linkage
+   is_isomorphic
+   is_monotonic
+   correspond
+   num_obs_linkage
 
 * MATLAB and MathWorks are registered trademarks of The MathWorks, Inc.
 
 * Mathematica is a registered trademark of The Wolfram Research, Inc.
 
-
 References
 ----------
 
@@ -1332,38 +1298,40 @@
 def fcluster(Z, t, criterion='inconsistent', depth=2, R=None, monocrit=None):
     """
     Forms flat clusters from the hierarchical clustering defined by
-    the linkage matrix ``Z``. The threshold ``t`` is a required parameter.
+    the linkage matrix ``Z``.
 
-    :Arguments:
+    Parameters
+    ----------
+    Z : ndarray
+        The hierarchical clustering encoded with the matrix returned
+        by the `linkage` function.
+    t : float
+        The threshold to apply when forming flat clusters.
+    criterion : str, optional
+        The criterion to use in forming flat clusters. This can
+        be any of the following values:
 
-        - Z : ndarray
-          The hierarchical clustering encoded with the matrix returned
-          by the ``linkage`` function.
-
-        - t : double
-          The threshold to apply when forming flat clusters.
-
-        - criterion : string (optional)
-          The criterion to use in forming flat clusters. This can
-          be any of the following values:
-
-              * 'inconsistent': If a cluster node and all its
-              decendents have an inconsistent value less than or equal
-              to ``t`` then all its leaf descendents belong to the
+          'inconsistent':
+              If a cluster node and all its
+              descendants have an inconsistent value less than or equal
+              to ``t`` then all its leaf descendants belong to the
               same flat cluster. When no non-singleton cluster meets
               this criterion, every node is assigned to its own
               cluster. (Default)
 
-              * 'distance': Forms flat clusters so that the original
+          'distance':
+              Forms flat clusters so that the original
               observations in each flat cluster have no greater a
               cophenetic distance than ``t``.
 
-              * 'maxclust': Finds a minimum threshold ``r`` so that
+          'maxclust':
+              Finds a minimum threshold ``r`` so that
               the cophenetic distance between any two original
               observations in the same flat cluster is no more than
               ``r`` and no more than ``t`` flat clusters are formed.
 
-              * 'monocrit': Forms a flat cluster from a cluster node c
+          'monocrit':
+              Forms a flat cluster from a cluster node c
               with index i when ``monocrit[j] <= t``.
 
               For example, to threshold on the maximum mean distance
@@ -1373,38 +1341,38 @@
                 MR = maxRstat(Z, R, 3)
                 cluster(Z, t=0.8, criterion='monocrit', monocrit=MR)
 
-              * 'maxclust_monocrit': Forms a flat cluster from a
+          'maxclust_monocrit':
+              Forms a flat cluster from a
               non-singleton cluster node ``c`` when ``monocrit[i] <=
               r`` for all cluster indices ``i`` below and including
               ``c``. ``r`` is minimized such that no more than ``t``
               flat clusters are formed. monocrit must be
               monotonic. For example, to minimize the threshold t on
               maximum inconsistency values so that no more than 3 flat
-              clusters are formed, do:
+              clusters are formed, do::
 
                 MI = maxinconsts(Z, R)
                 cluster(Z, t=3, criterion='maxclust_monocrit', monocrit=MI)
 
-         - depth : int (optional)
-           The maximum depth to perform the inconsistency calculation.
-           It has no meaning for the other criteria. (default=2)
+    depth : int, optional
+        The maximum depth to perform the inconsistency calculation.
+        It has no meaning for the other criteria. Default is 2.
+    R : ndarray, optional
+        The inconsistency matrix to use for the 'inconsistent'
+        criterion. This matrix is computed if not provided.
+    monocrit : ndarray, optional
+        An array of length n-1. ``monocrit[i]`` is the
+        statistics upon which non-singleton i is thresholded. The
+        monocrit vector must be monotonic, i.e. given a node c with
+        index i, for all node indices j corresponding to nodes
+        below c, ``monocrit[i] >= monocrit[j]``.
 
-         - R : ndarray (optional)
-           The inconsistency matrix to use for the 'inconsistent'
-           criterion. This matrix is computed if not provided.
+    Returns
+    -------
+    fcluster : ndarray
+        An array of length n. T[i] is the flat cluster number to
+        which original observation i belongs.
 
-         - monocrit : ndarray (optional)
-           A ``(n-1)`` numpy vector of doubles. ``monocrit[i]`` is the
-           statistics upon which non-singleton ``i`` is thresholded. The
-           monocrit vector must be monotonic, i.e. given a node ``c`` with
-           index ``i``, for all node indices j corresponding to nodes
-           below ``c``, ``monocrit[i] >= monocrit[j]``.
-
-    :Returns:
-
-        - T : ndarray
-            A vector of length ``n``. ``T[i]`` is the flat cluster number to
-            which original observation ``i`` belongs.
     """
     Z = np.asarray(Z, order='c')
     is_valid_linkage(Z, throw=True, name='Z')
@@ -1444,67 +1412,56 @@
 def fclusterdata(X, t, criterion='inconsistent', \
                  metric='euclidean', depth=2, method='single', R=None):
     """
-    ``T = fclusterdata(X, t)``
+    Cluster observation data using a given metric.
 
-    Clusters the original observations in the ``n`` by ``m`` data
-    matrix ``X`` (``n`` observations in ``m`` dimensions), using the
-    euclidean distance metric to calculate distances between original
-    observations, performs hierarchical clustering using the single
-    linkage algorithm, and forms flat clusters using the inconsistency
-    method with t as the cut-off threshold.
+    Clusters the original observations in the n-by-m data
+    matrix X (n observations in m dimensions), using the euclidean
+    distance metric to calculate distances between original observations,
+    performs hierarchical clustering using the single linkage algorithm,
+    and forms flat clusters using the inconsistency method with `t` as the
+    cut-off threshold.
 
-    A one-dimensional numpy array ``T`` of length ``n`` is
-    returned. ``T[i]`` is the index of the flat cluster to which the
-    original observation ``i`` belongs.
+    A one-dimensional array T of length n is returned. T[i] is the index
+    of the flat cluster to which the original observation i belongs.
 
-    :Arguments:
+    Parameters
+    ----------
+    X : ndarray
+        n by m data matrix with n observations in m dimensions.
+    t : float
+        The threshold to apply when forming flat clusters.
+    criterion : str, optional
+        Specifies the criterion for forming flat clusters.  Valid
+        values are 'inconsistent' (default), 'distance', or 'maxclust'
+        cluster formation algorithms. See `fcluster` for descriptions.
+    method : str, optional
+        The linkage method to use (single, complete, average,
+        weighted, median centroid, ward). See `linkage` for more
+        information. Default is "single".
+    metric : str, optional
+        The distance metric for calculating pairwise distances. See
+        `distance.pdist` for descriptions and linkage to verify
+        compatibility with the linkage method.
+    t : double, optional
+        The cut-off threshold for the cluster function or the
+        maximum number of clusters (criterion='maxclust').
+    depth : int, optional
+        The maximum depth for the inconsistency calculation. See
+        `inconsistent` for more information.
+    R : ndarray, optional
+        The inconsistency matrix. It will be computed if necessary
+        if it is not passed.
 
-        - X : ndarray
-          ``n`` by ``m`` data matrix with ``n`` observations in ``m``
-          dimensions.
+    Returns
+    -------
+    T : ndarray
+        A vector of length n. T[i] is the flat cluster number to
+        which original observation i belongs.
 
-        - t : double
-          The threshold to apply when forming flat clusters.
-
-        - criterion : string
-          Specifies the criterion for forming flat clusters.  Valid
-          values are 'inconsistent', 'distance', or 'maxclust' cluster
-          formation algorithms. See ``fcluster`` for descriptions.
-
-        - method : string
-          The linkage method to use (single, complete, average,
-          weighted, median centroid, ward). See ``linkage`` for more
-          information.
-
-        - metric : string
-          The distance metric for calculating pairwise distances. See
-          distance.pdist for descriptions and linkage to verify
-          compatibility with the linkage method.
-
-        - t : double
-          The cut-off threshold for the cluster function or the
-          maximum number of clusters (criterion='maxclust').
-
-        - depth : int
-          The maximum depth for the inconsistency calculation. See
-          ``inconsistent`` for more information.
-
-        - R : ndarray
-          The inconsistency matrix. It will be computed if necessary
-          if it is not passed.
-
-
-    :Returns:
-
-        - T : ndarray
-          A vector of length ``n``. ``T[i]`` is the flat cluster number to
-          which original observation ``i`` belongs.
-
     Notes
     -----
+    This function is similar to the MATLAB function clusterdata.
 
-    This function is similar to MATLAB(TM) clusterdata function.
-
     """
     X = np.asarray(X, order='c', dtype=np.double)
 

Modified: trunk/scipy/cluster/vq.py
===================================================================
--- trunk/scipy/cluster/vq.py	2010-11-14 09:58:13 UTC (rev 6883)
+++ trunk/scipy/cluster/vq.py	2010-11-14 09:58:41 UTC (rev 6884)
@@ -1,71 +1,77 @@
-""" K-means Clustering and Vector Quantization Module
+"""
+K-means Clustering and Vector Quantization Module
 
-    Provides routines for k-means clustering, generating code books
-    from k-means models, and quantizing vectors by comparing them with
-    centroids in a code book.
+Provides routines for k-means clustering, generating code books
+from k-means models, and quantizing vectors by comparing them with
+centroids in a code book.
 
-    The k-means algorithm takes as input the number of clusters to
-    generate, k, and a set of observation vectors to cluster.  It
-    returns a set of centroids, one for each of the k clusters.  An
-    observation vector is classified with the cluster number or
-    centroid index of the centroid closest to it.
+The k-means algorithm takes as input the number of clusters to
+generate, k, and a set of observation vectors to cluster.  It
+returns a set of centroids, one for each of the k clusters.  An
+observation vector is classified with the cluster number or
+centroid index of the centroid closest to it.
 
-    A vector v belongs to cluster i if it is closer to centroid i than
-    any other centroids. If v belongs to i, we say centroid i is the
-    dominating centroid of v. Common variants of k-means try to
-    minimize distortion, which is defined as the sum of the distances
-    between each observation vector and its dominating centroid.  Each
-    step of the k-means algorithm refines the choices of centroids to
-    reduce distortion. The change in distortion is often used as a
-    stopping criterion: when the change is lower than a threshold, the
-    k-means algorithm is not making sufficient progress and
-    terminates.
+A vector v belongs to cluster i if it is closer to centroid i than
+any other centroids. If v belongs to i, we say centroid i is the
+dominating centroid of v. The k-means algorithm tries to
+minimize distortion, which is defined as the sum of the squared distances
+between each observation vector and its dominating centroid.  Each
+step of the k-means algorithm refines the choices of centroids to
+reduce distortion. The change in distortion is used as a
+stopping criterion: when the change is lower than a threshold, the
+k-means algorithm is not making sufficient progress and
+terminates. One can also define a maximum number of iterations.
 
-    Since vector quantization is a natural application for k-means,
-    information theory terminology is often used.  The centroid index
-    or cluster index is also referred to as a "code" and the table
-    mapping codes to centroids and vice versa is often referred as a
-    "code book". The result of k-means, a set of centroids, can be
-    used to quantize vectors. Quantization aims to find an encoding of
-    vectors that reduces the expected distortion.
+Since vector quantization is a natural application for k-means,
+information theory terminology is often used.  The centroid index
+or cluster index is also referred to as a "code" and the table
+mapping codes to centroids and vice versa is often referred as a
+"code book". The result of k-means, a set of centroids, can be
+used to quantize vectors. Quantization aims to find an encoding of
+vectors that reduces the expected distortion.
 
-    For example, suppose we wish to compress a 24-bit color image
-    (each pixel is represented by one byte for red, one for blue, and
-    one for green) before sending it over the web.  By using a smaller
-    8-bit encoding, we can reduce the amount of data by two
-    thirds. Ideally, the colors for each of the 256 possible 8-bit
-    encoding values should be chosen to minimize distortion of the
-    color. Running k-means with k=256 generates a code book of 256
-    codes, which fills up all possible 8-bit sequences.  Instead of
-    sending a 3-byte value for each pixel, the 8-bit centroid index
-    (or code word) of the dominating centroid is transmitted. The code
-    book is also sent over the wire so each 8-bit code can be
-    translated back to a 24-bit pixel value representation. If the
-    image of interest was of an ocean, we would expect many 24-bit
-    blues to be represented by 8-bit codes. If it was an image of a
-    human face, more flesh tone colors would be represented in the
-    code book.
+All routines expect obs to be a M by N array where the rows are
+the observation vectors. The codebook is a k by N array where the
+i'th row is the centroid of code word i. The observation vectors
+and centroids have the same feature dimension.
 
-    All routines expect obs to be a M by N array where the rows are
-    the observation vectors. The codebook is a k by N array where the
-    i'th row is the centroid of code word i. The observation vectors
-    and centroids have the same feature dimension.
+As an example, suppose we wish to compress a 24-bit color image
+(each pixel is represented by one byte for red, one for blue, and
+one for green) before sending it over the web.  By using a smaller
+8-bit encoding, we can reduce the amount of data by two
+thirds. Ideally, the colors for each of the 256 possible 8-bit
+encoding values should be chosen to minimize distortion of the
+color. Running k-means with k=256 generates a code book of 256
+codes, which fills up all possible 8-bit sequences.  Instead of
+sending a 3-byte value for each pixel, the 8-bit centroid index
+(or code word) of the dominating centroid is transmitted. The code
+book is also sent over the wire so each 8-bit code can be
+translated back to a 24-bit pixel value representation. If the
+image of interest was of an ocean, we would expect many 24-bit
+blues to be represented by 8-bit codes. If it was an image of a
+human face, more flesh tone colors would be represented in the
+code book.
 
-    whiten(obs) --
-        Normalize a group of observations so each feature has unit
-        variance.
-    vq(obs,code_book) --
-        Calculate code book membership of a set of observation
-        vectors.
-    kmeans(obs,k_or_guess,iter=20,thresh=1e-5) --
-        Clusters a set of observation vectors. Learns centroids with
-        the k-means algorithm, trying to minimize distortion.  A code
-        book is generated that can be used to quantize vectors.
-    kmeans2 --
-        A different implementation of k-means with more methods for
-        initializing centroids.  Uses maximum number of iterations as
-        opposed to a distortion threshold as its stopping criterion.
+Functions
+---------
+`whiten` :
+    Normalize a group of observations so each feature has unit
+    variance.
 
+`vq` :
+    Calculate code book membership of a set of observation
+    vectors.
+
+`kmeans` :
+    Clusters a set of observation vectors. Learns centroids with
+    the k-means algorithm, trying to minimize distortion.  A code
+    book is generated that can be used to quantize vectors.
+
+`kmeans2` :
+    A different implementation of k-means with more methods for
+    initializing centroids.  Uses maximum number of iterations as
+    opposed to a distortion threshold as its stopping criterion.
+
 """
 __docformat__ = 'restructuredtext'
 
@@ -393,62 +399,68 @@
     return code_book, avg_dist[-1]
 
 def kmeans(obs, k_or_guess, iter=20, thresh=1e-5):
-    """Performs k-means on a set of observation vectors forming k
-       clusters. This yields a code book mapping centroids to codes
-       and vice versa. The k-means algorithm adjusts the centroids
-       until sufficient progress cannot be made, i.e. the change in
-       distortion since the last iteration is less than some
-       threshold.
+    """
+    Performs k-means on a set of observation vectors forming k clusters.
 
-    :Parameters:
-        obs : ndarray
-            Each row of the M by N array is an observation vector. The
-            columns are the features seen during each observation.
-            The features must be whitened first with the whiten
-            function.
+    The k-means algorithm adjusts the centroids until sufficient
+    progress cannot be made, i.e. the change in distortion since
+    the last iteration is less than some threshold. This yields
+    a code book mapping centroids to codes and vice versa.
 
-        k_or_guess : int or ndarray
-            The number of centroids to generate. A code is assigned to
-            each centroid, which is also the row index of the centroid
-            in the code_book matrix generated.
+    Distortion is defined as the sum of the squared differences
+    between the observations and the corresponding centroid.
 
-            The initial k centroids are chosen by randomly selecting
-            observations from the observation matrix. Alternatively,
-            passing a k by N array specifies the initial k centroids.
+    Parameters
+    ----------
+    obs : ndarray
+       Each row of the M by N array is an observation vector. The
+       columns are the features seen during each observation.
+       The features must be whitened first with the `whiten` function.
 
-        iter : int
-            The number of times to run k-means, returning the codebook
-            with the lowest distortion. This argument is ignored if
-            initial centroids are specified with an array for the
-            k_or_guess paramter. This parameter does not represent the
-            number of iterations of the k-means algorithm.
+    k_or_guess : int or ndarray
+       The number of centroids to generate. A code is assigned to
+       each centroid, which is also the row index of the centroid
+       in the code_book matrix generated.
 
-        thresh : float
-            Terminates the k-means algorithm if the change in
-            distortion since the last k-means iteration is less than
-            thresh.
+       The initial k centroids are chosen by randomly selecting
+       observations from the observation matrix. Alternatively,
+       passing a k by N array specifies the initial k centroids.
 
-    :Returns:
-        codebook : ndarray
-            A k by N array of k centroids. The i'th centroid
-            codebook[i] is represented with the code i. The centroids
-            and codes generated represent the lowest distortion seen,
-            not necessarily the globally minimal distortion.
+    iter : int, optional
+       The number of times to run k-means, returning the codebook
+       with the lowest distortion. This argument is ignored if
+       initial centroids are specified with an array for the
+       ``k_or_guess`` parameter. This parameter does not represent the
+       number of iterations of the k-means algorithm.
 
-        distortion : float
-           The distortion between the observations passed and the
-           centroids generated.
+    thresh : float, optional
+       Terminates the k-means algorithm if the change in
+       distortion since the last k-means iteration is less than
+       or equal to thresh.
 
-    :SeeAlso:
-        - kmeans2: a different implementation of k-means clustering
-          with more methods for generating initial centroids but without
-          using a distortion change threshold as a stopping criterion.
-        - whiten: must be called prior to passing an observation matrix
-          to kmeans.
+    Returns
+    -------
+    codebook : ndarray
+       A k by N array of k centroids. The i'th centroid
+       codebook[i] is represented with the code i. The centroids
+       and codes generated represent the lowest distortion seen,
+       not necessarily the globally minimal distortion.
 
+    distortion : float
+       The distortion between the observations passed and the
+       centroids generated.
+
+    See Also
+    --------
+    kmeans2 : a different implementation of k-means clustering
+       with more methods for generating initial centroids but without
+       using a distortion change threshold as a stopping criterion.
+
+    whiten : must be called prior to passing an observation matrix
+       to kmeans.
+
     Examples
     --------
-
     >>> from numpy import array
     >>> from scipy.cluster.vq import vq, kmeans, whiten
     >>> features  = array([[ 1.9,2.3],