[SciPy-dev] chararray docstrings

David Goldsmith d.l.goldsmith at gmail.com
Mon Oct 12 13:11:18 EDT 2009


It all sounds good to me, except that if you're going to say "if you need
arrays of variable-length strings, use such and such," the previous sentence
should say explicitly that its recommendation is for fixed-length strings, IMO.

DG

On Mon, Oct 12, 2009 at 8:40 AM, Michael Droettboom <mdroe at stsci.edu> wrote:

> I was able to make my big chararray commit today.  If I understand
> correctly, I need to wait 24 hours for the doc editor to sync with SVN,
> and then I should mark all the chararray-related docstrings as "needs
> review".
>
> The primary change to the docstrings is that all of the methods of the
> chararray class are now free functions.  These free functions represent
> the "primary" entry points, and thus have detailed documentation, and
> the chararray methods now have short "pointer" docstrings to the free
> functions.
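
To make the method/free-function split concrete, here is a minimal sketch. It assumes the free functions are reachable as np.char.* and the class as np.char.chararray, which is how current NumPy spells them; the array contents are made up for illustration.

```python
import numpy as np

# A plain fixed-width byte-string array (dtype 'S5'), not a chararray.
words = np.array([b"alpha", b"beta", b"gamma"])

# Free function: works directly on an ordinary string array.
upper_free = np.char.upper(words)

# Method: only available on a chararray view of the same data.
upper_method = words.view(np.char.chararray).upper()

print(upper_free)
print(np.array_equal(upper_free, np.asarray(upper_method)))
```

Both calls should give the same values; the free function simply does not require the chararray class.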
>
> Where the docstring content itself has been updated, it is mainly to
> bring it closer to the Python standard library descriptions of these
> functions, which in most cases were more precise (since we are, in fact,
> calling the stdlib function under the hood) and more concise (because the
> stdlib docs have been through a number of revisions and really get it
> right by now).
>
> I do have a concern about one phrase that was used in a number of places
> that probably deserves some discussion:
>
> "The chararray module exists for backwards compatibility with Numarray,
> it is not recommended for new development. If one needs arrays of
> strings, use arrays of dtype
> <http://docs.scipy.org/numpy/docs/numpy.dtype/#dtype> object."
>
> There are many use cases (such as handling a binary structured format
> like FITS) where a dtype of 'string_' is more appropriate than a dtype
> of 'object_', and we shouldn't imply that all uses of chararray should
> now use object arrays.  Additionally, fast vectorized string operations
> will perform best on arrays of type 'string_' and 'unicode_'.  Although
> 'object_' will work, it requires casting all objects to strings along
> the way, and could fail thousands of items into an operation.  It's a
> "best tool for the job" judgment call, not "one tool fits all".
> Perhaps the above should read:
>
> "If one needs arrays of strings, use arrays of dtype
> <http://docs.scipy.org/numpy/docs/numpy.dtype/#dtype> string_ or
> unicode_.  If one needs arrays of variable-length strings, use arrays of
> dtype object_."
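
The fixed-length versus variable-length distinction can be sketched as follows. This is an illustration only: the values are invented, the 'S3' dtype string is used instead of the names 'string_' and 'unicode_' (spelled bytes_ and str_ in current NumPy), and the list comprehension at the end is a plain-Python stand-in for a string operation over an object array, not what NumPy does internally.

```python
import numpy as np

# Fixed-width strings: every element occupies the same number of bytes,
# which suits binary record layouts. Longer values are silently truncated.
fixed = np.array(["ab", "cde"], dtype="S3")
fixed[0] = b"toolong"            # only the first 3 bytes are kept

# Object arrays hold references to Python objects, so lengths can vary.
variable = np.array(["ab", "cde"], dtype=object)
variable[0] = "toolong"          # stored in full

# Vectorized string functions operate directly on the fixed-width array.
print(np.char.upper(fixed))

# An object array may contain non-strings, so a string operation can
# fail partway through the data.
mixed = np.array(["ab", "cde", 42], dtype=object)
try:
    [s.upper() for s in mixed]
except AttributeError as exc:
    print("failed:", exc)
```

The truncation on assignment is the reason to state the fixed-length caveat explicitly next to the recommendation.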
>
> Mike
>
> --
> Michael Droettboom
> Science Software Branch
> Operations and Engineering Division
> Space Telescope Science Institute
> Operated by AURA for NASA