[Python-Dev] [Python-checkins] r87505 - in python/branches/py3k: Doc/c-api/unicode.rst Include/unicodeobject.h

Tue Dec 28 18:14:52 CET 2010

On Tue, 28 Dec 2010 10:28:51 +0100, Victor Stinner <victor.stinner at haypocalc.com> wrote:
> On Monday 27 December 2010 at 23:13 -0500, R. David Murray wrote:
> > > Modified: python/branches/py3k/Doc/c-api/unicode.rst
> > > ==============================================================================
> > > --- python/branches/py3k/Doc/c-api/unicode.rst      (original)
> > > +++ python/branches/py3k/Doc/c-api/unicode.rst      Mon Dec 27 02:49:29 2010
> > > @@ -1063,7 +1063,8 @@
> > >  .. c:function:: int PyUnicode_CompareWithASCIIString(PyObject *uni, char *string)
> > >
> > >     Compare a unicode object, *uni*, with *string* and return -1, 0, 1 for less
> > > -   than, equal, and greater than, respectively.
> > > +   than, equal, and greater than, respectively. *string* is an ASCII-encoded
> > > +   string (it is interpreted as ISO-8859-1).
> >
> > Does it mean anything to say that an ASCII string is interpreted as
> > ISO-8859-1?  If it is ASCII-encoded it shouldn't have any bytes with
> > the 8th bit set, leaving no room for interpretation.  So presumably
> > you mean it is (treated as) an ISO-8859-1 encoded string, despite the
> > function name?
> 
> Oh. Someone noticed :-) I would like to say that it is better to pass
> only ASCII-encoded strings, but the function supports ISO-8859-1.
> 
> Would it be clearer to say that the function expects an ISO-8859-1
> encoded string?
> 
> But I don't want to patch the function.

I think your first paragraph is what you should put in the docs: "it is
best to pass only ASCII-encoded strings, but the function interprets
the input string as ISO-8859-1 if it contains non-ASCII characters".

A bit harder to compress that into an in-line comment in the code...
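
For what it's worth, here is a minimal sketch of the behavior being
described. It is not the CPython source: compare_with_latin1 is a
hypothetical stand-in that takes an array of code points instead of a
PyObject, and it assumes the comparison is done code point by code point.
The point is only that each byte of *string* is widened through unsigned
char, so a byte >= 0x80 is read as the ISO-8859-1 character with that value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the comparison discussed above; NOT the
 * CPython implementation.  `uni` holds `len` code points, `string` is a
 * NUL-terminated byte string.  Returns -1, 0 or 1. */
int compare_with_latin1(const uint32_t *uni, size_t len, const char *string)
{
    size_t i;

    assert(uni != NULL || len == 0);
    assert(string != NULL);

    for (i = 0; i < len && string[i] != '\0'; i++) {
        /* Widen through unsigned char: a byte >= 0x80 becomes the code
         * point U+0080..U+00FF (its ISO-8859-1 meaning) instead of a
         * negative value on platforms where plain char is signed. */
        uint32_t c = (unsigned char)string[i];

        if (uni[i] != c)
            return (uni[i] < c) ? -1 : 1;
    }
    if (i < len)
        return 1;   /* uni has characters left over, so it sorts after */
    if (string[i] != '\0')
        return -1;  /* string has bytes left over, so uni sorts before */
    return 0;
}
```

With pure-ASCII input the cast changes nothing, which is why passing only
ASCII is the safe recommendation; the ISO-8859-1 reading only matters once
a byte has its 8th bit set.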

--
R. David Murray                                      www.bitdance.com