[Python-Dev] Mini-Pep: An Empty String ABC

Guido van Rossum guido at python.org
Mon Jun 2 01:06:10 CEST 2008


This PEP is incomplete without specifying exactly which built-in and
stdlib types should be registered as String instances.

I'm also confused -- the motivation seems mostly "so that you can skip
iterating over it when flattening a nested sequence" but earlier you
rejected my "Atomic" proposal, saying "Earlier in the thread it was
made clear that that atomicity is not an intrinsic property of a type;
instead it varies across applications [...]". Isn't this String
proposal just that by another name?
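
To make the flattening use case concrete, here is a minimal sketch, not anything from the proposal itself. The flatten() function and its "atomic" parameter are hypothetical names, and registering str as a String is an assumption, since the proposal does not list which types get registered.

```python
from collections.abc import Sequence


class String(Sequence):
    """Hypothetical empty ABC, spelled as in the quoted proposal."""


# Assumption: str would be registered; the proposal does not say so.
String.register(str)


def flatten(items, atomic=(String,)):
    """Yield the leaves of a nested sequence, not descending into `atomic` types."""
    for item in items:
        if isinstance(item, Sequence) and not isinstance(item, atomic):
            yield from flatten(item, atomic)
        else:
            yield item


nested = ["ab", ["cd", [1, 2]], (3, "e")]
print(list(flatten(nested)))
# A different application might also want tuples treated as atoms.
print(list(flatten(nested, atomic=(String, tuple))))
```

The second call is the point of the question above: which types count as "containees" is chosen by the caller, which is exactly what an "Atomic" marker would express.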

Finally, I fully expect lots of code to call isinstance(x, String) and
then make many more assumptions than the String ABC promises. For
example, that s[0] has the same type as s (not true for bytes). Or that
it is hashable (the Sequence class doesn't define __hash__). Or that
s1+s2 will work (not in the Sequence class either). And many more.
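
A rough illustration of those failed assumptions, written for Python 3 as a sketch only. CharList and the fails() helper are made-up names, and registering str and bytes is an assumption (the proposal leaves the registered types unspecified).

```python
from collections.abc import Sequence


class String(Sequence):
    """Hypothetical empty ABC, spelled as in the quoted proposal."""


# Assumption: the proposal does not say which types get registered.
String.register(str)
String.register(bytes)


class CharList(String):
    """Hypothetical look-alike implementing only what Sequence requires."""

    def __init__(self, chars):
        self._chars = list(chars)

    def __getitem__(self, index):
        return self._chars[index]

    def __len__(self):
        return len(self._chars)

    def __eq__(self, other):
        # Defining __eq__ without __hash__ makes instances unhashable.
        return isinstance(other, CharList) and self._chars == other._chars


def fails(func, *args):
    """Return True if func(*args) raises TypeError."""
    try:
        func(*args)
    except TypeError:
        return True
    return False


s = CharList("ab")
checks = {
    "is a String": isinstance(s, String),
    "bytes item is an int": isinstance(b"ab"[0], int),
    "item has a different type": type(s[0]) is not type(s),
    "not hashable": fails(hash, s),
    "no concatenation": fails(lambda a, b: a + b, s, s),
}
for name, result in checks.items():
    print(name, result)
```

Every check holds for this class, yet it passes isinstance(x, String), so client code relying on any of those behaviors would break on it.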

All this makes me lean towards a rejection of this proposal -- it
seems worse than no proposal at all. It could perhaps be rescued by
adding some small set of defined operations.

--Guido

On Sat, May 31, 2008 at 11:59 PM, Raymond Hettinger <python at rcn.com> wrote:
> Mini-Pep:  An Empty String ABC
> Target:  Py2.6 and Py3.0
> Author:  Raymond Hettinger
>
> Proposal
> --------
>
> Add a new collections ABC specified as:
>
>   class String(Sequence):
>       pass
>
> Motivation
> ----------
> Having an ABC for strings allows string look-alike classes to declare
> themselves as sequences that contain text.  Client code (such as a flatten
> operation or tree searching tool) may use that ABC to usefully differentiate
> strings from other sequences (i.e. containers vs containees).  And in code
> that only relies on sequence behavior, isinstance(x,str) may be usefully
> replaced by isinstance(x,String) so that look-alikes can be substituted in
> calling code.
>
> A natural temptation is to add other methods to the String ABC, but strings
> are a tough case.  Beyond simple sequence manipulation, the string methods
> get very complex.  An ABC that included those methods would make it tough to
> write a compliant class that could be registered as a String.  The split(),
> rsplit(), partition(), and rpartition() methods are examples of methods that
> would be difficult to emulate correctly.  Also, starting with Py3.0, strings
> are essentially abstract sequences of code points, meaning that an encode()
> method is essential to being able to usefully transform them back into
> concrete data.  Unfortunately, the encode method is so complex that it cannot
> be readily emulated by an aspiring string look-alike.
>
> Besides complexity, another problem with the concrete str API is the
> extensive number of methods.  If string look-alikes were required to emulate
> the likes of zfill(), ljust(), title(), translate(), join(), etc., it would
> significantly add to the burden of writing a class complying with the String
> ABC.
>
> The fundamental problem is that of balancing a client function's desire to
> rely on a broad number of behaviors against the difficulty of writing a
> compliant look-alike class.  For other ABCs, the balance is more easily
> struck because the behaviors are fewer in number, because they are easier to
> implement correctly, and because some methods can be provided as mixins.  For
> a String ABC, the balance should lean toward minimalism due to the large
> number of methods and how difficult it is to implement some of them
> correctly.
>
> A last reason to avoid expanding the String API is that almost none of the
> candidate methods characterize the notion of "stringiness".  With something
> calling itself an integer, an __add__() method would be expected as it is
> fundamental to the notion of "integeriness".  In contrast, methods like
> startswith() and title() are non-essential extras -- we would not discount
> something as being not stringlike if those methods were not present.



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

