[Python-3000] Four new failing tests

Adam Olsen rhamph at gmail.com
Sat Aug 11 22:46:07 CEST 2007


On 8/11/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > ======================================================================
> > ERROR: test_char_write (__main__.TestArrayWrites)
> > ----------------------------------------------------------------------
> > Traceback (most recent call last):
> >   File "Lib/test/test_csv.py", line 648, in test_char_write
> >     a = array.array('u', string.letters)
> > ValueError: string length not a multiple of item size
>
> I think some decision should be made wrt. string.letters.
>
> Clearly, string.letters cannot reasonably contain *all* letters
> (i.e. all characters of categories Ll, Lu, Lt, Lo). Or can it?
>
> Traditionally, string.letters contained everything that is a letter
> in the current locale. Still, computing this string might be expensive
> assuming you have to go through all Unicode code points and determine
> whether they are letters in the current locale.
>
> So I see the following options:
> 1. remove it entirely. Keep string.ascii_letters instead
> 2. remove string.ascii_letters, and make string.letters
>    ASCII only.
> 3. Make string.letters contain all letters in the current locale.
> 4. Make string.letters truly contain everything that is classified
>    as a letter in the Unicode database.

Wasn't unicodedata.ascii_letters suggested at one point (to eliminate
the string module), or was that my imagination?

IMO, if there is a need for Unicode or locale letters, we should
provide a function that generates them on demand.  Its result can be
passed directly to set() or whatever data structure is actually
needed.  We shouldn't burden startup with the cost of building such a
large data structure unless absolutely necessary (nor should we use a
property to load it when first needed; an attribute that is expensive
to compute is best avoided).
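
A minimal sketch of what such an on-demand function might look like,
assuming "letter" means the general categories Ll, Lu, Lt and Lo
listed above.  The names iter_unicode_letters and LETTER_CATEGORIES
are hypothetical, not an existing or proposed API, and locale-aware
filtering is not attempted here:

```python
import sys
import unicodedata

# Assumption: "letter" means exactly these Unicode general categories.
LETTER_CATEGORIES = frozenset({"Lu", "Ll", "Lt", "Lo"})


def iter_unicode_letters(start=0, stop=sys.maxunicode + 1):
    """Lazily yield each character in [start, stop) classified as a letter.

    Hypothetical helper: nothing is computed at import time, and the
    caller decides which container (if any) to build from the result.
    """
    for code_point in range(start, stop):
        ch = chr(code_point)
        if unicodedata.category(ch) in LETTER_CATEGORIES:
            yield ch


# The caller pays only for the range it asks for, here the ASCII range.
ascii_letters = set(iter_unicode_letters(0, 128))
```

Because it is a generator, nothing is scanned until the caller
iterates, and a caller who only needs a small range never walks the
whole code space.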

-- 
Adam Olsen, aka Rhamphoryncus

