[SciPy-dev] Some Q's vis-a-vis Numpy unicode support

josef.pktd at gmail.com
Tue Aug 11 20:41:33 EDT 2009


On Tue, Aug 11, 2009 at 7:49 PM, David Goldsmith<d_l_goldsmith at yahoo.com> wrote:
> OK, may have answered Q1 myself: unless I'm misunderstanding what I'm seeing, what I'm finding is that capitalize() does nothing at all if the chararray is of dtype unicode - correct?  Thanks,


>>> b
chararray(u'\xe9',
      dtype='<U1')
>>> b.capitalize()
chararray(u'\xc9',
      dtype='<U1')

see http://stackoverflow.com/questions/1006450/capitalizing-non-ascii-words-in-python
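
On the Greek and Russian part of Q1, here is a minimal sketch. It assumes Python 3 (where every str is unicode, unlike the Python 2 sessions in this thread) and uses plain built-in strings rather than a chararray, on the assumption that chararray.capitalize() defers to the string method element-wise. The sample letters are my own choice for illustration.

```python
# Python 3 sketch: capitalize() on one lower-case letter from three scripts.
samples = {
    "Latin-1": "\u00e9",   # e with acute accent
    "Greek": "\u03b1",     # alpha
    "Cyrillic": "\u044f",  # ya
}

# Apply the built-in string method to each sample letter.
capitalized = {name: ch.capitalize() for name, ch in samples.items()}

for name, ch in samples.items():
    # ascii() shows code points as escapes, so the output stays readable
    # even on a terminal that cannot render the glyphs themselves.
    print(name, ascii(ch), "->", ascii(capitalized[name]))
```

If the letters have distinct upper-case forms in the Unicode database, the escapes on the right differ from those on the left.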



>
> DG
>
> --- On Tue, 8/11/09, David Goldsmith <d_l_goldsmith at yahoo.com> wrote:
>
>> From: David Goldsmith <d_l_goldsmith at yahoo.com>
>> Subject: Some Q's vis-a-vis Numpy unicode support
>> To: scipy-dev at scipy.org
>> Date: Tuesday, August 11, 2009, 4:02 PM
>> First, a "reality check" question:
>>
>> 0) Is Windows (DOS) Terminal capable of rendering unicode?

Not by default (in US English at least), but the code page number
can be changed, which I have never tried:

>help graftabl
Enable Windows to display an extended character set in graphics mode.

GRAFTABL [xxx]
GRAFTABL /STATUS

   xxx      Specifies a code page number.
   /STATUS  Displays the current code page selected for use with GRAFTABL.
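
Why the code page matters can be sketched from Python itself. This is only an illustration, assuming Python 3 and using cp437 and cp850 as stand-ins for typical console code pages (neither is named above); cp1252 is the one I think Idle uses.

```python
ch = "\u00e9"  # e with acute accent, the character used in this thread

# Encode the same character under several code pages (assumed examples).
codecs_to_try = ("cp437", "cp850", "cp1252", "utf-8")
encoded = {codec: ch.encode(codec) for codec in codecs_to_try}

for codec in codecs_to_try:
    print(codec, encoded[codec])

# The reverse problem: a byte written under one code page and read
# under another decodes to a different character.
misread = b"\xe9".decode("cp437")
print(ascii(misread))
```

The same character maps to different bytes depending on the code page, so the terminal and the program have to agree on one.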



From a Python session in the Windows command shell (the characters
print correctly there, in case the mail doesn't render them):
>>> print u'\xe9'
é
>>> print u'\xe9'.capitalize()
É
>>> u'\xe9'.capitalize()
u'\xc9'
>>>


But I cannot print a numpy unicode array without getting an error:
>>> import numpy as np
>>> c = np.array(u'\xe9', '<U1')
>>> print c
....
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in
position 0: ordinal not in range(128)

(This is in Idle, with cp1252 I think.)

These are the usual encode/decode problems with unicode, which take
several hours of trial and error and reading docs to figure out.
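
The error above can be reproduced without numpy. A minimal sketch, assuming Python 3 and plain strings; the error handlers shown are standard codec options, not something taken from this thread:

```python
text = "\u00e9"  # same character as in the failing print above

# The 'ascii' codec has no mapping for this character, so encoding fails.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError as exc:
    ascii_ok = False
    print("ascii failed:", exc)

# Naming a codec that covers the character works.
as_cp1252 = text.encode("cp1252")

# Error handlers trade fidelity for not raising.
replaced = text.encode("ascii", errors="replace")          # substitutes '?'
escaped = text.encode("ascii", errors="backslashreplace")  # keeps an escape

print(as_cp1252, replaced, escaped)
```

Encoding explicitly with a codec the terminal understands, or falling back to an error handler, avoids relying on the default 'ascii' codec.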

Josef

>>
>> Unless the answer is "No," my real question:
>>
>> 1) Does chararray.capitalize() capitalize non-Roman letters
>> that have different lower-case and upper-case forms (e.g.,
>> the Greek letters)?  If "yes," are there any exceptions
>> (e.g., Russian letters)?
>>
>> Thanks!
>>
>> DG
>>
>>
>>
>>
>
>
>
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>


