[ python-Bugs-1026480 ] iso-latin-1 strings and functions lower & upper

SourceForge.net noreply at sourceforge.net
Sat Sep 11 23:28:34 CEST 2004


Bugs item #1026480, was opened at 2004-09-11 18:28
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1026480&group_id=5470

Category: None
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Tomasz Kowaltowski (kowaltowski)
Assigned to: Nobody/Anonymous (nobody)
Summary: iso-latin-1 strings and functions lower & upper

Initial Comment:
I have no problems using strings that contain accented
letters in Python (my Emacs produces them without trouble
in the one-byte iso-8859-1 encoding). However, the
functions 'lower' and 'upper' do not work properly on
these letters, as shown below (I hope all accents appear
properly in your browser):

-------------------------------------------------------------
# except for the first letter, all the others have accents
as = "aáàâãä"
AS = "AÁÀÂÃÄ"
print "direct: %s -- %s" % (as, AS)
print "lower:  %s -- %s" % (as.lower(), AS.lower())
print "upper:  %s -- %s" % (as.upper(), AS.upper())
-------------------------------------------------------------

The output is:
--------------------------------------------------------------
direct: aáàâãä -- AÁÀÂÃÄ
lower:  aáàâãä -- aÁÀÂÃÄ
upper:  Aáàâãä -- AÁÀÂÃÄ
--------------------------------------------------------------

i.e., accented letters (codes above 127) are not converted.
Adding the encoding declaration recommended by PEP 0263,

# -*- coding: iso-latin-1 -*-

made no difference.

I am not sure whether this is a bug or intentional,
i.e. whether these functions are meant to work only for
pure ASCII letters. However, it is a major inconvenience
for those who use any language other than English that
uses the Latin alphabet :-(.

There should be some mechanism to tell these functions
which Latin variant (iso-8859-1, iso-8859-2, ...) is
being used, so that they behave properly; e.g., an
optional second argument?
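
To illustrate the kind of mechanism meant here, the following is only a
sketch and not anything the string methods currently offer. It is written
in Python 3 syntax, where a bytes object plays the role of the one-byte
strings in this report. The helper names lower_latin and upper_latin and
their encoding parameter are hypothetical; they work by decoding to
Unicode, changing case there, and encoding back. Letters whose other case
does not exist in the chosen encoding would make the final encode fail.

```python
# Hypothetical helpers: case conversion for one-byte encoded strings,
# with the Latin variant passed explicitly (an assumption, not an
# existing API).

def lower_latin(data, encoding="iso-8859-1"):
    """Lower-case an encoded byte string by going through Unicode."""
    return data.decode(encoding).lower().encode(encoding)


def upper_latin(data, encoding="iso-8859-1"):
    """Upper-case an encoded byte string by going through Unicode."""
    return data.decode(encoding).upper().encode(encoding)


lower_accented = "aáàâãä".encode("iso-8859-1")
upper_accented = "AÁÀÂÃÄ".encode("iso-8859-1")

# Byte-wise lower() converts only the ASCII 'A', as in the report.
print(upper_accented.lower() == lower_accented)        # False

# With the encoding known, the accented letters are converted too.
print(lower_latin(upper_accented) == lower_accented)   # True
print(upper_latin(lower_accented) == upper_accented)   # True

# The variant matters: byte 0xA1 is an upper-case letter in iso-8859-2
# but an inverted exclamation mark in iso-8859-1.
print(lower_latin(b"\xa1", "iso-8859-2") == b"\xb1")   # True
print(lower_latin(b"\xa1", "iso-8859-1") == b"\xa1")   # True
```

The last two lines show why the encoding has to be supplied from outside:
the same byte is a letter in one Latin variant and punctuation in another.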

----------------------------------------------------------------------
