[ python-Bugs-947906 ] calendar.weekheader(n): n should mean chars not bytes
SourceForge.net
noreply at sourceforge.net
Thu Jun 3 00:43:11 EDT 2004
Bugs item #947906, was opened at 2004-05-04 20:38
Message generated for change (Comment added) made by loewis
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=947906&group_id=5470
Category: Python Library
Group: Python 2.3
Status: Open
Resolution: Accepted
Priority: 7
Submitted By: Leonardo Rochael Almeida (rochael)
>Assigned to: Nobody/Anonymous (nobody)
Summary: calendar.weekheader(n): n should mean chars not bytes
Initial Comment:
calendar.weekheader(n) is locale aware, which is good
in principle. The parameter n, however, is interpreted
as meaning bytes, not chars, which can generate broken
strings for, e.g. localized weekday names:
>>> calendar.weekheader(2)
'Mo Tu We Th Fr Sa Su'
>>> locale.setlocale(locale.LC_ALL, "pt_BR.UTF-8")
'pt_BR.UTF-8'
>>> calendar.weekheader(2)
'Se Te Qu Qu Se S\xc3 Do'
Notice how "Sábado" (Saturday) above is missing the
second utf-8 byte for the encoding of "á":
>>> u"Sá".encode("utf-8")
'S\xc3\xa1'
The implementation of weekheader (and of all of
calendar.py, it seems) is based on localized 8 bit
strings. I suppose the correct fix for this bug will
involve a roundtrip thru unicode.
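The round trip suggested above can be sketched as follows. This is only an illustration, not a patch for calendar.py: it is written for Python 3, where bytes plays the role of the 8-bit str in the report. The helper names truncate_bytes and truncate_chars are hypothetical, and UTF-8 is assumed as the locale encoding.

```python
# Python 3 sketch: bytes stands in for the 8-bit str of Python 2.
# truncate_bytes / truncate_chars are hypothetical helpers, not calendar APIs.

def truncate_bytes(name, n, encoding="utf-8"):
    """Reproduce the reported behaviour: cut the encoded name after n bytes."""
    return name.encode(encoding)[:n]


def truncate_chars(encoded, n, encoding="utf-8"):
    """Round-trip through Unicode so that n counts characters, not bytes."""
    return encoded.decode(encoding)[:n].encode(encoding)


saturday = "S\u00e1bado"  # "Sábado"

# Slicing the encoded bytes splits the two-byte sequence for "á".
broken = truncate_bytes(saturday, 2)

# Decoding first keeps the character whole.
fixed = truncate_chars(saturday.encode("utf-8"), 2)
```

Here broken holds the same truncated byte sequence as the 'S\xc3' in the session above, while fixed holds the complete encoding of "Sá".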
----------------------------------------------------------------------
>Comment By: Martin v. Löwis (loewis)
Date: 2004-06-03 06:43
Message:
Logged In: YES
user_id=21627
Adding a ucalendar module would be reasonable, IMO.
Introducing ustrftime is not necessary - we could just apply
the "unicode in/unicode out" procedure (i.e. if the format
is a Unicode string, return a Unicode result). The tricky
part of that is to convert the strftime result to Unicode.
We could try mbstowcs, but that would fail if the locale
doesn't use Unicode for wchar_t.
Once ucalendar is written, we could document that the
calendar module has known problems if the locale's encoding
is not Latin-1.
However, I'm not going to implement that any time soon, so
unassigning.
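A rough sketch of the "unicode in/unicode out" idea, where the result type follows the type of the format argument. It is written for Python 3, so the direction is inverted compared with the Python 2 discussion: time.strftime natively returns str (Unicode), and the bytes branch has to do the conversion. The wrapper name strftime_like and its encoding parameter are assumptions made for illustration; choosing the right encoding is exactly the tricky part described above.

```python
import time

# strftime_like is a hypothetical wrapper, not an existing library function.
def strftime_like(fmt, t, encoding="utf-8"):
    """Return bytes for a bytes format and str for a str format."""
    if isinstance(fmt, bytes):
        # Assumption: the caller knows which encoding the bytes are in.
        return time.strftime(fmt.decode(encoding), t).encode(encoding)
    return time.strftime(fmt, t)


epoch = time.gmtime(0)
as_text = strftime_like("%Y-%m-%d", epoch)    # str in, str out
as_bytes = strftime_like(b"%Y-%m-%d", epoch)  # bytes in, bytes out
```

The example uses a purely numeric format so that the result does not depend on the active locale.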
----------------------------------------------------------------------
Comment By: Walter Dörwald (doerwalter)
Date: 2004-06-02 21:08
Message:
Logged In: YES
user_id=89016
Maybe we should have a second version of calendar (named
ucalendar?) that works with unicode strings? Could those two
modules be rewritten to use as much common functionality as
possible? Or we could use a module global to configure
whether str or unicode should be returned?
Most of the localization functionality in calendar seems to
come from datetime.datetime.strftime(), so it probably would
help to have a method datetime.datetime.ustrftime() that
returns the formatted string as unicode (using the locale
encoding).
Assigning to MvL as the locale/unicode expert.
----------------------------------------------------------------------
Comment By: Hye-Shik Chang (perky)
Date: 2004-05-08 01:57
Message:
Logged In: YES
user_id=55188
I think the n in calendar.weekheader should mean neither chars
nor bytes, but width, because the function is currently used
for fixed-width representations of calendars.
Yes, they are the same for Western alphabets. But many CJK
characters are full width, so they need only 1 character for
calendar.weekheader(2); and that's conventional in real life, too.
But we don't have unicode.width() support to implement the
feature yet.
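For illustration, a width-based truncation could look roughly like the sketch below. It assumes Python 3 and relies on unicodedata.east_asian_width from the standard library, which was not available when this comment was written. The helper names display_width and truncate_to_width are hypothetical, and the sketch treats every character that is not wide or fullwidth as one column, so combining characters are not handled.

```python
import unicodedata

# Hypothetical helpers; a simplification that ignores combining characters.

def char_width(ch):
    """Wide ("W") and fullwidth ("F") characters occupy two columns."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def display_width(text):
    """Number of fixed-width columns the text occupies."""
    return sum(char_width(ch) for ch in text)


def truncate_to_width(text, width):
    """Keep as many leading characters as fit into `width` columns."""
    kept = []
    used = 0
    for ch in text:
        w = char_width(ch)
        if used + w > width:
            break
        kept.append(ch)
        used += w
    return "".join(kept)


western = truncate_to_width("Monday", 2)          # two narrow characters
cjk = truncate_to_width("\u6708\u66dc\u65e5", 2)  # one full-width character
```

With a width of 2, the Western name keeps two characters and the CJK name keeps one, which matches the convention described in the comment.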
----------------------------------------------------------------------