[issue5562] Locale-based date formatting crashes on non-ASCII data

"Martin v. Löwis" <report@bugs.python.org> at psf.upfronthosting.co.za
Thu Mar 26 01:43:23 CET 2009


Martin v. Löwis <martin at v.loewis.de> added the comment:

I think the problem is that creation of the Unicode string defaults to 
UTF-8. It should instead use the locale's encoding.
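
A minimal sketch of the failure mode, in Python rather than in the C code 
under discussion. It assumes the C library returns the month name as bytes 
in an ISO-8859-1 locale; the byte string below is simulated, not real 
strftime output, and decode_locale_bytes is a hypothetical helper, not 
anything in CPython.

```python
import locale


def decode_locale_bytes(raw, encoding=None):
    """Decode bytes produced by a locale-aware C call.

    Hypothetical helper for illustration only; not part of CPython.
    """
    if encoding is None:
        # Ask which encoding the current locale uses instead of
        # assuming UTF-8.
        encoding = locale.getpreferredencoding(False)
    return raw.decode(encoding)


# Simulated month name for March ("M\u00e4rz") as an ISO-8859-1 locale
# would return it: a single 0xE4 byte for the a-umlaut.
raw_month = "M\u00e4rz".encode("iso-8859-1")

try:
    # What a hard-coded UTF-8 default would attempt.
    raw_month.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# Decoding with the (assumed) locale encoding recovers the text.
month = decode_locale_bytes(raw_month, "iso-8859-1")
```

The lone 0xE4 byte is not valid UTF-8, so the first decode raises, while 
decoding with the locale's encoding succeeds. This only works when Python 
has a codec for that encoding.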

You are right that it could be an issue that there is no Python codec 
for the locale's encoding. To be robust against this case, I think the 
locale's mbcs->wcs routines should be used (i.e. mbstowcs). Better yet, 
use wcsftime in the first place. AFAICT, wcsftime is C99, so some 
systems might not support it. However, it appears that MSVC has it, so we 
could assume it exists and wait until someone complains. One issue 
apparently is that some implementations of wcsftime expect the format as 
char* (and again, I would defer dealing with that until somebody 
complains).

In either case, you end up with a wchar_t string. In principle, the 
locale might use a non-Unicode wide charset for wchar_t, but these fell 
out of use some time ago, and Python has always assumed that wchar_t is 
Unicode.

----------
nosy: +loewis

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue5562>
_______________________________________

