[ python-Bugs-1039270 ] time zone tests fail on Windows
SourceForge.net
noreply at sourceforge.net
Tue Oct 5 20:56:46 CEST 2004
Bugs item #1039270, was opened at 2004-10-03 12:44
Message generated for change (Comment added) made by quiver
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1039270&group_id=5470
Category: Python Library
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: George Yoshida (quiver)
Assigned to: Brett Cannon (bcannon)
Summary: time zone tests fail on Windows
Initial Comment:
The following tests fail on Win 2K (Japanese locale):
# test_strptime.py
test_compile (__main__.TimeRETests) ... FAIL
test_bad_timezone (__main__.StrptimeTests) ... ERROR
test_timezone (__main__.StrptimeTests) ... ERROR
test_day_of_week_calculation (__main__.CalculationTests) ... ERROR
test_gregorian_calculation (__main__.CalculationTests) ... ERROR
test_julian_calculation (__main__.CalculationTests) ... ERROR
# test_time.py
test_strptime (test.test_time.TimeTestCase) ... FAIL
===
They all stem from time zone tests and can be divided
into two groups:
The FAIL of test_compile is basically the same as bug #883604.
http://www.python.org/sf/883604
The local time zone names contain regular expression
metacharacters, but they are not escaped.
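A minimal sketch of the escaping problem, written for a current Python 3 rather than the 2.3/2.4 interpreters discussed here. The name `tz_name` is a hypothetical ASCII stand-in for the value Windows returns, and the code is not taken from `_strptime.py`:

```python
import re

# Hypothetical time zone name containing regex metacharacters, modeled on
# the 'Tokyo (standard time)' value reported in this bug.
tz_name = "Tokyo (Standard Time)"

# Unescaped: the parentheses are read as a capturing group, so the pattern
# describes "Tokyo Standard Time" instead of the literal name.
unescaped = re.compile(tz_name)

# Escaped: every metacharacter is matched literally.
escaped = re.compile(re.escape(tz_name))

unescaped_matches = unescaped.match(tz_name) is not None
escaped_matches = escaped.match(tz_name) is not None

print(unescaped_matches)  # False
print(escaped_matches)    # True
```

Any pattern built from locale-supplied strings has the same exposure, which is why escaping each name before joining it into a regex is the usual remedy.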
The rest fail because strptime can't parse the
values that strftime produces.
>>> import time
>>> time.tzname
('\x93\x8c\x8b\x9e (\x95W\x8f\x80\x8e\x9e)', '\x93\x8c\x8b\x9e (\x95W\x8f\x80\x8e\x9e)')
>>> time.strptime(time.strftime('%Z', time.gmtime()))
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in -toplevel-
    time.strptime(time.strftime('%Z', time.gmtime()))
  File "C:\Python24\lib\_strptime.py", line 291, in strptime
    raise ValueError("time data did not match format: data=%s fmt=%s" %
ValueError: time data did not match format: data=東京 (標準時) fmt=%a %b %d %H:%M:%S %Y
The output of running test_time.py and test_strptime.py
is attached.
----------------------------------------------------------------------
>Comment By: George Yoshida (quiver)
Date: 2004-10-06 03:56
Message:
Logged In: YES
user_id=671362
bcannon wrote:
> The .lower() call is intended to normalize since capitalization
> is not standard across OSs. But if it is a Unicode string it
> should be fine. And even if it isn't, it is all lowercased for
> comparison anyway, so as long as it is consistent, shouldn't it
> still work?
Hmm.
> As for your example of strptime not being able to parse, you have
> a bug in it; you forgot the format string. It should have been
> ``time.strptime(time.strftime('%Z'), '%Z')``. Give that a run
> and let me know what the output is.
Yeah, it's my fault. I forgot to specify a format. Even so,
strptime couldn't parse the timezone.
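To make the missing-format point concrete, here is a small sketch for a current Python 3 (an assumption; the thread itself concerns 2.3/2.4). It uses the literal string "UTC" rather than a locale-dependent name so that it behaves the same on any machine:

```python
import time

# With no format argument, strptime falls back to "%a %b %d %H:%M:%S %Y",
# so a bare time zone name cannot match and ValueError is raised.
try:
    time.strptime("UTC")
    default_format_failed = False
except ValueError:
    default_format_failed = True

# With an explicit "%Z" format the same string is accepted.
parsed = time.strptime("UTC", "%Z")

print(default_format_failed)
print(type(parsed).__name__)
```

The failure reported in this bug is different: there the format is supplied, but the locale's own time zone name still does not match.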
> As for this whole multi-byte issue, is it all being returned as
> Unicode strings, or is it just a regular string? In other
> words, what is ``type(time.tzname[0])`` spitting out? And what
> character encoding is all of this in (i.e., what should I pass
> to unicode so as to not have it raise UnicodeDecodeError)?
It returns a regular string (not unicode), and the encoding is cp932.
This is the default encoding on Japanese Windows.
>>> unicode(time.tzname[0], 'cp932')
u'\u6771\u4eac (\u6a19\u6e96\u6642)'
> And finally, for the regex metacharacter stuff, why the hell
> are there parentheses in a timezone?!? Whoever decided that was
> good did it just to upset me.
Ask M$ Japan :-;
I don't regard 'Tokyo (standard time)' as an acceptable
representation of a time zone at all, but this is what Windows
returns as the time zone on my box.
> That does need to be fixed. Apply the patch I just uploaded and let
> me know if it at least deals with that problem.
With your patch, all tests succeed without any Error or Fail, and
strftime <-> strptime conversions work well. This is a backport
candidate, so I created a new patch against Python 2.3 with
listcomps instead of genexprs.
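For readers unfamiliar with the backport constraint: generator expressions arrived in Python 2.4, so code targeting 2.3 has to use list comprehensions. The sketch below only illustrates that substitution; `names`, `pattern_gen` and `pattern_list` are hypothetical and this is not the text of either patch.

```python
import re

# Hypothetical time zone names; the first contains regex metacharacters.
names = ["Tokyo (Standard Time)", "UTC"]

# Generator expression: needs Python 2.4 or later.
pattern_gen = "|".join(re.escape(name) for name in names)

# List comprehension: the form that also works on Python 2.3.
pattern_list = "|".join([re.escape(name) for name in names])

print(pattern_gen == pattern_list)  # True
```

Both forms build the same alternation; the list comprehension merely materializes the intermediate list first.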
But there is one problem left.
On IDLE, strptime still can't parse. I haven't looked into it in
detail, but patch #590913 probably has something to do with it.
This patch sets the locale at IDLE's startup, and this can affect
the behavior of string-related functions and constants.
[PEP 263 support in IDLE]
http://www.python.org/sf/590913
# patch applied
>>> time.strptime(time.strptime('%Z'), '%Z')
Traceback (most recent call last):
  File "<pyshell#93>", line 1, in -toplevel-
    time.strptime(time.strptime('%Z'), '%Z')
  File "C:\Python24\lib\_strptime.py", line 291, in strptime
    if not found:
ValueError: time data did not match format: data=%Z fmt=%a %b %d %H:%M:%S %Y
>>> import locale
>>> locale.getlocale()
['Japanese_Japan', '932'] # culprit?
> Have I mentioned I hate timezones? In case I haven't, I do.
I agree with you one hundred percent.
--George
----------------------------------------------------------------------
Comment By: Brett Cannon (bcannon)
Date: 2004-10-04 08:16
Message:
Logged In: YES
user_id=357491
The .lower() call is intended to normalize since capitalization is not
standard across OSs. But if it is a Unicode string it should be fine. And
even if it isn't, it is all lowercased for comparison anyway, so as long as
it is consistent, shouldn't it still work?
As for your example of strptime not being able to parse, you have a bug
in it; you forgot the format string. It should have been
``time.strptime(time.strftime('%Z'), '%Z')``. Give that a run and let me
know what the output is.
As for this whole multi-byte issue, is it all being returned as Unicode
strings, or is it just a regular string? In other words, what is
``type(time.tzname[0])`` spitting out? And what character encoding is
all of this in (i.e., what should I pass to unicode so as to not have it raise
UnicodeDecodeError)?
And finally, for the regex metacharacter stuff, why the hell are there
parentheses in a timezone?!? Whoever decided that was good did it just
to upset me. That does need to be fixed. Apply the patch I just
uploaded and let me know if it at least deals with that problem.
Have I mentioned I hate timezones? In case I haven't, I do. Thanks for
catching this all, though, George.
----------------------------------------------------------------------
Comment By: George Yoshida (quiver)
Date: 2004-10-04 00:05
Message:
Logged In: YES
user_id=671362
I've found another bug.
Lines 167 and 169 of Lib/_strptime.py contain the expression:
time.tzname[0].lower()
I guess this is intended to normalize letter case, but for
multibyte characters this is really dangerous.
>>> import time
>>> time.tzname[0]
'\x93\x8c\x8b\x9e (\x95W\x8f\x80\x8e\x9e)'
>>> _.lower()
'\x93\x8c\x8b\x9e (\x95w\x8f\x80\x8e\x9e)'
\x95W and \x95w are not the same character.
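The same effect can be reproduced on a current Python 3, where `bytes` plays the role of the Python 2 byte string shown in the session above (an assumption of this sketch). It also shows one way around the problem: decoding with cp932 before lowercasing.

```python
# Raw cp932 bytes for the time zone name, copied from the session above.
raw = b"\x93\x8c\x8b\x9e (\x95W\x8f\x80\x8e\x9e)"

# Lowercasing the bytes rewrites the trail byte 'W' (0x57) to 'w' (0x77),
# so the two-byte sequence \x95W becomes a different sequence.
lowered_bytes = raw.lower()

# Decoding first and lowercasing afterwards operates on whole characters.
text = raw.decode("cp932")
lowered_text = text.lower()

print(lowered_bytes != raw)  # True: the byte string was altered
print(lowered_text == text)  # True: the decoded text has no cased letters
```

Whether decoding belongs in `_strptime.py` itself is a separate design question; the sketch only demonstrates why byte-wise case folding is unsafe for this encoding.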
----------------------------------------------------------------------