[ python-Bugs-1290505 ] strptime(): can't switch locales more than once

SourceForge.net noreply at sourceforge.net
Thu Mar 29 22:03:08 CEST 2007


Bugs item #1290505, was opened at 2005-09-13 15:50
Message generated for change (Comment added) made by bcannon
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1290505&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: Python 2.5
Status: Pending
Resolution: None
Priority: 5
Private: No
Submitted By: Adam Monsen (meonkeys)
Assigned to: Brett Cannon (bcannon)
Summary: strptime(): can't switch locales more than once

Initial Comment:
After calling strptime() once, it appears that
subsequent efforts to modify the locale settings (so
date strings in different locales can be parsed) cause
strptime() to throw a ValueError. I'm pasting
everything here since spacing is irrelevant:

import locale, time
print locale.getdefaultlocale()        # ('en_US', 'utf')
print locale.getlocale(locale.LC_TIME) # (None, None)
# save old locale
old_loc = locale.getlocale(locale.LC_TIME)
locale.setlocale(locale.LC_TIME, 'nl_NL')
print locale.getlocale(locale.LC_TIME) # ('nl_NL', 'ISO8859-1')
# parse local date
date = '10 augustus 2005 om 17:26'
format = '%d %B %Y om %H:%M'
dateTuple = time.strptime(date, format)
# switch back to previous locale
locale.setlocale(locale.LC_TIME, old_loc)
print locale.getlocale(locale.LC_TIME) # (None, None)
date = '10 August 2005 at 17:26'
format = '%d %B %Y at %H:%M'
dateTuple = time.strptime(date, format)

The output I get from this script is:

('en_US', 'utf')
(None, None)
('nl_NL', 'ISO8859-1')
(None, None)
Traceback (most recent call last):
  File "switching.py", line 17, in ?
    dateTuple = time.strptime(date, format)
  File "/usr/lib/python2.4/_strptime.py", line 292, in strptime
    raise ValueError("time data did not match format:  data=%s  fmt=%s" %
ValueError: time data did not match format:  data=10 August 2005 at 17:26  fmt=%d %B %Y at %H:%M


One workaround I found is by manually busting the
regular expression cache in _strptime:

import _strptime
_strptime._cache_lock.acquire()
_strptime._TimeRE_cache = _strptime.TimeRE()
_strptime._regex_cache = {}
_strptime._cache_lock.release()

If I do all that, I can change the LC_TIME part of the
locale as many times as I choose.

If this isn't a bug, this should at least be in the
documentation for the locale module and/or strptime().
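
A minimal sketch of the save/switch/restore pattern used in the script
above, written for Python 3 and assuming an interpreter in which the
caching problem discussed in this thread is fixed. The names lc_time and
parse_in_locale are hypothetical helpers, not part of the locale or time
modules.

```python
import contextlib
import locale
import time


@contextlib.contextmanager
def lc_time(name):
    """Temporarily switch LC_TIME to `name`, then restore the old setting."""
    # Calling setlocale() without a locale argument only queries the setting.
    saved = locale.setlocale(locale.LC_TIME)
    # Raises locale.Error if `name` is not available on this system.
    locale.setlocale(locale.LC_TIME, name)
    try:
        yield
    finally:
        locale.setlocale(locale.LC_TIME, saved)


def parse_in_locale(text, fmt, name):
    """Parse `text` with time.strptime() while LC_TIME is set to `name`."""
    with lc_time(name):
        return time.strptime(text, fmt)
```

For example, parse_in_locale('10 augustus 2005 om 17:26',
'%d %B %Y om %H:%M', 'nl_NL') would parse the Dutch date, provided the
nl_NL locale is installed.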

----------------------------------------------------------------------

>Comment By: Brett Cannon (bcannon)
Date: 2007-03-29 13:03

Message:
Logged In: YES 
user_id=357491
Originator: NO

The test was checking that the TimeRE instance is recreated when the
locale changes.  You do have a valid point about the 'if' check; I should
have put the setlocale call in a try/except block and just returned if an
exception was raised.

As for the %d usage of strptime, that is just to force a call into
strptime and thus trigger the new instance of TimeRE.  That is why the test
checks the id of the objects; don't really care about strptime directly
failing.  Did the test not fail properly even when you removed the 'if' but
left everything else alone?
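
The try/except idea mentioned above could be sketched roughly as follows
(Python 3). try_setlocale is a hypothetical helper, not something from the
uploaded patch; a test would return early (or skip) when it yields None.

```python
import locale


def try_setlocale(category, candidates):
    """Set the first usable locale; return its name, or None if none work."""
    for name in candidates:
        try:
            locale.setlocale(category, name)
        except locale.Error:
            # This locale is not installed on the system; try the next one.
            continue
        return name
    return None
```

A test body could then begin with something like
"if try_setlocale(locale.LC_TIME, ['de_DE.UTF-8', 'de_DE']) is None: return".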

----------------------------------------------------------------------

Comment By: Javier Sanz (kovan)
Date: 2007-03-29 09:53

Message:
Logged In: YES 
user_id=1426755
Originator: NO

I've been looking at the test case, and I noticed that it isn't actually
checking anything, because locale.getlocale(locale.LC_TIME) is returning
(None, None), which is OK and just means that the default locale (which is
the C locale, not the system locale) is being used.
After removing that 'if', I also replaced de_DE with es_ES to fit my system,
and strptime('10', '%d') with strptime('Fri', '%a') and strptime('vie', '%a'),
because '10' is '10' in all Western languages, so the test would not
fail when the wrong locale is being used.

Once I made these changes to the test case, it successfully failed when
using the non-patched _strptime.py, AND ran ok when using the patched
version.

This is the test case I ended up using:



    def test_TimeRE_recreation(self):
        # The TimeRE instance should be recreated upon changing the locale.
        locale_info = locale.getlocale(locale.LC_TIME)
        locale.setlocale(locale.LC_TIME, ('en_US', 'UTF8'))
        try:
            _strptime.strptime('Fri', '%a')
            first_time_re_id = id(_strptime._TimeRE_cache)
            locale.setlocale(locale.LC_TIME, ('es_ES', 'UTF8'))
            _strptime.strptime('vie', '%a')
            second_time_re_id = id(_strptime._TimeRE_cache)
            self.failIfEqual(first_time_re_id, second_time_re_id)
        finally:
            locale.setlocale(locale.LC_TIME, locale_info)


----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2007-03-28 19:07

Message:
Logged In: YES 
user_id=357491
Originator: NO

I have uploaded a patch for test_strptime that adds a test to make sure
that the TimeRE instance is recreated if the locale changes (went with
en_US and de_DE, but could easily be other locales if there are other ones
that are more common).  Let me know if the test runs fine and works.  Even
better is if it fails without the fix.
File Added: strptime_timere_test.diff

----------------------------------------------------------------------

Comment By: Javier Sanz (kovan)
Date: 2007-03-28 16:44

Message:
Logged In: YES 
user_id=1426755
Originator: NO

I'll be glad to help in whatever I can.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2007-03-28 16:40

Message:
Logged In: YES 
user_id=357491
Originator: NO

The power of procrastination in the morning.  =)  I am going to try to
come up with a test case for this.  I might ask, kovan, if you can run the
test case to make sure it works.

----------------------------------------------------------------------

Comment By: Javier Sanz (kovan)
Date: 2007-03-28 15:55

Message:
Logged In: YES 
user_id=1426755
Originator: NO

I applied the patch, and it works now :). 
Thanks bcannon for the quick responses.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2007-03-28 11:39

Message:
Logged In: YES 
user_id=357491
Originator: NO

kovan, can you please apply the patch I have uploaded to your copy of
_strptime and let me know if that fixes it?  I am on OS X and switching
locales doesn't work for me, so I don't have an easy way to test this.
File Added: strptime_cache.diff

----------------------------------------------------------------------

Comment By: Javier Sanz (kovan)
Date: 2007-03-28 00:06

Message:
Logged In: YES 
user_id=1426755
Originator: NO

This is the code:

import locale
from datetime import datetime
from time import strptime

def parseTime(strTime, format = "%a %b %d %H:%M:%S"):  # example: Mon Aug 7 21:08:52

    locale.setlocale(locale.LC_TIME, ('en_US','UTF8'))    
    format = "%Y " + format
    strTime = str(datetime.now().year) + " " +strTime

    import _strptime
    _strptime._cache_lock.acquire()
    _strptime._TimeRE_cache = _strptime.TimeRE()
    _strptime._regex_cache = {}
    _strptime._cache_lock.release()    

    tuple = strptime(strTime, format)     
    return datetime(*tuple[0:6])


If I remove the code to clear the cache and add "print
format_regex.pattern" statement to _strptime.py after "format_regex =
time_re.compile(format)", I get 

(?P<Y>\d\d\d\d)\s*(?P<a>mié|sáb|lun|mar|jue|vie|dom)\s*(?P<b>ene|feb|mar|abr|may|jun|jul|ago|sep|oct|nov|dic)\s*(?P<d>3[0-1]|[1-2]\d|0[1-9]|[1-9]| [1-9])\s*(?P<H>2[0-3]|[0-1]\d|\d):(?P<M>[0-5]\d|\d):(?P<S>6[0-1]|[0-5]\d|\d)

which is in my system's locale (es), and it should be in English.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2007-03-27 19:35

Message:
Logged In: YES 
user_id=357491
Originator: NO

Can you show some code that recreates the problem?

----------------------------------------------------------------------

Comment By: Javier Sanz (kovan)
Date: 2007-03-27 18:06

Message:
Logged In: YES 
user_id=1426755
Originator: NO

I think I'm having this issue with Python 2.5, as I can only make strptime
take into account locale.setlocale() calls if I clear strptime's internal
regexp cache between the calls to setlocale() and strptime().

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2005-09-14 19:42

Message:
Logged In: YES 
user_id=357491

OK, the problem was that the cache for the locale
information in terms of dates and time was being invalidated
and recreated, but the regex cache was not being touched.  It
has now been fixed in rev. 1.41 for 2.5 and in rev. 1.38.2.3
for 2.4.
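
The stale-cache behaviour described above can be modelled with a toy
sketch (Python 3). This is not the _strptime code: ToyParser, MONTHS and
demo are hypothetical names, and the locale data is a two-month stand-in.
It only shows why refreshing the locale information without clearing the
compiled-regex cache leaves old patterns in use.

```python
import re

# Toy locale data (hypothetical; just enough to show the caching issue).
MONTHS = {
    "en": ["August", "September"],
    "nl": ["augustus", "september"],
}


class ToyParser:
    def __init__(self, clear_regex_cache):
        self.clear_regex_cache = clear_regex_cache
        self.locale_seen = None
        self.regex_cache = {}  # format string -> compiled pattern

    def compile(self, fmt, current_locale):
        # The locale information is always refreshed on a locale change...
        if current_locale != self.locale_seen:
            self.locale_seen = current_locale
            # ...but the compiled patterns are only dropped if asked to.
            if self.clear_regex_cache:
                self.regex_cache.clear()
        if fmt not in self.regex_cache:
            months = "|".join(re.escape(m) for m in MONTHS[current_locale])
            group = "(?P<B>%s)" % months
            pattern = group.join(re.escape(p) for p in fmt.split("%B"))
            self.regex_cache[fmt] = re.compile(pattern)
        return self.regex_cache[fmt]


def demo(clear_regex_cache):
    """Parse Dutch first, then check whether English still matches."""
    parser = ToyParser(clear_regex_cache)
    parser.compile("%B", "nl")
    return parser.compile("%B", "en").fullmatch("August") is not None
```

With clear_regex_cache=False the second call reuses the Dutch pattern and
'August' is rejected; with True the pattern is rebuilt for the new locale.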

Thanks for reporting this, Adam.

----------------------------------------------------------------------

Comment By: Adam Monsen (meonkeys)
Date: 2005-09-13 15:57

Message:
Logged In: YES 
user_id=259388

I think there were some long lines in my code. Attaching
test case.

----------------------------------------------------------------------
