regular expression

Peter Hansen peter at engcorp.com
Fri Mar 25 23:54:32 EST 2005


Bengt Richter wrote:
> On Sat, 26 Mar 2005 02:07:15 GMT, aaron <asteele at berkeley.edu> wrote:
>> >>> pattern.sub(':', '375 mi. south of U.C.B is 3.4 degrees warmer.')
>> '375 mi: south of U:C:B is 3.4 degrees warmer:'
>>
>> so this works, but not in the following case:
>> >>> pattern.sub(':', '.3')
>>
> Brute force the exceptional case that happens at the start of the line?
> 
>  >>> import re
>  >>> pattern = re.compile(r'^[.]|(?!\d)[.](?!\d)')
>  >>> pattern.sub(':', '375 mi. south of U.C.B is 3.4 degrees warmer.')
>  '375 mi: south of U:C:B is 3.4 degrees warmer:'
>  >>> pattern.sub(':', '.3')
>  ':3'
>  >>> pattern.sub(':', '3.')
>  '3:'

Be careful... the OP has assumed that the pattern works in
general, which isn't true, and Bengt's fix isn't sufficient:

 >>> import re
 >>> s = 'x.3'
 >>> pattern = re.compile(r'^[.]|(?!\d)[.](?!\d)')
 >>> pattern.sub(':', '.3')
 ':3'
 >>> pattern.sub(':', s)
 'x.3'

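So the OP's "this works" comment was wrong: the leading (?!\d) is a
lookahead, so it tests the period itself and always succeeds. Only the
trailing (?!\d) does anything, which means any period followed by a
digit is left alone unless it starts the string.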
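
A minimal sketch of why the first example only appeared to work: a
lookahead placed before [.] inspects the period itself, and a period is
never a digit, so that check can't fail. The names with_lookahead,
without_lookahead and samples are made up for this illustration.

```python
import re

# Hypothetical names, for illustration only.
with_lookahead = re.compile(r'(?!\d)[.](?!\d)')   # the OP's core pattern
without_lookahead = re.compile(r'[.](?!\d)')      # leading lookahead dropped

samples = [
    '375 mi. south of U.C.B is 3.4 degrees warmer.',
    '.3',
    '3.',
    'x.3',
]

# Substitute with both patterns and compare the results side by side.
results = [(with_lookahead.sub(':', s), without_lookahead.sub(':', s))
           for s in samples]
for s, (a, b) in zip(samples, results):
    print(repr(s), '->', repr(a), repr(b), 'same' if a == b else 'DIFFERENT')
```

If the two columns agree on every sample, the leading (?!\d) is
contributing nothing, and a check on the preceding character would need
a lookbehind instead.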

Suggestion: whip up a variety of automated test cases and
make sure you run them all whenever you make changes to
this code...
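
Something like the following might do as a starting point. It is only a
sketch: the expected outputs assume the rule is "replace every period
except one with a digit on both sides", which the OP never stated, and
CASES, failures and candidate are names invented here. The candidate
pattern is a guess that happens to pass these particular cases, not a
verified fix.

```python
import re

# Assumed rule (not confirmed by the OP): replace every period
# except one that has a digit on both sides.
CASES = [
    ('375 mi. south of U.C.B is 3.4 degrees warmer.',
     '375 mi: south of U:C:B is 3.4 degrees warmer:'),
    ('.3', ':3'),
    ('3.', '3:'),
    ('x.3', 'x:3'),
    ('3.4', '3.4'),
]

def failures(pattern):
    """Return (input, expected, actual) for every case the pattern gets wrong."""
    bad = []
    for text, expected in CASES:
        actual = pattern.sub(':', text)
        if actual != expected:
            bad.append((text, expected, actual))
    return bad

# Bengt's pattern from the quoted message.
bengt = re.compile(r'^[.]|(?!\d)[.](?!\d)')
# Hypothetical candidate: a lookbehind in place of the leading lookahead.
candidate = re.compile(r'(?<!\d)[.]|[.](?!\d)')

print('bengt:    ', failures(bengt))
print('candidate:', failures(candidate))
```

Add a case to CASES every time a new kind of input turns up, and rerun
the whole list after each change to the pattern.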

(No, I don't have a solution to the continuing problem,
other than to wonder whether the input data really requires
all these edge cases to be handled properly.)

-Peter


