Freeze problem with Regular Expression

Reedick, Andrew jr9445 at ATT.COM
Wed Jun 25 13:45:06 EDT 2008


> -----Original Message-----
> From: python-list-bounces+jr9445=att.com at python.org [mailto:python-
> list-bounces+jr9445=att.com at python.org] On Behalf Of Kirk
> Sent: Wednesday, June 25, 2008 11:20 AM
> To: python-list at python.org
> Subject: Freeze problem with Regular Expression
> 
> Hi All,
> the following regular expression match seems to enter an infinite
> loop:
> 
> ################
> import re
> text = ' MSX INTERNATIONAL HOLDINGS ITALIA srl (di seguito MSX ITALIA) una '
> re.findall('[^A-Z|0-9]*((?:[0-9]*[A-Z]+[0-9|a-z|\-]*)+\s*[a-z]*\s*(?:[0-9]*[A-Z]+[0-9|a-z|\-]*\s*)*)([^A-Z]*)$', text)
> #################
> 
> Perl has no problem with the same expression:
> 
> #################
> $s = ' MSX INTERNATIONAL HOLDINGS ITALIA srl (di seguito MSX ITALIA) una ';
> $s =~ /[^A-Z|0-9]*((?:[0-9]*[A-Z]+[0-9|a-z|\-]*)+\s*[a-z]*\s*(?:[0-9]*[A-Z]+[0-9|a-z|\-]*\s*)*)([^A-Z]*)$/;
> print $1;
> #################
> 
> I'm using Python 2.5.2 on Ubuntu 8.04.
> Any ideas?
> Thanks!
> 


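It locks up on 2.5.2 on Windows also.  Probably too much backtracking going
on: quantifiers are nested inside quantifiers (for example the [A-Z]+ inside
the (?:...)+ group), so each time the trailing ([^A-Z]*)$ fails, the engine
retries the same capitals split up in a different way, and the number of
combinations grows very quickly with the length of the text.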
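
A minimal sketch of that kind of blow-up, assuming it is backtracking that
causes the hang.  The pattern here is a toy with the same nested-quantifier
shape, not the expression from the original post, and the name
time_failed_match is made up for the illustration.  Each subject is a run of
capitals followed by a character that makes the final $ fail:

```python
import re
import time

# Toy pattern: a '+' nested inside another '+', followed by an anchor.
# Illustration only; this is not the regex from the original post.
NESTED = re.compile(r'(?:[A-Z]+)+$')

def time_failed_match(n):
    """Match n capitals followed by '!' and return (result, seconds)."""
    subject = 'A' * n + '!'
    start = time.perf_counter()
    result = NESTED.match(subject)  # always fails: '!' blocks the '$'
    return result, time.perf_counter() - start

sizes = (8, 12, 16)
results = [time_failed_match(n) for n in sizes]
for n, (result, seconds) in zip(sizes, results):
    # The result is None every time; the elapsed time is what to watch.
    print(n, result, round(seconds, 5))
```

On an engine that backtracks naively, the time should climb steeply as the
run of capitals gets longer, even though the answer never changes.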


What's with the |'s in [0-9|a-z|\-]?  Inside a character class the '|' is a
literal character, not an 'or' operator.  I think you meant to say either
'[0-9a-z\-]' or '[0-9a-z\-|]'.
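
A quick way to check the point about '|' is to test both spellings of the
class against a literal pipe.  The variable names are only for illustration:

```python
import re

# Inside [...] the '|' is just one more member of the set.
with_pipe = re.compile(r'[0-9|a-z|\-]')
without_pipe = re.compile(r'[0-9a-z\-]')

# The first class accepts a literal '|'; the second does not.
print(bool(with_pipe.match('|')))
print(bool(without_pipe.match('|')))

# Both still accept digits, lowercase letters and '-'.
for ch in ('7', 'q', '-'):
    print(ch, bool(with_pipe.match(ch)), bool(without_pipe.match(ch)))
```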


