[Patches] Prevent re module blowup in IDLE's PyParse.py

Tim Peters tim_one@email.msn.com
Thu, 2 Mar 2000 23:37:33 -0500


This patch changes the one regexp in PyParse capable of making the re module
blow the C stack when passed unreasonable <0.9 wink> program text.  Jeremy
Hylton provoked this with a program of the form:

x = (1,
     2,
... # 9997 lines deleted here
     10000,
)

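Programs "like this" will no longer trigger re death, no matter how many
lines they contain.  The regexp is now a single character class with no
repeated group, and the search for the last non-whitespace char moves into a
Python loop in the caller.  OTOH, you can now make another class of
unreasonable program that will take much longer to parse.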
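
For anyone who wants to try the new behavior outside IDLE, here is a minimal
standalone sketch.  The regexp and the back-up loop are copied from the patch
below; the helper names build_tuple_program and scan_ordinary are hypothetical
and exist only for this illustration, and the sketch is not a substitute for
PyParse's real scanning loop.

```python
import re

# Same pattern as the patched _chew_ordinaryre: a single run of chars
# that are not brackets, quotes, '#', or backslash.
_chew_ordinaryre = re.compile(r"""
    [^[\](){}#'"\\]+
""", re.VERBOSE).match


def build_tuple_program(n):
    """Hypothetical helper: build program text of the form 'x = (1, 2, ..., n,)'."""
    lines = ["x = (1,"]
    lines.extend("     %d," % k for k in range(2, n + 1))
    lines.append(")")
    return "\n".join(lines) + "\n"


def scan_ordinary(text, p, q):
    """Hypothetical helper mirroring the patched step in PyParse.

    Returns (new_p, i): new_p is where scanning resumes, and i is the
    index of the last non-whitespace char before new_p (-1 if none).
    If nothing boring was matched, returns (p, -1).
    """
    m = _chew_ordinaryre(text, p, q)
    if not m:
        # text[p] is an interesting char; nothing was skipped
        return p, -1
    p = m.end()
    # back up over totally boring whitespace
    i = p - 1
    while i >= 0 and text[i] in " \t\n":
        i = i - 1
    return p, i


program = build_tuple_program(10000)
start = program.index("(") + 1
new_p, i = scan_ordinary(program, start, len(program))
```

With this input the match runs from just after the open paren to the closing
paren in one step, and the loop then backs up over a single newline to reach
the final comma.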


I confirm that, to the best of my knowledge and belief, this
contribution is free of any claims of third parties under
copyright, patent or other rights or interests ("claims").  To
the extent that I have any such claims, I hereby grant to CNRI a
nonexclusive, irrevocable, royalty-free, worldwide license to
reproduce, distribute, perform and/or display publicly, prepare
derivative versions, and otherwise use this contribution as part
of the Python software and its related documentation, or any
derivative versions thereof, at no cost to CNRI or its licensed
users, and to authorize others to do so.

I acknowledge that CNRI may, at its sole discretion, decide
whether or not to incorporate this contribution in the Python
software and its related documentation.  I further grant CNRI
permission to use my name and other identifying information
provided to CNRI by me for use in connection with the Python
software and its related documentation.



*** idle-0.5/PyParse.py	Mon Jun 07 22:26:18 1999
--- new/PyParse.py	Thu Mar 02 22:56:16 2000
***************
*** 83,97 ****
      \b
  """, re.VERBOSE).match

! # Chew up non-special chars as quickly as possible, but retaining
! # enough info to determine the last non-ws char seen; if match is
! # successful, and m.group(1) isn't None, m.end(1) less 1 is the
! # index of the last non-ws char matched.

  _chew_ordinaryre = re.compile(r"""
!     (?: \s+
!     |   ( [^\s[\](){}#'"\\]+ )
!     )+
  """, re.VERBOSE).match

  # Build translation table to map uninteresting chars to "x", open
--- 83,95 ----
      \b
  """, re.VERBOSE).match

! # Chew up non-special chars as quickly as possible.  If match is
! # successful, m.end() less 1 is the index of the last boring char
! # matched.  If match is unsuccessful, the string starts with an
! # interesting char.

  _chew_ordinaryre = re.compile(r"""
!     [^[\](){}#'"\\]+
  """, re.VERBOSE).match

  # Build translation table to map uninteresting chars to "x", open
***************
*** 386,395 ****
              # suck up all except ()[]{}'"#\\
              m = _chew_ordinaryre(str, p, q)
              if m:
!                 i = m.end(1) - 1    # last non-ws (if any)
                  if i >= 0:
                      lastch = str[i]
-                 p = m.end()
                  if p >= q:
                      break

--- 384,397 ----
              # suck up all except ()[]{}'"#\\
              m = _chew_ordinaryre(str, p, q)
              if m:
!                 # we skipped at least one boring char
!                 p = m.end()
!                 # back up over totally boring whitespace
!                 i = p-1    # index of last boring char
!                 while i >= 0 and str[i] in " \t\n":
!                     i = i-1
                  if i >= 0:
                      lastch = str[i]
                  if p >= q:
                      break