[Python-checkins] [2.7] closes bpo-32997: Fix REDOS in fpformat (GH-5984)

Benjamin Peterson webhook-mailer at python.org
Tue Mar 6 00:59:05 EST 2018


https://github.com/python/cpython/commit/55d5bfba9482d39080f7b9ec3e6257ecd23f264f
commit: 55d5bfba9482d39080f7b9ec3e6257ecd23f264f
branch: 2.7
author: Jamie Davis <davisjam at vt.edu>
committer: Benjamin Peterson <benjamin at python.org>
date: 2018-03-05T21:59:02-08:00
summary:

[2.7] closes bpo-32997: Fix REDOS in fpformat (GH-5984)

The regex to decode a number in fpformat is susceptible to catastrophic backtracking. This is a potential DOS vector if a server is using fpformat on untrusted number strings.

Replace it with an equivalent non-vulnerable regex. The match behavior of the new regex is slightly different: it captures the whole integer part of the number, leading zeros included, in one group. The leading zeros are stripped off later, in extract().
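
As a rough illustration of the equivalence claimed above, the sketch below compares the integer-part group produced by the old and new patterns on a few ordinary inputs. The two patterns are copied from the diff; the helper name decode_intpart, its strip_zeros parameter, and the sample strings are hypothetical and not part of the patch. The old pattern is deliberately not run against a long attack string, since that is the case that backtracks badly.

```python
import re

# Patterns copied from the diff in this commit.
OLD_DECODER = re.compile(r'^([-+]?)0*(\d*)((?:\.\d*)?)(([eE][-+]?\d+)?)$')
NEW_DECODER = re.compile(r'^([-+]?)(\d*)((?:\.\d*)?)(([eE][-+]?\d+)?)$')


def decode_intpart(pattern, s, strip_zeros):
    """Hypothetical helper: return the integer-part digits of s,
    or None if s does not match the pattern."""
    res = pattern.match(s)
    if res is None:
        return None
    intpart = res.group(2)
    if strip_zeros:
        # Mirrors the lstrip('0') step that the patch adds to extract().
        intpart = intpart.lstrip('0')
    return intpart


# Illustrative inputs only; both approaches should agree on each of them.
samples = ['007.5', '-0012e3', '0', '000', '.5', '+42', '100', 'abc']
for s in samples:
    old = decode_intpart(OLD_DECODER, s, strip_zeros=False)
    new = decode_intpart(NEW_DECODER, s, strip_zeros=True)
    assert old == new, (s, old, new)

# A short attack-shaped string is rejected by the new pattern.
short_attack = '+0' + '0' * 50 + '++'
assert NEW_DECODER.match(short_attack) is None
```

The old pattern lets both 0* and \d* consume the same run of zeros, so a failing match can retry many ways of splitting that run between them. The new pattern leaves only \d* to consume the run, which removes that ambiguity.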

files:
A Misc/NEWS.d/next/Security/2018-03-05-10-14-42.bpo-32997.hp2s8n.rst
M Lib/fpformat.py
M Lib/test/test_fpformat.py

diff --git a/Lib/fpformat.py b/Lib/fpformat.py
index 71cbb25f3c8b..0537a27b8820 100644
--- a/Lib/fpformat.py
+++ b/Lib/fpformat.py
@@ -19,7 +19,7 @@
 __all__ = ["fix","sci","NotANumber"]
 
 # Compiled regular expression to "decode" a number
-decoder = re.compile(r'^([-+]?)0*(\d*)((?:\.\d*)?)(([eE][-+]?\d+)?)$')
+decoder = re.compile(r'^([-+]?)(\d*)((?:\.\d*)?)(([eE][-+]?\d+)?)$')
 # \0 the whole thing
 # \1 leading sign or empty
 # \2 digits left of decimal point
@@ -41,6 +41,7 @@ def extract(s):
     res = decoder.match(s)
     if res is None: raise NotANumber, s
     sign, intpart, fraction, exppart = res.group(1,2,3,4)
+    intpart = intpart.lstrip('0');
     if sign == '+': sign = ''
     if fraction: fraction = fraction[1:]
     if exppart: expo = int(exppart[1:])
diff --git a/Lib/test/test_fpformat.py b/Lib/test/test_fpformat.py
index e6de3b0c11be..428623ebb35f 100644
--- a/Lib/test/test_fpformat.py
+++ b/Lib/test/test_fpformat.py
@@ -67,6 +67,16 @@ def test_failing_values(self):
         else:
             self.fail("No exception on non-numeric sci")
 
+    def test_REDOS(self):
+        # This attack string will hang on the old decoder pattern.
+        attack = '+0' + ('0' * 1000000) + '++'
+        digs = 5 # irrelevant
+
+        # fix returns input if it does not decode
+        self.assertEqual(fpformat.fix(attack, digs), attack)
+        # sci raises NotANumber
+        with self.assertRaises(NotANumber):
+            fpformat.sci(attack, digs)
 
 def test_main():
     run_unittest(FpformatTest)
diff --git a/Misc/NEWS.d/next/Security/2018-03-05-10-14-42.bpo-32997.hp2s8n.rst b/Misc/NEWS.d/next/Security/2018-03-05-10-14-42.bpo-32997.hp2s8n.rst
new file mode 100644
index 000000000000..3c78ba61ae34
--- /dev/null
+++ b/Misc/NEWS.d/next/Security/2018-03-05-10-14-42.bpo-32997.hp2s8n.rst
@@ -0,0 +1,4 @@
+A regex in fpformat was vulnerable to catastrophic backtracking. This regex
+was a potential DOS vector (REDOS). Based on typical uses of fpformat the
+risk seems low. The regex has been refactored and is now safe. Patch by
+Jamie Davis.


