[Python-checkins] bpo-36384: Leading zeros in IPv4 addresses are no longer tolerated (GH-25099)

ambv webhook-mailer at python.org
Sun May 2 08:00:52 EDT 2021


https://github.com/python/cpython/commit/60ce8f0be6354ad565393ab449d8de5d713f35bc
commit: 60ce8f0be6354ad565393ab449d8de5d713f35bc
branch: master
author: Christian Heimes <christian at python.org>
committer: ambv <lukasz at langa.pl>
date: 2021-05-02T14:00:35+02:00
summary:

bpo-36384: Leading zeros in IPv4 addresses are no longer tolerated (GH-25099)

Reverts commit e653d4d8e820a7a004ad399530af0135b45db27a and makes
parsing even more strict. Like socket.inet_pton() any leading zero
is now treated as invalid input.

Signed-off-by: Christian Heimes <christian at python.org>

Co-authored-by: Łukasz Langa <lukasz at langa.pl>

files:
A Misc/NEWS.d/next/Security/2021-03-30-16-29-51.bpo-36384.sCAmLs.rst
M Doc/library/ipaddress.rst
M Doc/tools/susp-ignored.csv
M Doc/whatsnew/3.9.rst
M Lib/ipaddress.py
M Lib/test/test_ipaddress.py

diff --git a/Doc/library/ipaddress.rst b/Doc/library/ipaddress.rst
index d6d1f1e362137b..1c2263b128a8fe 100644
--- a/Doc/library/ipaddress.rst
+++ b/Doc/library/ipaddress.rst
@@ -104,8 +104,7 @@ write code that handles both IP versions correctly.  Address objects are
    1. A string in decimal-dot notation, consisting of four decimal integers in
       the inclusive range 0--255, separated by dots (e.g. ``192.168.0.1``). Each
       integer represents an octet (byte) in the address. Leading zeroes are
-      tolerated only for values less than 8 (as there is no ambiguity
-      between the decimal and octal interpretations of such strings).
+      not tolerated to prevent confusion with octal notation.
    2. An integer that fits into 32 bits.
    3. An integer packed into a :class:`bytes` object of length 4 (most
       significant octet first).
@@ -117,6 +116,22 @@ write code that handles both IP versions correctly.  Address objects are
    >>> ipaddress.IPv4Address(b'\xC0\xA8\x00\x01')
    IPv4Address('192.168.0.1')
 
+   .. versionchanged:: 3.8
+
+      Leading zeros are tolerated, even in ambiguous cases that look like
+      octal notation.
+
+   .. versionchanged:: 3.10
+
+      Leading zeros are no longer tolerated and are treated as an error.
+      IPv4 address strings are now parsed as strictly as glibc's
+      :func:`~socket.inet_pton`.
+
+   .. versionchanged:: 3.9.5
+
+      The above change was also included in Python 3.9 starting with
+      version 3.9.5.
+
    .. attribute:: version
 
       The appropriate version number: ``4`` for IPv4, ``6`` for IPv6.
diff --git a/Doc/tools/susp-ignored.csv b/Doc/tools/susp-ignored.csv
index 5a2d85d262b2e7..d56a2b9fd0bfb9 100644
--- a/Doc/tools/susp-ignored.csv
+++ b/Doc/tools/susp-ignored.csv
@@ -149,8 +149,8 @@ library/ipaddress,,:db8,>>> ipaddress.IPv6Address('2001:db8::1000')
 library/ipaddress,,::,>>> ipaddress.IPv6Address('2001:db8::1000')
 library/ipaddress,,:db8,'2001:db8::1000'
 library/ipaddress,,::,'2001:db8::1000'
-library/ipaddress,231,:db8,">>> f'{ipaddress.IPv6Address(""2001:db8::1000""):s}'"
-library/ipaddress,231,::,">>> f'{ipaddress.IPv6Address(""2001:db8::1000""):s}'"
+library/ipaddress,,:db8,">>> f'{ipaddress.IPv6Address(""2001:db8::1000""):s}'"
+library/ipaddress,,::,">>> f'{ipaddress.IPv6Address(""2001:db8::1000""):s}'"
 library/ipaddress,,::,IPv6Address('ff02::5678%1')
 library/ipaddress,,::,fe80::1234
 library/ipaddress,,:db8,">>> ipaddress.ip_address(""2001:db8::1"").reverse_pointer"
diff --git a/Doc/whatsnew/3.9.rst b/Doc/whatsnew/3.9.rst
index 9a7f2cd3843c9d..772fb5a3fe7458 100644
--- a/Doc/whatsnew/3.9.rst
+++ b/Doc/whatsnew/3.9.rst
@@ -537,6 +537,10 @@ Scoped IPv6 addresses can be parsed using :class:`ipaddress.IPv6Address`.
 If present, scope zone ID is available through the :attr:`~ipaddress.IPv6Address.scope_id` attribute.
 (Contributed by Oleksandr Pavliuk in :issue:`34788`.)
 
+Starting with Python 3.9.5 the :mod:`ipaddress` module no longer
+accepts any leading zeros in IPv4 address strings.
+(Contributed by Christian Heimes in :issue:`36384`).
+
 math
 ----
 
@@ -1114,6 +1118,14 @@ Changes in the Python API
   compatible classes that don't inherit from those mentioned types.
   (Contributed by Roger Aiudi in :issue:`34775`).
 
+* Starting with Python 3.9.5 the :mod:`ipaddress` module no longer
+  accepts any leading zeros in IPv4 address strings. Leading zeros are
+  ambiguous and interpreted as octal notation by some libraries. For example,
+  the legacy function :func:`socket.inet_aton` treats leading zeros as octal
+  notation. The glibc implementation of the modern :func:`~socket.inet_pton`
+  does not accept any leading zeros.
+  (Contributed by Christian Heimes in :issue:`36384`).
+
 * :func:`codecs.lookup` now normalizes the encoding name the same way as
   :func:`encodings.normalize_encoding`, except that :func:`codecs.lookup` also
   converts the name to lower case. For example, ``"latex+latin1"`` encoding
diff --git a/Lib/ipaddress.py b/Lib/ipaddress.py
index 160b16dbc162fc..af7aedfa6e51a1 100644
--- a/Lib/ipaddress.py
+++ b/Lib/ipaddress.py
@@ -1223,6 +1223,11 @@ def _parse_octet(cls, octet_str):
         if len(octet_str) > 3:
             msg = "At most 3 characters permitted in %r"
             raise ValueError(msg % octet_str)
+        # Handle leading zeros as strictly as glibc's inet_pton()
+        # See security bug bpo-36384
+        if octet_str != '0' and octet_str[0] == '0':
+            msg = "Leading zeros are not permitted in %r"
+            raise ValueError(msg % octet_str)
         # Convert to integer (we know digits are legal)
         octet_int = int(octet_str, 10)
         if octet_int > 255:
diff --git a/Lib/test/test_ipaddress.py b/Lib/test/test_ipaddress.py
index 3c070080a6aaeb..cdd9880c3c17fa 100644
--- a/Lib/test/test_ipaddress.py
+++ b/Lib/test/test_ipaddress.py
@@ -96,10 +96,23 @@ def pickle_test(self, addr):
 class CommonTestMixin_v4(CommonTestMixin):
 
     def test_leading_zeros(self):
-        self.assertInstancesEqual("000.000.000.000", "0.0.0.0")
-        self.assertInstancesEqual("192.168.000.001", "192.168.0.1")
-        self.assertInstancesEqual("016.016.016.016", "16.16.16.16")
-        self.assertInstancesEqual("001.000.008.016", "1.0.8.16")
+        # bpo-36384: no leading zeros to avoid ambiguity with octal notation
+        msg = r"Leading zeros are not permitted in '\d+'"
+        addresses = [
+            "000.000.000.000",
+            "192.168.000.001",
+            "016.016.016.016",
+            "192.168.000.001",
+            "001.000.008.016",
+            "01.2.3.40",
+            "1.02.3.40",
+            "1.2.03.40",
+            "1.2.3.040",
+        ]
+        for address in addresses:
+            with self.subTest(address=address):
+                with self.assertAddressError(msg):
+                    self.factory(address)
 
     def test_int(self):
         self.assertInstancesEqual(0, "0.0.0.0")
diff --git a/Misc/NEWS.d/next/Security/2021-03-30-16-29-51.bpo-36384.sCAmLs.rst b/Misc/NEWS.d/next/Security/2021-03-30-16-29-51.bpo-36384.sCAmLs.rst
new file mode 100644
index 00000000000000..f956cde948ec57
--- /dev/null
+++ b/Misc/NEWS.d/next/Security/2021-03-30-16-29-51.bpo-36384.sCAmLs.rst
@@ -0,0 +1,6 @@
+:mod:`ipaddress` module no longer accepts any leading zeros in IPv4 address
+strings. Leading zeros are ambiguous and interpreted as octal notation by
+some libraries. For example, the legacy function :func:`socket.inet_aton`
+treats leading zeros as octal notation. The glibc implementation of the modern
+:func:`~socket.inet_pton` does not accept any leading zeros. For a while
+the :mod:`ipaddress` module used to accept ambiguous leading zeros.
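
For readers who want to see the rule in isolation, here is a minimal standalone sketch of the strict octet check that the Lib/ipaddress.py hunk adds. The function names parse_octet_strict and parse_ipv4_strict are hypothetical and are not part of the ipaddress module. The error messages only approximate the module's, and the real parser does more work than this.

```python
def parse_octet_strict(octet_str):
    """Hypothetical sketch: parse one IPv4 octet, rejecting leading zeros."""
    if not octet_str:
        raise ValueError("Empty octet not permitted")
    # Accept ASCII decimal digits only (no signs, spaces, or Unicode digits).
    if not (octet_str.isascii() and octet_str.isdigit()):
        raise ValueError("Only decimal digits permitted in %r" % octet_str)
    if len(octet_str) > 3:
        raise ValueError("At most 3 characters permitted in %r" % octet_str)
    # A lone "0" is valid; any other octet that starts with "0" is rejected,
    # because some libraries would read it as octal.
    if octet_str != '0' and octet_str[0] == '0':
        raise ValueError("Leading zeros are not permitted in %r" % octet_str)
    octet_int = int(octet_str, 10)
    if octet_int > 255:
        raise ValueError("Octet %d (> 255) not permitted" % octet_int)
    return octet_int


def parse_ipv4_strict(text):
    """Hypothetical sketch: split a dotted quad and parse each octet strictly."""
    octets = text.split('.')
    if len(octets) != 4:
        raise ValueError("Expected 4 octets in %r" % text)
    return tuple(parse_octet_strict(octet) for octet in octets)
```

With this sketch, parse_ipv4_strict("192.168.0.1") returns a tuple of four integers, while inputs such as "192.168.000.001" or "01.2.3.40" raise ValueError. On Python versions that include this commit, ipaddress.IPv4Address should reject the same strings with AddressValueError.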


