The split() function of Python's built-in module has changed in a puzzling way - is this a bug?

Dan Stromberg drsalists at gmail.com
Fri Apr 23 13:53:14 EDT 2021


On Thu, Apr 22, 2021 at 8:53 PM Andy AO <zen96285 at gmail.com> wrote:

> Upgrading from Python 3.6.8 to Python 3.9.0 and executing unit tests
> revealed a significant change in the behavior of re.split().
>
> But looking at the relevant documentation (Changelog
> <https://docs.python.org/3/whatsnew/changelog.html> and re - Regular
> expression operations - Python 3.9.4 documentation
> <https://docs.python.org/3/library/re.html?highlight=re%20search#re.split>),
> I found no mention of the change.
>
> number = '123'
>
> def test_Asterisk_quantifier_with_capture_group(self):
>     resultList = re.split(r'(\d*)', self.number)
>     if platform.python_version() == '3.6.8':
>         self.assertEqual(resultList, ['', '123', ''])
>     else:
>         self.assertEqual(resultList, ['', '123', '', '', ''])
>
> I feel that this is clearly not in line with the description of the
> function in the split documentation, and it is also strange that after
> replacing * with +, the behavior is still the same as in 3.6.8.
>
>    1. Why is this change not in the documentation? Is it because I didn't
>    find it?
>    2. Why did the behavior change this way? Was a bug introduced, or was it
>    a bug fix?


Interesting. Make sure to check out the FutureWarning that 3.5 and 3.6 emit:

$ pythons --command 'import re; number = "123"; print(re.split(r"(\d*)", number))'
below cmd output started 2021 Fri Apr 23 10:48:32 AM PDT
/usr/local/cpython-0.9/bin/python (unknown) good skipped
/usr/local/cpython-1.0/bin/python (1.0.1) bad
      File "<string>", line 1
        import re; number = "123"; print(re.split(r"(\d*)", number))
                                                         ^
    SyntaxError: invalid syntax
/usr/local/cpython-1.1/bin/python (1.1) bad
      File "<string>", line 1
        import re; number = "123"; print(re.split(r"(\d*)", number))
                                                         ^
    SyntaxError: invalid syntax
/usr/local/cpython-1.2/bin/python (1.2) bad
      File "<string>", line 1
        import re; number = "123"; print(re.split(r"(\d*)", number))
                                                         ^
    SyntaxError: invalid syntax
/usr/local/cpython-1.3/bin/python (1.3) bad
      File "<string>", line 1
        import re; number = "123"; print(re.split(r"(\d*)", number))
                                                         ^
    SyntaxError: invalid syntax
/usr/local/cpython-1.4/bin/python (1.4) bad
      File "<string>", line 1
        import re; number = "123"; print(re.split(r"(\d*)", number))
                                                         ^
    SyntaxError: invalid syntax
/usr/local/cpython-1.5/bin/python (1.5.2) good ['', '123', '']
/usr/local/cpython-1.6/bin/python (1.6.1) good ['', '123', '']
/usr/local/cpython-2.0/bin/python (2.0.1) good ['', '123', '']
/usr/local/cpython-2.1/bin/python (2.1.0) good ['', '123', '']
/usr/local/cpython-2.2/bin/python (2.2.0) good ['', '123', '']
/usr/local/cpython-2.3/bin/python (2.3.0) good ['', '123', '']
/usr/local/cpython-2.4/bin/python (2.4.0) good ['', '123', '']
/usr/local/cpython-2.5/bin/python (2.5.6) good ['', '123', '']
/usr/local/cpython-2.6/bin/python (2.6.9) good ['', '123', '']
/usr/local/cpython-2.7/bin/python (2.7.16) good ['', '123', '']
/usr/local/cpython-3.0/bin/python (3.0.1) good ['', '123', '']
/usr/local/cpython-3.1/bin/python (3.1.5) good ['', '123', '']
/usr/local/cpython-3.2/bin/python (3.2.5) good ['', '123', '']
/usr/local/cpython-3.3/bin/python (3.3.7) good ['', '123', '']
/usr/local/cpython-3.4/bin/python (3.4.8) good ['', '123', '']
/usr/local/cpython-3.5/bin/python (3.5.5) good
    ['', '123', '']
    /usr/local/cpython-3.5/lib/python3.5/re.py:203: FutureWarning: split() requires a non-empty pattern match.
      return _compile(pattern, flags).split(string, maxsplit)
/usr/local/cpython-3.6/bin/python (3.6.0) good
    ['', '123', '']
    /usr/local/cpython-3.6/lib/python3.6/re.py:212: FutureWarning: split() requires a non-empty pattern match.
      return _compile(pattern, flags).split(string, maxsplit)
/usr/local/cpython-3.7/bin/python (3.7.0) good ['', '123', '', '', '']
/usr/local/cpython-3.8/bin/python (3.8.0) good ['', '123', '', '', '']
/usr/local/cpython-3.9/bin/python (3.9.0) good ['', '123', '', '', '']
/usr/local/cpython-3.10/bin/python (3.10.0a6) good ['', '123', '', '', '']
/usr/local/jython-2.7/bin/jython (2.7.0) good
    WARNING: An illegal reflective access operation has occurred
    WARNING: Illegal reflective access by jnr.posix.JavaLibCHelper (file:/usr/local/jython-2.7/jython.jar) to method sun.nio.ch.SelChImpl.getFD()
    WARNING: Please consider reporting this to the maintainers of jnr.posix.JavaLibCHelper
    WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
    WARNING: All illegal access operations will be denied in a future release
    ['', '123', '']
/usr/local/pypy-5.3.1/bin/pypy (2.7.10) good ['', '123', '']
/usr/local/pypy-5.9.0/bin/pypy (2.7.13) good ['', '123', '']
/usr/local/pypy-5.10.0/bin/pypy (2.7.13) good ['', '123', '']
/usr/local/pypy-6.0.0/bin/pypy (2.7.13) good ['', '123', '']
/usr/local/pypy-7.0.0/bin/pypy (2.7.13) good ['', '123', '']
/usr/local/pypy-7.3.0/bin/pypy (2.7.13) good ['', '123', '']
/usr/local/pypy3-5.5.0/bin/pypy3 (3.3.5) good ['', '123', '']
/usr/local/pypy3-5.8.0-with-lzma-fixes/bin/pypy3 (3.5.3) good
    ['', '123', '']
    /usr/local/pypy3-5.8.0-with-lzma-fixes/lib-python/3/re.py:203: FutureWarning: split() requires a non-empty pattern match.
      return _compile(pattern, flags).split(string, maxsplit)
/usr/local/pypy3-5.8.0/bin/pypy3 (3.5.3) good
    ['', '123', '']
    /usr/local/pypy3-5.8.0/lib-python/3/re.py:203: FutureWarning: split() requires a non-empty pattern match.
      return _compile(pattern, flags).split(string, maxsplit)
/usr/local/pypy3-5.9.0/bin/pypy3 (3.5.3) good
    ['', '123', '']
    /usr/local/pypy3-5.9.0/lib-python/3/re.py:203: FutureWarning: split() requires a non-empty pattern match.
      return _compile(pattern, flags).split(string, maxsplit)
/usr/local/pypy3-5.10.0/bin/pypy3 (3.5.3) good
    ['', '123', '']
    /usr/local/pypy3-5.10.0/lib-python/3/re.py:203: FutureWarning: split() requires a non-empty pattern match.
      return _compile(pattern, flags).split(string, maxsplit)
/usr/local/pypy3-6.0.0/bin/pypy3 (3.5.3) good
    ['', '123', '']
    /usr/local/pypy3-6.0.0/lib-python/3/re.py:203: FutureWarning: split() requires a non-empty pattern match.
      return _compile(pattern, flags).split(string, maxsplit)
/usr/local/pypy3-7.0.0/bin/pypy3 (3.5.3) good
    ['', '123', '']
    /usr/local/pypy3-7.0.0/lib-python/3/re.py:203: FutureWarning: split() requires a non-empty pattern match.
      return _compile(pattern, flags).split(string, maxsplit)
/usr/local/pypy3-7.2.0/bin/pypy3 (3.6.9) good
    ['', '123', '']
    /usr/local/pypy3-7.2.0/lib-python/3/re.py:212: FutureWarning: split() requires a non-empty pattern match.
      return _compile(pattern, flags).split(string, maxsplit)
/usr/local/pypy3-7.3.0/bin/pypy3 (3.6.9) good
    ['', '123', '']
    /usr/local/pypy3-7.3.0/lib-python/3/re.py:212: FutureWarning: split() requires a non-empty pattern match.
      return _compile(pattern, flags).split(string, maxsplit)
/usr/local/pypy3-7.3.3/bin/pypy3 (3.7.9) good
    ['', '123', '']
    /usr/local/pypy3-7.3.3/lib-python/3/re.py:215: FutureWarning: split() requires a non-empty pattern match.
      return _compile(pattern, flags).split(string, maxsplit)
/usr/local/micropython-1.11/bin/micropython (3.4.0) good ['', '123', '']
/usr/local/micropython-1.12/bin/micropython (3.4.0) good ['', '123', '']
/usr/local/micropython-git-2017-06-16/bin/micropython (3.4.0) good ['', '123', '']
/usr/local/micropython-git-2018-06-06/bin/micropython (3.4.0) good ['', '123', '']
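
For what it's worth, the jump between CPython 3.6 and 3.7 in the output above seems to line up with the "Changed in version 3.7" note in the re.split() documentation, which says that support for splitting on a pattern that could match an empty string was added. With (\d*), once '123' has been consumed the pattern can still match the empty string at the end of the input, and that extra split accounts for the two additional '' entries. With (\d+) no empty match is possible, so the result stays the same. Below is a minimal sketch of that difference. The names star, plus and expected_star are only for illustration, and the version check is my assumption about a more robust way to gate the test than comparing against the exact string '3.6.8':

```python
import re
import sys

number = "123"

# '*' lets the group match the empty string; '+' requires at least one digit.
star = re.split(r"(\d*)", number)
plus = re.split(r"(\d+)", number)

# Gate on the interpreter version rather than on one exact release string.
# Assumption: 3.7 is the first release that splits on empty matches,
# which is what the pasted output above suggests.
if sys.version_info >= (3, 7):
    expected_star = ['', '123', '', '', '']
else:
    expected_star = ['', '123', '']

print(star)
print(plus)

assert star == expected_star
assert plus == ['', '123', '']
```

If the old result is what the code really wants, switching the pattern to (\d+) looks like the simplest way to get the same answer on both sides of the change.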

