[Python-Dev] \G (match last position) regex operator non-existent in python?

Nick Coghlan ncoghlan at gmail.com
Sat Oct 28 03:09:33 EDT 2017


On 28 October 2017 at 01:57, Guido van Rossum <guido at python.org> wrote:

> Oh. Yes, that is being discussed about once a year or two. It seems Matthew
> isn't very interested in helping out with the port, and there are some
> concerns about backwards compatibility with the `re` module. I think it
> needs a champion!
>

Matthew's been amenable to the idea when it comes up, and he explicitly
wrote the module to be usable as a drop-in replacement for "re" (hence the
re-compatible v0 behaviour still being the default).
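
As a rough illustration of what "drop-in replacement" means in practice,
here is a minimal sketch (not anything taken from the regex project's own
documentation) that prefers the PyPI regex module when it is installed and
falls back to the standard library otherwise. The names `re_impl` and
`USING_REGEX` are made up for this example, and the pattern deliberately
sticks to syntax that both modules accept:

```python
# Prefer the third-party "regex" package when it is installed; otherwise
# fall back to the standard library's "re" module.
# "re_impl" and "USING_REGEX" are hypothetical names for this sketch.
try:
    import regex as re_impl  # third-party package from PyPI
    USING_REGEX = True
except ImportError:
    import re as re_impl  # standard library fallback
    USING_REGEX = False

# Only syntax and functions common to both modules are used below, so the
# calling code does not need to know which engine was imported.
pattern = re_impl.compile(r"(?P<key>\w+)=(?P<value>\d+)")

match = pattern.search("timeout=30")
pairs = pattern.findall("a=1 b=2")

print(USING_REGEX, match.group("key"), match.group("value"), pairs)
```

Code written this way only benefits from the extra regex features if it
checks which module it actually got, which is part of why the question of
an assumed baseline matters.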

The resistance has come more from our side, since this is a case where
existing regex module users are clearly better off if it remains a separate
project, as that keeps upgrades independent of the relatively slow standard
library release cycle (and allows it to be used on Python 2.7 as well as in
3.x). By contrast, the potential benefits of standard library inclusion
accrue primarily to Python newcomers and folks writing scripts without the
benefit of package management tools, since they'll have a more capable
regex engine available as part of the assumed language baseline.

That means that if we add regex to the standard library in the regular way,
there's a more than fair chance that we'll end up with an outcome like the
json vs simplejson split, where we have one variant in the standard
library, and another variant on PyPI, and the variants may drift apart over
time if their maintenance is being handled by different people. (Note: one
may argue that we already have this split in the form of re vs regex. So if
regex was brought in specifically to replace _sre as the re module
implementation, rather than as a new public API, then we at least wouldn't
be making anything *worse* from a future behavioural consistency
perspective, but we'd be risking a compatibility break for anyone depending
on _sre and other internal implementation details of the re module).

One potential alternative approach that is then brought up (often by me) is
to suggest instead *bundling* the regex module with CPython, without
actually bringing it fully within the regular standard library maintenance
process. The idea there would be to both make the module available by
default in python.org downloads, *and* make it clear to redistributors that
the module is part of the expected baseline of Python functionality, but
otherwise keep it entirely in its current independently upgradable form.

That would still be hard (since it would involve establishing new
maintenance policy precedents that go beyond the current special-casing of
`pip` in order to bootstrap PyPI access), but would have the additional
benefit of paving the way for doing similar things with other modules where
we'd like them to be part of the assumed baseline for end users, but also
have reasons for wanting to avoid tightly coupling them to the standard
library's regular maintenance policy (most notably, requests).

And that's where discussions tend to fizzle out:

* outright replacement of the current re module implementation with a
private copy of the regex module introduces compatibility risks that would
need a fiat decision from you as BDFL to say "Let's do it anyway, make sure
the test suite still works, and then figure out how to cope with any other
consequences as they arise"
* going down the bundling path requires making some explicit community
management decisions around what we actually want the standard library to
*be* (and whether or not there's a difference between "the standard
library" and "the assumed available package set" for Python installations
that are expected to run arbitrary third party scripts rather than specific
applications)
* having both the current re API and implementation *and* a new regex based
API and implementation in the standard library indefinitely seems like it
would be a maintainability nightmare that delivered the worst of all
possible outcomes for everyone involved (CPython maintainers, regex
maintainers, Python end users)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia