[Python-Dev] Misc re.match() complaint

Nick Coghlan ncoghlan at gmail.com
Tue Jul 16 08:20:38 CEST 2013


On 16 July 2013 14:53, Guido van Rossum <guido at python.org> wrote:
> Hm. I'd still like to change this, but I understand it's debatable...
> Is the group() method written in C or Python? If it's in C it should
> be simple enough to let it just do a little bit of pointer math and
> construct a bytes object from the given area of memory -- after all,
> it must have a pointer to that memory area in order to do the matching
> in the first place (although I realize the code may be separated by a
> gulf of abstraction :-).

It shouldn't be too bad - I tracked it down through sre_compile, and
everything seems to funnel into match_getslice_by_index [1], so it
should be possible to detect inputs that are neither bytes nor str
there and coerce the result to bytes.
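
As a rough illustration of that idea, here is a pure-Python model of the
coercion. It is only a sketch: the real logic lives in C in _sre.c, and
the function name slice_for_group is hypothetical, invented here for the
example.

```python
def slice_for_group(subject, start, end):
    """Hypothetical Python model of the proposed coercion (not the C code).

    str and bytes subjects are sliced as usual; any other object that
    supports the buffer protocol has the matched region copied into a
    new bytes object, so the result holds no reference to the subject.
    """
    if isinstance(subject, (str, bytes)):
        return subject[start:end]
    # bytearray, memoryview, mmap, ...: copy the region out as bytes.
    return bytes(memoryview(subject)[start:end])


text_slice = slice_for_group("hello", 1, 3)
buffer_slice = slice_for_group(bytearray(b"hello"), 1, 3)
```

In this model the slice of a bytearray comes back as an independent
bytes object rather than as another bytearray or a view.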

OTOH, you can already get the same effect by explicitly wrapping the
input in a memoryview before passing it to re, then converting the
output to bytes to release the reference to the underlying data.
Doing that doesn't raise ugly backwards compatibility concerns...
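
A minimal sketch of that workaround, assuming a bytes pattern and a
bytes-like input. The helper name match_group_as_bytes is hypothetical.
Depending on the Python version, group() may return a slice of the input
type or a bytes object; calling bytes() on the result gives an
independent bytes object in either case.

```python
import re


def match_group_as_bytes(pattern, data, group=0):
    """Match a bytes pattern against bytes-like data; return the group as bytes.

    Hypothetical helper for illustration. The input is wrapped in a
    memoryview so no copy is made before matching, and the result is
    copied into a bytes object so it does not keep the buffer alive.
    """
    view = memoryview(data)
    m = re.match(pattern, view)
    if m is None:
        return None
    return bytes(m.group(group))


buf = bytearray(b"HEADER:payload")
result = match_group_as_bytes(rb"(\w+):", buf, 1)
```

Only the matched region is copied, and the caller ends up with plain
bytes regardless of the type of the original buffer.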

Cheers,
Nick.

[1] http://hg.python.org/cpython/file/daf9ea42b610/Modules/_sre.c#l3198

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

