[Python-Dev] SRE incompatibility

Guido van Rossum guido@python.org
Fri, 30 Jun 2000 11:07:16 -0500


> my latest changes fixed a couple of things, but broke
> one of the old RE tests, namely:
> 
>     re.match('\\x00ffffffffffffff', '\377') != None
> 
> or in other words, long hexadecimal escapes are cast
> down to 8-bit characters in RE.
> 
> in SRE (after the latest change), they're cast down to
> the size of the engine's internal word size (currently 16
> bits).
> 
> is the old behaviour worth keeping?  I'd rather not make
> the engine dependent on string types; it shouldn't really
> matter if you're using unicode patterns on 8-bit target
> strings, or vice versa.

To someone familiar with '\x00ffffffffffffff' == '\377', the failure
is surprising.  What Would Larry Do?  (I.e. is this in Perl?)

Maybe make it dependent on the type of the searched string ('\377')
rather than on the type of the pattern?
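
For illustration only, here is a minimal sketch of the arithmetic under
discussion.  It does not call either regex engine: it just masks the
escape's numeric value to a given width.  The helper names
cast_hex_escape and width_for_target are hypothetical, and the 8-bit /
16-bit widths are taken from the description quoted above.  (Current
Python reads exactly two hex digits after \x, so the original test
cannot be reproduced directly with re.match today.)

```python
def cast_hex_escape(digits, bits):
    """Keep only the low `bits` bits of a hexadecimal escape's value."""
    return int(digits, 16) & ((1 << bits) - 1)


def width_for_target(target):
    """Hypothetical rule: choose the cast width from the searched string.

    Assumes 8 bits for byte strings and 16 bits for unicode strings.
    """
    return 8 if isinstance(target, bytes) else 16


digits = "00ffffffffffffff"
wanted = 0o377  # the character '\377', i.e. 0xFF

# Old RE behaviour as described: cast down to 8 bits, so the test passes.
print(cast_hex_escape(digits, 8) == wanted)    # True

# SRE after the change: cast down to 16 bits, so the test fails.
print(cast_hex_escape(digits, 16) == wanted)   # False

# Width chosen from an 8-bit target string: the old result comes back.
print(cast_hex_escape(digits, width_for_target(b"\377")) == wanted)  # True
```

With the width tied to the searched string, an 8-bit target would keep
the old result, while a unicode target would get the 16-bit value.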

--Guido van Rossum (home page: http://www.python.org/~guido/)