Re module help?

Tim Peters tim_one at email.msn.com
Tue Jan 4 03:09:38 EST 2000


[Elf Sternberg]
> There is a line in 'xmllib.py' that has me puzzled.  I
> thought I understood regular expressions and XML, but this
> one has me confused.
>
> interesting = re.compile('[]&<]')
>
> I don't get it.  What is this looking for?

One of the three characters
    ] & <

> The '[]' can't mean an empty set, can it?

Right, the way to write a character class that can't match anything is

    [^\000-\377]

<arghggh!>.
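
A quick check of that never-matching class, as a sketch only.  It assumes a
current Python 3, where the 8-bit strings of this post correspond to bytes
patterns; the names `never`, `all_bytes`, `matches` and `unicode_hit` are made
up for the illustration.  On a str pattern the same class can still match any
character above U+00FF, so the claim really is about 8-bit data.

```python
import re

# Hypothetical name; a bytes pattern, so \000-\377 covers every possible value.
never = re.compile(rb'[^\000-\377]')

# Try every single-byte string: none of them should match.
all_bytes = [bytes([i]) for i in range(256)]
matches = [b for b in all_bytes if never.search(b)]

# Assumption: Python 3 str patterns are Unicode, so the same class there
# still matches code points above U+00FF, such as the euro sign.
unicode_hit = re.search(r'[^\000-\377]', '\u20ac')
```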

> That doesn't make any sense.  Is this a special case
> where '[]...' allows you to include the ']' in things
> to search for without escaping it?

Yes:  a "]" that comes first in a character class is taken literally instead
of closing the class, and that's common in regexp pkgs.  I think the line
would be much clearer as

    interesting = re.compile(r'[\]&<]')
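
To see that the two spellings behave the same, here is a small sketch.  The
names `bare`, `escaped` and `sample` are invented for the example; only the
two patterns come from the thread.

```python
import re

# The spelling found in xmllib.py: "]" comes first, so it is a literal.
bare = re.compile('[]&<]')

# The escaped spelling suggested above.
escaped = re.compile(r'[\]&<]')

# Hypothetical test string containing each of the three characters once.
sample = 'a]b&c<d>e'

bare_hits = bare.findall(sample)
escaped_hits = escaped.findall(sample)
```

Both lists should come out identical, holding just the "]", "&" and "<" from
the sample; the ">" is not in the class and is skipped.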

There are similarly revolting special cases involving "-" in character sets,
which also ascribe meaning to what *looks* like an error.  That's a very
un-Pythonic thing to do, but Python's re package is intended to be
compatible with Perl's.
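
The "-" cases can be sketched the same way.  This assumes the usual rule in
Python's re module that a "-" placed first or last in a set is a literal
hyphen, while one between two characters forms a range; the variable names
are made up for the illustration.

```python
import re

# "-" first or last in the set: a literal hyphen, not a range.
leading = re.compile('[-ab]')
trailing = re.compile('[ab-]')

# "-" between two characters: the range a through c.
ranged = re.compile('[a-c]')

# Escaping the hyphen says what is meant without relying on position.
explicit = re.compile(r'[a\-c]')

sample = 'a-b-c'
leading_hits = leading.findall(sample)
trailing_hits = trailing.findall(sample)
ranged_hits = ranged.findall(sample)
explicit_hits = explicit.findall(sample)
```

As with "]", the escaped form is the one that doesn't depend on where the
character happens to sit in the set.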

> Or is it, in fact, a bug?

Perl?  Yes <wink>.

non-judgmentally y'rs  - tim