how to match u'\uff00' - u'\uff0f' in re module?

John Machin sjmachin at lexicon.net
Mon Jul 10 23:25:21 EDT 2006


On 11/07/2006 12:32 PM, yichao.zhang wrote:
> I'm trying to match the characters from u'\uff00' to u'\uff0f'.
> I tried the code below and got a TypeError.
> p = re.compile(u'\uff00'-u'\uff0f')

That is not a valid regex. It is an attempt to subtract one unicode char 
from another, but this is (correctly) not supported, as the error 
message says. re.compile expects a string (8-bit or Unicode).

If you wanted to match ASCII characters from 'A' to 'Z', you wouldn't 
put re.compile('A'-'Z'), would you? Well I hope not, I hope you would 
use re.compile('[A-Z]') -- does that give you a clue?
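
Spelled out for the range in question, the same character-class idea 
looks roughly like the sketch below. The names pattern, sample and 
matches are made up for illustration, and the sample string is an 
assumption, not something from the original question.

```python
import re

# Character class covering U+FF00 through U+FF0F, written the same
# way as '[A-Z]': one string, with the range inside square brackets.
pattern = re.compile(u'[\uff00-\uff0f]')

# Hypothetical sample text containing two characters from the range.
sample = u'abc\uff01def\uff0f'

# findall returns every non-overlapping match, in order of appearance.
matches = pattern.findall(sample)
```

The hyphen here is part of the regex string, so Python never sees it as 
a subtraction operator.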

> Traceback (most recent call last):
>   File "<interactive input>", line 1, in ?
> TypeError: unsupported operand type(s) for -: 'unicode' and 'unicode'
> 
> 
> so re module does NOT support this operation

Incorrect conclusion. The expression that you attempted to supply cannot 
be evaluated, so it never became an argument for *any* function. The 
TypeError was raised before re.compile was called. re.compile is innocent.
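
A small sketch of that point: Python evaluates the argument expression 
first, so the function is never entered. The function spy and the names 
called and message are hypothetical stand-ins; spy takes the place of 
re.compile only so that a call could be observed. On Python 3 the 
message names 'str' rather than 'unicode'.

```python
called = []  # records every argument that actually reaches spy()

def spy(arg):
    # Hypothetical stand-in for re.compile; only notes that it ran.
    called.append(arg)
    return arg

message = None
try:
    # The subtraction is evaluated before spy() is called, and it fails.
    spy(u'\uff00' - u'\uff0f')
except TypeError as exc:
    message = str(exc)

# called is still empty here: the error came from the '-' operator.
```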

> however, is there any alternative way to solve my problem?
> 
> Any comments/suggestions much appreciated!

1. Read the fantastic manual.
2. Learn to understand error messages.
3. Assume the most plausible cause (you stuffed up, not the people who 
worked on the re module).

HTH,
John


