regular expression: perl ==> python
Nick Craig-Wood
nick at craig-wood.com
Fri Dec 24 01:46:35 EST 2004
Fredrik Lundh <fredrik at pythonware.com> wrote:
> the undocumented sre.Scanner provides a ready-made mechanism for this
> kind of RE matching; see
>
> http://aspn.activestate.com/ASPN/Mail/Message/python-dev/1614344
>
> for some discussion.
>
> here's (a slight variation of) the code example they're talking about:
>
> def s_ident(scanner, token): return token
> def s_operator(scanner, token): return "op%s" % token
> def s_float(scanner, token): return float(token)
> def s_int(scanner, token): return int(token)
>
> scanner = sre.Scanner([
>     (r"[a-zA-Z_]\w*", s_ident),
>     (r"\d+\.\d*", s_float),
>     (r"\d+", s_int),
>     (r"=|\+|-|\*|/", s_operator),
>     (r"\s+", None),
>     ])
>
> >>> print scanner.scan("sum = 3*foo + 312.50 + bar")
> (['sum', 'op=', 3, 'op*', 'foo', 'op+', 312.5, 'op+', 'bar'],
> '')
That is very cool - exactly the kind of problem I come across quite
often!
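
For anyone trying this on a current Python, here is a minimal sketch of the
same scanner. It assumes the class is reachable as re.Scanner (it is still
undocumented, so its presence and behaviour are not guaranteed) and uses
print() as a function. The rule list is the one quoted above; only the
names tokens and remainder are my own.

```python
import re

# Callbacks receive the scanner object and the matched text, and
# return the token value to collect.
def s_ident(scanner, token):
    return token

def s_operator(scanner, token):
    return "op%s" % token

def s_float(scanner, token):
    return float(token)

def s_int(scanner, token):
    return int(token)

# Rules are tried in order, so the float rule must come before the
# int rule. A callback of None means "match and discard".
scanner = re.Scanner([
    (r"[a-zA-Z_]\w*", s_ident),
    (r"\d+\.\d*", s_float),
    (r"\d+", s_int),
    (r"=|\+|-|\*|/", s_operator),
    (r"\s+", None),
])

# scan() returns the list of tokens plus whatever input was left
# unmatched (an empty string if everything was consumed).
tokens, remainder = scanner.scan("sum = 3*foo + 312.50 + bar")
print(tokens)
print(repr(remainder))
```

A non-empty remainder is the place to look when the input contains
something none of the rules match.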
I've found the online documentation (via pydoc) for re / sre to be a
bit lacking in general.

For instance, nowhere in

    pydoc sre

does it tell you what methods a match object has (or even what type it
is). To find this out you have to look at the HTML documentation.

That is probably what Windows people look at by default, but Unix
hackers like me expect everything (or at least a hint) to be in the
man/pydoc pages.
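
In the meantime, one workaround is to ask a match object about itself from
the interpreter. This is only a sketch: the pattern and the names m and
public are made up for illustration, and the type name printed differs
between Python versions.

```python
import re

# Any successful match will do; this pattern is just an example.
m = re.match(r"(\d+)\.(\d*)", "312.50")

# The concrete type of the match object (name varies by version).
print(type(m).__name__)

# Public attributes and methods, i.e. everything without a leading
# underscore.
public = [name for name in dir(m) if not name.startswith("_")]
print(public)

# A couple of the methods in action.
print(m.group(0), m.groups(), m.span())
```

Calling help(m) on such an object prints whatever docstrings the type
carries, which is the closest thing to a man page for it.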

I just noticed a new section in the pydoc2.4 output:

    MODULE DOCS
        http://www.python.org/doc/current/lib/module-sre.html

which is at least a hint that you are looking in the wrong place!

...however, that page doesn't exist ;-)

--
Nick Craig-Wood <nick at craig-wood.com> -- http://www.craig-wood.com/nick