regular expression: perl ==> python

Nick Craig-Wood nick at craig-wood.com
Fri Dec 24 01:46:35 EST 2004


Fredrik Lundh <fredrik at pythonware.com> wrote:
>  the undocumented sre.Scanner provides a ready-made mechanism for this
>  kind of RE matching; see
> 
>      http://aspn.activestate.com/ASPN/Mail/Message/python-dev/1614344
> 
>  for some discussion.
> 
>  here's (a slight variation of) the code example they're talking about:
> 
>      def s_ident(scanner, token): return token
>      def s_operator(scanner, token): return "op%s" % token
>      def s_float(scanner, token): return float(token)
>      def s_int(scanner, token): return int(token)
> 
>      scanner = sre.Scanner([
>          (r"[a-zA-Z_]\w*", s_ident),
>          (r"\d+\.\d*", s_float),
>          (r"\d+", s_int),
>          (r"=|\+|-|\*|/", s_operator),
>          (r"\s+", None),
>          ])
> 
>      >>> print scanner.scan("sum = 3*foo + 312.50 + bar")
>      (['sum', 'op=', 3, 'op*', 'foo', 'op+', 312.5, 'op+', 'bar'],
>      '')

That is very cool - exactly the kind of problem I come across quite
often!
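
For anyone trying this on a later Python, where sre has been folded into
re, the same example can be sketched roughly as below. This assumes the
still-undocumented re.Scanner class behaves the way sre.Scanner did; the
names tokens and remainder are mine, chosen for illustration.

```python
import re

# Callbacks receive the scanner and the matched text, and return the token value.
def s_ident(scanner, token): return token
def s_operator(scanner, token): return "op%s" % token
def s_float(scanner, token): return float(token)
def s_int(scanner, token): return int(token)

# Assumption: re.Scanner is undocumented and may change between releases.
scanner = re.Scanner([
    (r"[a-zA-Z_]\w*", s_ident),
    (r"\d+\.\d*", s_float),
    (r"\d+", s_int),
    (r"=|\+|-|\*|/", s_operator),
    (r"\s+", None),  # None means: match and discard (whitespace)
])

# scan() returns the list of tokens plus whatever text could not be matched.
tokens, remainder = scanner.scan("sum = 3*foo + 312.50 + bar")
print(tokens, repr(remainder))
```

The order of the rules matters: the float pattern is listed before the int
pattern so that "312.50" is not split at the decimal point.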

I've found the online documentation (using pydoc) for re / sre in
general to be a bit lacking.

For instance, nowhere in

  pydoc sre

does it tell you what methods a match object has (or even what type it
is).  To find this out you have to look at the HTML documentation.
That is probably what Windows people look at by default, but Unix
hackers like me expect everything (or at least a hint) to be in the
man/pydoc pages.
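
Until the pydoc pages catch up, introspection from the interpreter gives at
least that hint. A minimal sketch, assuming a Python where the module is
spelled re; the pattern and the variable names are illustrative only.

```python
import re

# Match a sample string so there is a match object to inspect.
m = re.match(r"(\d+)\.(\d*)", "312.50")

# The concrete type of the match object (its name varies between versions).
match_type = type(m)

# Public attributes and methods of the match object, such as group and span.
public_names = [name for name in dir(m) if not name.startswith("_")]

print(match_type)
print(public_names)
# help(match_type) shows whatever docstrings the type itself carries.
```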

I just noticed a new section in pydoc2.4:

MODULE DOCS
    http://www.python.org/doc/current/lib/module-sre.html

which is at least a hint that you are looking in the wrong place!
...however, that page doesn't exist ;-)

-- 
Nick Craig-Wood <nick at craig-wood.com> -- http://www.craig-wood.com/nick


