scanf style parsing

Bruce Dawson comments at cygnus-software.com
Thu Sep 27 02:34:08 EDT 2001


Wow! Great answers. And incredibly fast.

Thanks to all.

Rather than going the long and complicated PEP route of adding scanf
style functionality to Python, it would probably be enough to add a few
more examples to the Python regexp documentation. I searched the Python
documentation for scanf() and read the regexp documentation; if either
had contained the examples in this reply or the others, I would have done
my parsing properly long ago. For Perl hackers regexps are easy to figure
out, but for us old C/C++ types they're *tough*.
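
To make the suggestion concrete, here is a rough sketch of the kind of
example I have in mind: a tiny helper that turns an sscanf-like format
into a regular expression. The name scan_ints is made up for
illustration, it understands only %d, and unlike real sscanf it does not
skip whitespace before a number. It is written for a current Python, not
the 2.1 interpreter quoted below.

```python
import re

def scan_ints(fmt, text):
    """Hypothetical helper: match text against a format that uses only %d.

    Returns a tuple of ints, or None if the text does not match.
    """
    # Escape each literal piece so '.', '(' and ')' match themselves,
    # then join the pieces with a group for an optionally signed integer.
    pieces = [re.escape(piece) for piece in fmt.split("%d")]
    pattern = r"([-+]?\d+)".join(pieces)
    m = re.match(pattern, text)
    if m is None:
        return None
    return tuple(int(g) for g in m.groups())

errors, warnings = scan_ints("smtpmail.exe - %d error(s), %d warning(s)",
                             "smtpmail.exe - 0 error(s), 0 warning(s)")
```

Escaping the literal text means the format can be written exactly as it
would be for sscanf, without thinking about which characters are special
in a regexp.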

Then again, I *always* say that the documentation is the problem...

Thanks again.

Richard Jones wrote:

> On Wednesday 26 September 2001 15:42, Bruce Dawson wrote:
> > I love programming in Python, but there are some things I have not found
> > the easy way to do.
>
> That's what we're here for :)
>
> > I understand that Python is supposed to be good at
> > text parsing, but I am having trouble with this simple task. Given this
> > text (the output from Visual C++) I want to find out how many errors and
> > warnings there were:
> >
> > smtpmail.exe - 0 error(s), 0 warning(s)
> >
> > In C/C++ that would be something like:
> > sscanf(buffer, "smtpmail.exe - %d error(s), %d warning(s)", &errors,
> > &warnings);
> >
> > It's not that I think the sscanf syntax is particularly elegant, but it
> > sure is compact! I saw the discussion about adding scanf to Python
> >
> > http://mail.python.org/pipermail/python-dev/2001-April/014027.html
> >
> > but I need to know what people do right now when faced with this task.
>
> Right now, people use regular expressions, which are more flexible than
> sscanf, but don't do the type conversions (everything comes out as a string)
> and are a little more verbose, code-wise.
>
> [richard at ike ~]% python
> Python 2.1.1 (#1, Jul 20 2001, 22:37:24)
> [GCC 2.96 20000731 (Mandrake Linux 8.1 2.96-0.58mdk)] on linux-i386
> Type "copyright", "credits" or "license" for more information.
> >>> import re
> >>> scan = re.compile(r'smtpmail\.exe - (\d+) error\(s\), (\d+) warning\(s\)')
> >>> result = scan.match("smtpmail.exe - 0 error(s), 0 warning(s)")
> >>> errors, warnings = map(int, result.groups())
> >>> errors
> 0
> >>> warnings
> 0
> >>>
>
> ... or something similar. The RE can be made more flexible to allow, e.g., the
> absence of the ", %d warning(s)" part:
>
> >>> scan = re.compile(r'foo\.exe - (\d+) error\(s\)(, (\d+) warning\(s\))?')
> >>> result = scan.match("foo.exe - 0 error(s)")
> >>> result.groups()
> ('0', None, None)
> >>> result = scan.match("foo.exe - 0 error(s), 0 warning(s)")
> >>> result.groups()
> ('0', ', 0 warning(s)', '0')
>
> ... and so on...
>
>      Richard
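
One possible refinement of the optional-warnings pattern quoted above,
offered as a sketch and not as Richard's method: making the outer group
non-capturing with (?:...) keeps the ", N warning(s)" text out of
groups(), so only the two numbers come back. The counts function is a
made-up name for illustration, and the code assumes a current Python.

```python
import re

# (?:...) groups without capturing, so groups() holds only the two numbers.
scan = re.compile(r'foo\.exe - (\d+) error\(s\)(?:, (\d+) warning\(s\))?')

def counts(line):
    """Return (errors, warnings) as ints; warnings is None when absent."""
    m = scan.match(line)
    if m is None:
        return None
    errors, warnings = m.groups()
    return int(errors), (int(warnings) if warnings is not None else None)

short_form = counts("foo.exe - 0 error(s)")
long_form = counts("foo.exe - 3 error(s), 2 warning(s)")
```

With this pattern the first line yields a missing warning count and the
second yields both numbers, with the int conversion done in one place.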



