sscanf ?
Chris Liechti
cliechti at gmx.net
Tue Nov 20 17:29:13 EST 2001
[posted and mailed]
"Bruce Edge" <bedge at troikanetworks.com> wrote in
news:2SzK7.605$px5.156976 at newsfeed.slurp.net:
>> It wouldn't be hard to write a simple wrapping script that takes a
>> printf format string, converts it to a regular expression, does a
>> match against a string, then pulls out the arguments and converts them
>> to types as appropriate. But one might just as well be using regular
>> expressions directly in the first place.
>>
>
> Here's what I did to convert the format string to a regex.
> It's not pretty, and many types will break it, but it serves the
> purpose, for now:
>
> def fmtstr2regex( str ):
>     regex = ""
>     while len(str):
>         if str[0] == '%':
>             x = re.match("%(?P<len>\d*)(?P<type>\w)(?P<rest>.*)$",str)
this doesn't match "%6.3f" and similar (precision), nor "%-4f" (flags).
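A rough sketch of a pattern that would also accept flags and a precision, written for Python 3. The name SPEC and the groups "flags" and "prec" are made up for this example; "len" and "type" follow the quoted code.

```python
import re

# Hypothetical pattern: optional flags, optional width, optional .precision,
# then a single conversion letter.
SPEC = re.compile(
    r"%(?P<flags>[-+ #0]*)(?P<len>\d*)(?:\.(?P<prec>\d+))?(?P<type>[a-zA-Z])"
)

for fmt in ("%d", "%4x", "%6.3f", "%-4f"):
    m = SPEC.match(fmt)
    # Groups that did not take part in the match come back as None.
    print(fmt, m.group("flags", "len", "prec", "type"))
```

Instead of a "rest" group, m.end() gives the position where scanning of the format string can continue.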
>             if not x:
>                 exc = CommandException()
>                 exc.reason = "Invalid print format specifier %s" % str
>                 raise exc
when you define an exception like this:
    class CommandException(Exception): pass
you can simply write:
    raise CommandException("reason %s" % str)
>             length = int( x.group("len") )
needs an if here, because int("") fails when no width is given:
    if x.group("len"):
        length = int( x.group("len") )
    else:
        length = None
>             type = x.group("type")
>             # Some types need to be changed from printf to regexp world
>             if type == 'x':
>                 type = '['+string.hexdigits+']'
>             else:
>                 type = "\\%s" % type
type 's' should also be transformed to 'w' ('\s' matches whitespace, not strings)
>             regex += "(%s" % type
>             if length:
>                 regex += "{%d,%d})" % ( length, length )
>             else:
>                 regex += "+)"
>             str = x.group('rest')
>         else:
>             regex += "\%s" % str[0]
what's the intent of the '\' here? if you want to escape the character you
should write two backslashes (ok, "\%" gives the same as '\\%', but writing it
explicitly is better, like you did above). but escaping every character
produces wrong regexes here:
>>> print fmtstr2regex("1.st %d")
\1\.\s\t\ (\d+)
'\s', '\t' and '\1' all have a special meaning in regexes. what you really
want is
r'1\.st (\d+)'
only reserved regex characters should be escaped: [](){}.+?*$^\
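The standard library's re.escape() can do the escaping of literal text. A small sketch, written for Python 3; the helper name literal_then_int is made up for this example. Exactly which characters get a backslash depends on the Python version, so the sketch checks what the pattern matches rather than what it looks like.

```python
import re

def literal_then_int(literal):
    # Hypothetical helper: escape the literal part with re.escape(),
    # then append a group that captures a run of digits.
    return re.escape(literal) + r"(\d+)"

pattern = literal_then_int("1.st ")
m = re.match(pattern, "1.st 42")
print(m.group(1))
```

Because the '.' is escaped, the same pattern does not match "1xst 42".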
>             str = str[1:]
>     return regex
>
just a side note: "type" is a builtin function, which you shadow with your
variable (the same goes for "str")
but overall a nice idea..
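Putting the remarks above together, this is roughly how it could look. Take it as a sketch under assumptions, not as the one right way: it is written for Python 3, it only knows the conversions d, x and s, it ignores flags, precision and "%%", and the names SPEC, ATOMS, RESERVED, fmt and conv are mine. The reserved set is the list from above plus '|'.

```python
import re
import string

class CommandException(Exception):
    pass

# Characters that need a backslash when they appear as literal text.
RESERVED = set("[](){}.+?*$^\\|")

# "%" followed by an optional width and one conversion letter.
SPEC = re.compile(r"%(?P<len>\d*)(?P<conv>[a-zA-Z])")

# printf conversion -> regex atom (only a few conversions are covered).
ATOMS = {
    "d": r"\d",
    "x": "[" + string.hexdigits + "]",
    "s": r"\w",
}

def fmtstr2regex(fmt):
    regex = ""
    pos = 0
    while pos < len(fmt):
        ch = fmt[pos]
        if ch == "%":
            m = SPEC.match(fmt, pos)
            if not m or m.group("conv") not in ATOMS:
                raise CommandException(
                    "Invalid print format specifier %s" % fmt[pos:])
            regex += "(" + ATOMS[m.group("conv")]
            if m.group("len"):
                # A width means exactly that many characters.
                regex += "{%d})" % int(m.group("len"))
            else:
                regex += "+)"
            pos = m.end()
        elif ch in RESERVED:
            regex += "\\" + ch
            pos += 1
        else:
            regex += ch
            pos += 1
    return regex

print(fmtstr2regex("1.st %d"))
```

The result can be passed straight to re.match(), and the captured groups are still strings, so converting them with int() and friends is left to the caller.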
chris
--
Chris <cliechti at gmx.net>