Using regexes versus "in" membership test?

Chris Angelico rosuav at gmail.com
Thu Dec 13 01:19:57 EST 2012


On Thu, Dec 13, 2012 at 5:10 PM, Victor Hooi <victorhooi at gmail.com> wrote:
> Are there any other general pointers you might give for that regex? The lines I'm trying to match look something like this:
>
>     07:40:05.793627975 [Info  ] [SOME_MODULE] [SOME_FUNCTION] [SOME_OTHER_FLAG] [RequestTag=0 ErrorCode=3 ErrorText="some error message" ID=0:0x0000000000000000 Foo=1 Bar=5 Joe=5]
>
> Essentially, I'd want to strip out the timestamp, logging-level, module, function etc - and possibly the tag-value pairs?

If possible, can you do a simple test to find out whether or not you
want a line, and only then do the more complex parsing to get the info
you want out of it? For instance, perhaps the presence of the word
"ErrorCode" is all you need to check. It wouldn't hurt if a few percent
of lines are false positives that get discarded during the parse phase;
a single string-in-string check will still be quicker than a complex
regex for figuring out whether you need to process the line at all.
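
Something along these lines is what I have in mind. It is only a rough
sketch against the one sample line quoted above: the patterns, the group
names (time, level, module, function, flag, pairs) and the function name
parse_error_line are all made up for illustration, so adjust them to
whatever your real log format looks like.

```python
import re

# Hypothetical pattern for the sample line quoted above; the group names
# are illustrative assumptions, not part of any real log specification.
LINE_RE = re.compile(
    r'(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+'
    r'\[(?P<level>[^\]]*)\]\s+'
    r'\[(?P<module>[^\]]*)\]\s+'
    r'\[(?P<function>[^\]]*)\]\s+'
    r'\[(?P<flag>[^\]]*)\]\s+'
    r'\[(?P<pairs>.*)\]\s*$'
)

# A tag=value pair; the value is either a double-quoted string or a run
# of non-whitespace characters.
PAIR_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_error_line(line):
    """Return a dict of fields for an error line, or None if not wanted."""
    # Phase 1: cheap substring test that rejects most lines outright.
    if "ErrorCode" not in line:
        return None

    # Phase 2: full parse, run only on the lines that passed phase 1.
    match = LINE_RE.match(line)
    if match is None:
        return None  # false positive: line is not in the expected layout

    fields = match.groupdict()
    pairs = {key: value.strip('"')
             for key, value in PAIR_RE.findall(fields.pop("pairs"))}
    if "ErrorCode" not in pairs:
        return None  # false positive: the word appeared only inside a value

    fields["level"] = fields["level"].strip()
    fields["pairs"] = pairs
    return fields
```

Call parse_error_line(line) for each line of the file and skip the None
results. Whether this actually beats running the regex on every line
depends on how many lines pass the first check, so time it on your own
data.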

ChrisA


