Need help in pulling SQL query out of log file...

alex23 wuwei23 at gmail.com
Tue Oct 14 01:39:54 EDT 2014


On 14/10/2014 11:47 AM, Sagar Deshmukh wrote:
> I have a log file which has lot of information like..SQL query.. number of records read...records loaded etc..
> My requirement is i would like to read the SQL query completly and write it to another txt file..

Generally we encourage people to post what they've tried to the list. It 
helps us identify what you know and what you need help with.

However, given:

 > the log file may not be always same so can not make static choices...

You'll probably want to use regular expressions:

     https://docs.python.org/howto/regex.html

Regexps let you search through the text for known patterns and extract 
any that match. To extract all SQL query sections, you'll need to come 
up with a way of uniquely identifying them from all other sections. 
Looking at your example log file, it looks like they're all of the format:

     SQL Query [<the actual sql query>]

From that we can determine that all SQL queries are prefixed by 'SQL 
Query [' and suffixed by ']', so the content you want is everything 
between those markers. So a possible regular expression might be:

     SQL Query \[(.*?)\]

To quickly explain this:

     1. "SQL Query " matches on that string
     2. Because [] have meaning for regexes, to match on literal 
brackets you need to escape them via \[ and \]
     3. ( ) is a group; whatever it matches is what gets returned
     4. . matches any single character and * repeats that zero or more 
times, so .* grabs any run of text
     5. ? after the * makes the grab "ungreedy" (non-greedy), i.e. it'll 
stop at the first \] it encounters.
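
To see why point 5 matters, here is a small sketch comparing the greedy 
and ungreedy forms. The sample string is made up for illustration and is 
not taken from your log:

```python
import re

# Hypothetical sample line holding two bracketed queries
sample = "SQL Query [SELECT 1] rows read: 5 SQL Query [SELECT 2]"

# Greedy: .* runs to the LAST ']' in the text, swallowing everything between
greedy = re.findall(r"SQL Query \[(.*)\]", sample)

# Ungreedy: .*? stops at the FIRST ']' after each prefix
lazy = re.findall(r"SQL Query \[(.*?)\]", sample)

print(greedy)  # ['SELECT 1] rows read: 5 SQL Query [SELECT 2']
print(lazy)    # ['SELECT 1', 'SELECT 2']
```

Without the ?, two separate queries get merged into one match along with 
whatever sits between them.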

Pulling the queries out of your log file should be as simple as:

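     import re

     # Read the whole log at once so multi-line queries can be matched
     with open('logfile') as f:
         log = f.read()
     # Raw string (r"...") keeps the backslashes intact for the regex
     queries = re.findall(r"SQL Query \[(.*?)\]", log, re.DOTALL)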

Because the queries can fall across multiple lines, the re.DOTALL flag 
is required: by default . matches any character except a newline, and 
re.DOTALL makes it match newlines as well.
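
Since you also want the queries written to another text file, one 
possible sketch follows. The function name extract_queries, the file 
paths you pass in, and the blank-line separator between queries are all 
my own assumptions, so adjust them to suit:

```python
import re

# Everything between "SQL Query [" and the first "]" that follows it
QUERY_RE = re.compile(r"SQL Query \[(.*?)\]", re.DOTALL)

def extract_queries(log_path, out_path):
    """Copy every SQL query found in log_path into out_path; return them."""
    with open(log_path) as src:
        queries = QUERY_RE.findall(src.read())
    with open(out_path, "w") as dst:
        for query in queries:
            # Separate queries with a blank line, as a query may span lines
            dst.write(query.strip() + "\n\n")
    return queries
```

You would call it as extract_queries('logfile', 'queries.txt'). One 
caveat: if a query itself contains a ']' character, the pattern will cut 
it short at that point, so you'd need a more specific end marker in that 
case.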

Hope this helps.


