Parsing log in SQL DB to change IPs to hostnames

Steve Holden steve at holdenweb.com
Tue Apr 10 15:47:52 EDT 2007


KDawg44 wrote:
> On Apr 10, 11:11 am, "Kushal Kumaran" <kushal.kuma... at gmail.com>
> wrote:
>> On Apr 10, 8:37 pm, "KDawg44" <KDaw... at gmail.com> wrote:
>>
>>
>>
>>> Hi,
>>> I am brand new to Python.  In learning anything, I find it useful to
>>> actually try to write a useful program to try to tackle an actual
>>> problem.
>>> I have a syslog server and I would like to parse the syslog messages
>>> and try to change any ips to resolved hostnames.  Unfortunately, I am
>>> not getting any matches on my regular expression.
>>> A message will look something like this:
>>>  Apr 10 2007 00:30:58 DEVICE : %DEVICEINFO: 1.1.1.1 Accessed URL
>>> 10.10.10.10:/folder/folder/page.html
>>> I would like to change the message to have the hostnames, or even
>>> better actually, have it appear as hostname-ip address.  So a changed
>>> message would look like:
>>>  Apr 10 2007 00:30:58 DEVICE : %DEVICEINFO: pcname-1.1.1.1 Accessed
>>> URL www.asite.com-10.10.10.10:/folder/folder/page.html
>>> or some equivalent.
>>> Here is what i have so far.  Please be kind as it is my first python
>>> program.... :)
>>> #! /usr/bin/python
>>> import socket
>>> import re
>>> import string
>>> import MySQLdb
>>> ipRegExC = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"
>>> ipRegEx = re.compile(ipRegExC)
>>> try:
>>>         conn = MySQLdb.connect(host="REMOVED", user="REMOVED",
>>> passwd="REMOVED", db="REMOVED")
>>> except MySQLdb.Error, e:
>>>         print "Error connecting to the database: %d - %s " % (e.args[0], e.args[1])
>>>         sys.exit(1)
>>> cursor = conn.cursor()
>>> cursor.execute("SELECT msg, seq FROM REMOVED WHERE seq = 507702")
>>> # one specific message so that it doesn't parse the whole DB during testing...
>>> while(1):
>>>         row = cursor.fetchone()
>>>         if row == None:
>>>                 break
>>>         if ipRegEx.match(row[0]):
>>>             ....
>>> <snipped rest of the code>
>> See the documentation of the re module for the difference between
>> matching and searching.
>>
>> --
>> Kushal
> 
> Thank you very much.  I think I have it figured out, except for an
> error on the SQL statement:
> 
> 
> [----- BEGIN ERROR ---]
> Traceback (most recent call last):
>   File "changeLogs.py", line 47, in ?
>     cursor.execute("""UPDATE logs SET msg = %s WHERE seq = %i""",
> (newMsg,seqNum))
>   File "/usr/lib/python2.4/site-packages/MySQLdb/cursors.py", line
> 148, in execute
>     query = query % db.literal(args)
> TypeError: int argument required
> [----- END ERROR ---]
> 
> Here is my code
> 
> [----- BEGIN CODE ---]
> #! /usr/bin/python
> 
> import socket
> import sys
> import re
> import string
> import MySQLdb
> 
> def resolveHost(ipAdds):
>         ipDict = {}
>         for ips in ipAdds:
>                 try:
>                         ipDict[ips] = socket.gethostbyaddr(ips)[0]
>                 except:
>                         ipDict[ips] = "Cannot resolve"
>         return ipDict
> 
> 
> ipRegExC = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"
> ipRegEx = re.compile(ipRegExC)
> 
> try:
>         conn = MySQLdb.connect(host="REMOVED", user="REMOVED",
> passwd="REMOVED", db="REMOVED")
> 
> except MySQLdb.Error, e:
>         print "Error connecting to the database: %d - %s " % (e.args[0], e.args[1])
>         sys.exit(1)
> 
> cursor = conn.cursor()
> cursor.execute("SELECT msg, seq FROM `logs` WHERE seq = 507702")
> while(1):
>         row = cursor.fetchone()
>         ipAddresses = []
>         resolvedDict = {}
>         if row == None:
>                 break
>         if ipRegEx.search(row[0]):
>                 seqNum = row[1]
>                 ipAddresses = ipRegEx.findall(row[0])
>                 resolvedDict = resolveHost(ipAddresses)
>                 newMsg = row[0]
>                 for ip in resolvedDict.keys():
>                         newMsg = newMsg.replace(ip,ip + "-" +
> resolvedDict[ip])
>                 cursor.execute("""UPDATE REMOVED SET msg = %s WHERE
> seq = %i""", (newMsg,seqNum))
> 
> 
> [----- END CODE ---]
> 
> Thanks again!
> 
> 
Since the source line that the traceback complains about doesn't appear
in the quoted code (the traceback shows "UPDATE logs", the code shows
"UPDATE REMOVED"), it's difficult to know what's going wrong. I'd hazard
a guess that you have a string in seqNum instead of an integer message
number (in which case try using int(seqNum) instead).
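
One more thing worth ruling out: the traceback shows the failure on
"query = query % db.literal(args)", and MySQLdb converts every parameter
to an SQL literal string before that interpolation happens. If that is
what's biting you, %i can never work there, whatever type seqNum has,
and the placeholder should be %s for every value. The sketch below only
imitates that step so it runs without a database; to_literal() and
render() are made-up stand-ins, not MySQLdb functions, and the quoting
is deliberately simplified.

```python
def to_literal(value):
    """Hypothetical, simplified stand-in for the driver's literal conversion."""
    if isinstance(value, str):
        # Simplified quoting, for illustration only.
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def render(query, args):
    """Imitate `query % db.literal(args)` from the traceback."""
    return query % tuple(to_literal(a) for a in args)


new_msg = "pcname-1.1.1.1 Accessed URL"
seq_num = 507702  # a real int, yet %i still fails below

# By the time % is applied, seq_num has become the string "507702",
# so the %i conversion raises TypeError.
try:
    render("UPDATE logs SET msg = %s WHERE seq = %i", (new_msg, seq_num))
except TypeError as exc:
    error = str(exc)
else:
    error = None

# With %s for every placeholder the interpolation succeeds.
sql = render("UPDATE logs SET msg = %s WHERE seq = %s", (new_msg, seq_num))

print(error)
print(sql)
```

In the real program that would mean writing "WHERE seq = %s" in the
UPDATE and still passing (newMsg, seqNum) as the second argument to
cursor.execute().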

Otherwise show us the real code, not the version you modified to try to
make it work, and we might be able to help more ;-)
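
On the earlier regex question, the match/search difference Kushal
pointed to is easy to see with the sample message from the first post.
The sketch also rewrites each address as hostname-ip in one pass with
re.sub() and a callback, which is one way to do it rather than the way.
annotate() and the fake_dns table are invented for illustration; in the
real script the lookup would presumably be socket.gethostbyaddr(ip)[0]
wrapped in a try/except.

```python
import re

IP_RE = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")

msg = ("Apr 10 2007 00:30:58 DEVICE : %DEVICEINFO: 1.1.1.1 Accessed URL "
       "10.10.10.10:/folder/folder/page.html")

# match() only tries at position 0, where the message starts with "Apr".
at_start = IP_RE.match(msg)
# search() scans the whole string and stops at the first address.
first = IP_RE.search(msg).group(0)


def annotate(message, resolve):
    """Rewrite each IP as hostname-ip; resolve(ip) returns a name or None."""
    def repl(m):
        ip = m.group(0)
        name = resolve(ip)
        # Leave the address untouched when no name is known.
        return "%s-%s" % (name, ip) if name else ip
    return IP_RE.sub(repl, message)


# Hypothetical lookup table standing in for a reverse DNS query.
fake_dns = {"1.1.1.1": "pcname", "10.10.10.10": "www.asite.com"}
result = annotate(msg, fake_dns.get)

print(at_start)
print(first)
print(result)
```

Because the substitution works on whole regex matches, an address such
as 1.1.1.1 is not also rewritten inside a longer one like 11.1.1.10,
which a chain of str.replace() calls could do.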

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb     http://del.icio.us/steve.holden
Recent Ramblings       http://holdenweb.blogspot.com



