[Tutor] 2016-02-01 Filter STRINGS in Log File and Pass as VARIABLE within PYTHON script
knnleow GOOGLE
knnleow at gmail.com
Tue Feb 2 07:13:31 EST 2016
hello Cameron,
thank you for the positive input. this is my new code.
NEW CODE
----------------
$ more fail2ban-banned-ipAddress.py
#VERSION CONTROL:
#2016-01-31 - Initial build by Kuenn Leow
# - fail2ban package has to be installed
# - fail2ban relies on Linux iptables to work
#2016-02-02 - modified with recommendation from Cameron Simpson
#
#FIXED MODULE IMPORT and FIXED ARGV IMPORT
import sys
import os
import subprocess
import time
import traceback
myArray = sys.argv

def checkInputs():
    if('-date' not in myArray):
        #print(__doc__)
        print('''
USAGE:    python fail2ban-banned-ipAddress.py -date <YYYY-MM-DD>
EXAMPLE:  python fail2ban-banned-ipAddress.py -date 2016-01-31
        ''')
        sys.exit(1)

def main():
    #START MAIN PROGRAM HERE!!!
    try:
        checkInputs()
        myDate = myArray[myArray.index('-date') + 1]
        timestamp01 = time.strftime("%Y-%m-%d")
        timestamp02 = time.strftime("%Y-%m-%d-%H%M%S")
        wd01 = ("/var/tmp/myKNN/1_mySAMPLEpython-ver-001/" +
                timestamp01)
        wd02 = ("/var/tmp/myKNN/1_mySAMPLEpython-ver-001/" +
                timestamp02)

        #print(" ")
        #print(40 * "-")
        #print("START DEBUG Log of MAIN Defined VARIABLE")
        #print(40 * "-")
        #print("myDate: " + myDate)
        #print(" ")
        #print("timestamp01: " + timestamp01)
        #print("timestamp02: " + timestamp02)
        #print(" ")
        #print("wd01: " + wd01)
        #print("wd02: " + wd02)
        #print(38 * "-")
        #print("END DEBUG Log of MAIN Defined VARIABLE")
        #print(38 * "-")
        #print(" ")

        print(" ")
        with open("/var/log/fail2ban.log") as fail_log:
            for line in fail_log:
                if("ssh" in line and "Ban" in line and
                        myDate in line):
                    words = line.split()
                    banIP = words[6]
                    print("banIP:" , banIP)
                    whoisFile = os.popen(
                        "whois -H " + banIP +
                        " |egrep -i \"name|country|mail\" |sort -u").read()
                    print("whoisFile:", whoisFile)
    except KeyboardInterrupt:
        print('Shutdown requested...exiting')
    except Exception:
        traceback.print_exc(file=sys.stdout)
    sys.exit(0)
    #END MAIN PROGRAM HERE!!!

#START RUN main program/functions HERE!!!
if __name__ == "__main__":
    main()
#END RUN main program/functions HERE!!!
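
The manual sys.argv handling above exits cleanly when -date is missing, but myArray.index('-date') + 1 still raises an IndexError when the flag is given without a value. A minimal sketch of the same command-line interface using the standard argparse module follows. The names valid_date and parse_args are made up for illustration, and checking that the value parses as a date is an assumption about what the script should accept.

```python
import argparse
from datetime import datetime


def valid_date(text):
    """Return text unchanged if it parses as a year-month-day date."""
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        raise argparse.ArgumentTypeError("expected YYYY-MM-DD, got %r" % text)
    return text


def parse_args(argv=None):
    """Parse the command line; argv=None means use sys.argv[1:]."""
    parser = argparse.ArgumentParser(
        description="List IP addresses banned by fail2ban on a given date.")
    # A single-dash long option keeps the original "-date" spelling.
    parser.add_argument("-date", required=True, type=valid_date,
                        metavar="YYYY-MM-DD", help="date to search for")
    return parser.parse_args(argv)
```

With this in place, main() could start with args = parse_args() and use args.date where myDate is used now; argparse prints the usage message and exits by itself when the option is missing or malformed.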
TEST RESULT:
-------------------
$ python ./fail2ban-banned-ipAddress.py -date 2016-01-31
banIP: 183.3.202.109
whoisFile: abuse-mailbox: anti-spam at ns.chinanet.cn.net
abuse-mailbox: antispam_gdnoc at 189.cn
country: CN
e-mail: anti-spam at ns.chinanet.cn.net
e-mail: gdnoc_HLWI at 189.cn
netname: CHINANET-GD
banIP: 183.3.202.109
whoisFile: abuse-mailbox: anti-spam at ns.chinanet.cn.net
abuse-mailbox: antispam_gdnoc at 189.cn
country: CN
e-mail: anti-spam at ns.chinanet.cn.net
e-mail: gdnoc_HLWI at 189.cn
netname: CHINANET-GD
banIP: 27.75.97.233
whoisFile: abuse-mailbox: hm-changed at vnnic.net.vn
country: VN
e-mail: hm-changed at vnnic.net.vn
e-mail: tiennd at viettel.com.vn
e-mail: truongpd at viettel.com.vn
netname: Newass2011xDSLHN-NET
remarks: For spamming matters, mail to tiennd at viettel.com.vn
banIP: 183.3.202.109
whoisFile: abuse-mailbox: anti-spam at ns.chinanet.cn.net
abuse-mailbox: antispam_gdnoc at 189.cn
country: CN
e-mail: anti-spam at ns.chinanet.cn.net
e-mail: gdnoc_HLWI at 189.cn
netname: CHINANET-GD
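
In the output above 183.3.202.109 appears three times, because every matching log line triggers its own whois lookup. The sketch below looks each address up once and does the filtering of the egrep -i "name|country|mail" | sort -u stage in Python instead of in a shell pipeline. The function names filter_whois, whois_summary and report are hypothetical, and it assumes the whois command on the system accepts -H as in the script above. Python sorts by code point, so the order may differ slightly from what sort -u gives under some locales.

```python
import re
import subprocess

# Same pattern as: egrep -i "name|country|mail"
WANTED = re.compile(r"name|country|mail", re.IGNORECASE)


def filter_whois(text):
    """Keep the interesting whois lines, de-duplicated and sorted."""
    return sorted({line for line in text.splitlines() if WANTED.search(line)})


def whois_summary(ip):
    """Run whois for one address without going through a shell."""
    result = subprocess.run(["whois", "-H", ip], stdout=subprocess.PIPE,
                            universal_newlines=True)
    return filter_whois(result.stdout)


def report(ip_addrs):
    """Print one whois summary per unique address."""
    for ip in sorted(set(ip_addrs)):
        print("banIP:", ip)
        for line in whois_summary(ip):
            print("   ", line)
```

Passing the command as a list means the address is never interpreted by a shell, which matters when the value comes from a log file.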
Cameron Simpson wrote:
> On 01Feb2016 15:53, knnleow GOOGLE <knnleow at gmail.com> wrote:
>> trying out on how to port my unix shell script to python.
>> get more complicated than i expected.....: (
>> i am not familiar with the modules available in python.
>> anyone care to share how to better the clumsy approach below.
>> regards,
>> kuenn
>>
>> timestamp02 = time.strftime("%Y-%m-%d-%H%M%S")
>> banIPaddressesFile = os.popen("cat
>> /var/log/fail2ban.log| egrep ssh| egrep Ban| egrep " + myDate + "|
>> awk \'{print $7}\'| sort -n| uniq >/tmp/banIPaddressesFile." +
>> timestamp02).read()
>
> First up, this is still essentially a shell script. You're
> constructing a shell pipeline like this (paraphrased):
>
>    cat /var/log/fail2ban.log
>    | egrep ssh
>    | egrep Ban
>    | egrep myDate
>    | awk '{print $7}'
>    | sort -n
>    | uniq >/tmp/banIPaddressesFile-timestamp
>
> So really, you're doing almost nothing in Python. You're also writing
> intermediate results to a temporary filename, then reading from it.
> Unless you really need to keep that file around, you won't need that
> either.
>
> Before I get into the Python side of things, there are a few small
> (small) criticisms of your shell script:
>
> - it has a "useless cat"; this is a very common shell inefficiency
> where people put "cat filename | filter1 | filter2 ..." when they
> could more cleanly just go "filter1 <filename | filter2 | ..."
>
> - you are searching for fixed strings; why are you using egrep? Just
> say "grep" (or even "fgrep" if you're old school - you're new to this
> so I presume not)
>
> - you're using "sort -n | uniq", presumably because uniq requires
> sorted input; you are better off using "sort -un" here and skipping
> uniq. I'd also point out that since these are IP addresses, "sort -n"
> doesn't really do what you want here.
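
To illustrate the last point on the Python side: sorting dotted-quad strings as plain text does not give address order either. If sorted output is wanted, one option (an assumption, not something suggested in this thread) is to use the standard ipaddress module as a sort key. The addresses below are documentation-range placeholders, not data from the log.

```python
import ipaddress

addrs = ["192.0.2.10", "203.0.113.5", "192.0.2.9", "198.51.100.1"]

# Plain string sorting compares character by character,
# so "192.0.2.10" lands before "192.0.2.9".
by_text = sorted(addrs)

# Converting each string to an address object compares by numeric value.
by_value = sorted(addrs, key=ipaddress.ip_address)

print(by_text)
print(by_value)
```

This works as long as all addresses are of the same family; comparing an IPv4 address with an IPv6 one raises TypeError.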
>
> So, to the Python:
>
> You seem to want to read the file /var/log/fail2ban.log and for
> certain specific lines, record column 7 which I gather from the rest
> of the code (below) is an IP address. I gather you just want one copy
> of each unique IP address.
>
> So, to read lines of the file the standard idiom goes:
>
>    with open('/var/log/fail2ban.log') as fail_log:
>        for line in fail_log:
>            ... process lines here ...
>
> You seem to be checking for two keywords and a date in the interesting
> lines. You can do this with a simple test:
>
> if 'ssh' in line and 'Ban' in line and myDate in line:
>
> If you want the seventh column from the line (per your awk command)
> you can get it like this:
>
> words = line.split()
> word7 = words[6]
>
> because Python lists count from 0, therefore index 6 is the seventh
> word.
>
> You want the unique IP addresses, so I suggest storing them all in a
> set and not bothering with a sort until some other time. So make an
> empty set before you read the file:
>
> ip_addrs = set()
>
> and add each address to it for the lines you select:
>
> ip_addrs.add(word7)
>
> After you have read the whole file you will have the desired addresses
> in the ip_addrs set.
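
Putting the steps above together might look like the following sketch. The function name banned_ips and the sample log lines are made up for illustration: the addresses are documentation-range placeholders, and the line layout is only assumed to put the IP address in column 7, as the awk command implies. The length check is an addition so that unexpectedly short lines do not raise IndexError.

```python
def banned_ips(lines, date):
    """Return the set of unique column-7 values from matching lines."""
    ip_addrs = set()
    for line in lines:
        if 'ssh' in line and 'Ban' in line and date in line:
            words = line.split()
            if len(words) >= 7:          # skip lines with too few columns
                ip_addrs.add(words[6])   # index 6 is the seventh word
    return ip_addrs


# Hypothetical sample lines standing in for /var/log/fail2ban.log.
sample = [
    "2016-01-31 06:25:01,100 fail2ban.actions[999]: WARNING [ssh] Ban 192.0.2.10",
    "2016-01-31 07:40:12,200 fail2ban.actions[999]: WARNING [ssh] Unban 192.0.2.10",
    "2016-01-31 09:02:45,300 fail2ban.actions[999]: WARNING [ssh] Ban 192.0.2.10",
    "2016-01-31 11:17:30,400 fail2ban.actions[999]: WARNING [ssh] Ban 203.0.113.5",
    "2016-02-01 01:00:00,500 fail2ban.actions[999]: WARNING [ssh] Ban 198.51.100.1",
]

print(sorted(banned_ips(sample, "2016-01-31")))
```

Because the function accepts any iterable of lines, the real log can be passed straight in: open the file with a with statement and call banned_ips(fail_log, myDate). The 'Ban' test is case sensitive, so the "Unban" line in the sample is not counted.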
>
> Try to put all that together and come back with working code, or come
> back with completed but not working code and specific questions.
>
> Cheers,
> Cameron Simpson <cs at zip.com.au>