[Tutor] Increase performance of the script

Alan Gauld alan.gauld at yahoo.co.uk
Sun Dec 9 11:14:10 EST 2018


On 09/12/2018 10:15, Asad wrote:

> f4 = open (r" /A/B/file1.log  ", 'r' )

Are you sure you want that space at the start of the filename?


> string2=f4.readlines()

Here you read the entire file into memory. OK for small files
but if it really can be 5GB that's a lot of memory being used.
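
As a rough sketch of the alternative (Python 3 syntax; the names
count_matching_lines, path and needle are made up for illustration),
you can loop over the file object itself. It hands you one line at
a time instead of loading everything up front:

```python
def count_matching_lines(path, needle):
    """Count the lines containing needle, reading one line at a time."""
    count = 0
    with open(path, "r") as log_file:
        # A file object is its own iterator: each pass through the
        # loop reads the next line, so memory use stays small.
        for line in log_file:
            if needle in line:
                count += 1
    return count
```

The with statement also closes the file for you when the loop ends.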

> for i in range(len(string2)):

This is usually the wrong thing to do in Python. Aside
from the loss of readability, it requires the interpreter
to do a lot of indexing operations, which is not the
fastest way to access things.
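
To illustrate the difference, here is a small sketch (Python 3,
with an invented three-line list standing in for your data). The
second loop does the same job without indexing, and enumerate()
gives you the position if you still need it:

```python
lines = ["first\n", "second\n", "third\n"]  # stand-in data

# Index-based loop: every access is a separate lookup by position.
by_index = []
for i in range(len(lines)):
    by_index.append(lines[i].strip())

# Direct iteration: enumerate() supplies the position alongside the item.
direct = []
for position, line in enumerate(lines):
    direct.append((position, line.strip()))
```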

>     position=i
>     lastposition =position+1
>     while True:
>          if re.search('Calling rdbms/admin',string2[lastposition]):

You are using regex to search for a fixed string.
It's simpler and faster to use string methods:
either "foo in string" or string.find(foo).
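
A minimal comparison, assuming a made-up log line (the sample text
below is invented, only the marker string comes from your code):

```python
import re

marker = "Calling rdbms/admin"
sample = "12:00:01 Calling rdbms/admin/example.sql\n"  # invented log line

# Regex search for a fixed string: works, but is more than you need.
found_with_regex = re.search(marker, sample) is not None

# Plain substring test: same answer for a fixed string.
found_with_in = marker in sample

# find() returns the offset of the match, or -1 if it is absent.
offset = sample.find(marker)
```

Use "in" when you only need yes/no, and find() when you need to
know where the match starts.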

>           break
>          elif lastposition==len(string2)-1:
>           break
>          else:
>           lastposition += 1

This means you iterate over the whole file content
multiple times. Once for every line in the file.
If the file has 1000 lines that means you do these
tests close to 1000000/2 times!

This is probably your biggest performance issue.
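
To see where that figure comes from, here is a small counting
sketch. It is not your code, just the same loop shape (an inner
scan that starts one past the outer position) with the worst case
assumed, i.e. the marker never turns up. The function name is
made up:

```python
def worst_case_tests(n):
    """Count inner-loop tests for n lines when the marker is never found."""
    tests = 0
    for position in range(n):
        # The inner scan runs from position + 1 to the last line.
        for _ in range(position + 1, n):
            tests += 1
    return tests


total = worst_case_tests(1000)  # n * (n - 1) / 2 tests
```

The count grows with the square of the number of lines, so doubling
the file size roughly quadruples the work.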

>     errorcheck=string2[position:lastposition]
>     for i in range ( len ( errorcheck ) ):
>         if re.search ( r'"error(.)*13?"', errorcheck[i] )

This use of regex is valid since it's a pattern.
But it might be more efficient to join the lines
and do a single regex search across line boundaries.
But you need to test/time it to see.
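
Something along these lines, as an untested-for-speed sketch (the
three sample lines are invented; the pattern is the one from your
code). Note that "." does not match a newline by default, so each
match still falls within one line, and counting the newlines before
the match tells you which line it was:

```python
import re

error_re = re.compile(r'"error(.)*13?"')

# Invented stand-in for the errorcheck slice.
errorcheck = ['step 1 ok\n', 'value "error code 13" seen\n', 'step 3 ok\n']

block = "".join(errorcheck)      # one string, newlines preserved
match = error_re.search(block)   # one search instead of one per line

if match:
    # Number of newlines before the match = index of the matching line.
    line_index = block.count("\n", 0, match.start())
    matched_text = match.group(0)
```

Whether this beats the per-line loop depends on your data, so time
both versions (the timeit module is handy for that).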

But you also do another loop inside the outer loop.
You need to look at how/whether you can eliminate
all these inner loops and just loop over the file
once - ideally without reading the entire thing
into memory before you start.

Processing it as you read it will be much more efficient.
On a previous thread we showed you several ways you
could approach that.
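
For what it's worth, here is one possible single-pass shape. It is
only a sketch under my own assumptions about what you want: report
the first error, the most recent "Calling rdbms/admin" line, and
the three lines before the error. The names first_error, MARKER,
ERROR_RE and context are mine, and the code is Python 3:

```python
import re
from collections import deque

MARKER = "Calling rdbms/admin"
ERROR_RE = re.compile(r'"error(.)*13?"')


def first_error(lines, context=3):
    """Return (script_line, context_lines) for the first error, else None.

    lines can be any iterable of strings, including an open file.
    """
    current_script = None
    # Holds the current line plus up to `context` preceding lines.
    recent = deque(maxlen=context + 1)
    for line in lines:
        if MARKER in line:
            current_script = line
            recent.clear()  # do not carry context over from the last block
        recent.append(line)
        if ERROR_RE.search(line):
            return current_script, list(recent)
    return None
```

You would call it as first_error(open_file_object), so each line is
examined once and only a handful of lines are held in memory.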

>             print "Reason of error \n", errorcheck[i]
>             print "script \n" , string2[position]
>             print "block of code \n"
>             print errorcheck[i-3]
>             print errorcheck[i-2]
>             print errorcheck[i-1]
>             print errorcheck[i]
>             print "Solution :\n"
>             print "Verify the list of objects belonging to Database "
>             break
>     else:
>         continue
>     break



-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos



