[Tutor] question about run time

Ertl, John john.ertl at fnmoc.navy.mil
Tue May 2 22:44:56 CEST 2006


Danny,

What is the best way to track the time used in I/O versus CPU?  Here is the
code.  It checks some text files for a user name, collects that user's memory
and inode usage, adds the figures together, and checks them against a set
limit.  If a limit is exceeded, it calls a mail script to send a warning email.
The program is used like a script, with variables passed on the command line.
(Maybe I should just write this in bash?)


import sys
import time
import glob
import os.path
import subprocess

class memoryUsage:

        def __init__(self,userName,gpfs,emailAddress,memLimit=500000,inodeLimit=20000,
                     searchMonth="None",searchDay="None",searchYear="None"):

                self.BASEDIR = "/GPFS/Usage"
                self.gpfs=gpfs.lower()
                self.emailAddress=emailAddress
                self.memLimit=memLimit
                self.inodeLimit=inodeLimit
                self.userName=userName
                self.searchMonth=searchMonth
                self.searchDay=searchDay
                self.searchYear=searchYear
                self.memTotal = 0
                self.inodeTotal = 0
                self.gpfsList = []

                date = time.strftime("%b %d %Y")
                print "date =" ,date
                date = date.split()
                if self.searchMonth == "None":
                    self.searchMonth=date[0]
                if self.searchDay == "None":
                    self.searchDay = date[1]
                if self.searchYear == "None":
                    self.searchYear = date[2]

                if emailAddress == "None":
                    print "You must enter an email (-e)"
                    print "ex. memoryCheck.py -n ertlj -g all -e j.e at fnmoc.navy.mil"
                    sys.exit()

        def getGPFSList(self):
        ## if self.gpfs is set to "all", get all available; otherwise just add
        ## the one asked for
                if self.gpfs == "all":
                    self.gpfsPathList = glob.glob(self.BASEDIR +"/gpfs*")
                    for each in self.gpfsPathList:
                        (base,gpfs) = os.path.split(each)
                        self.gpfsList.append(gpfs)
                else:
                    self.gpfsList.append(self.gpfs)

        def makeFilePath(self,gpfs):
        ## make the full path to the file that contains the memory and inode
        ## usage for each node
                self.filePath = os.path.join(self.BASEDIR,gpfs,self.searchYear,
                                             self.searchMonth,self.searchDay)

        def extractUserData(self):
        ## look in each file and search for the name...if found get the
        ## memory and inode usage and return them
                print self.filePath
                fullList = open(self.filePath,"r").readlines()
                for line in fullList:
                        #print "line", line
                        singleList = line.split()
                        try:
                            if singleList[1] == self.userName:
                               print line
                               return singleList[2],singleList[3]
                        except IndexError:
                            # blank or short line: no user field to compare
                            pass
                return 0,0

        def add(self,memAmount,inodeAmount):
                self.memTotal = self.memTotal + int(memAmount)
                self.inodeTotal = self.inodeTotal + int(inodeAmount)

        def sendEmail(self,message):
            if self.emailAddress != "None":
                messagePath="/home/ertlj/localBin/blank"
                emailComand = "/u/curr/bin/smail.pl -f %s -t %s -s '%s'" % (
                    messagePath,self.emailAddress,message)
                p = subprocess.Popen(emailComand, shell=True)
                self.sts = os.waitpid(p.pid, 0)
            else:
                sys.exit("No valid email address given...can not email size warning")

if __name__ == "__main__":


    from optparse import OptionParser

    parser = OptionParser()
    parser.add_option("-n", "--userName", default=os.getenv("LOGNAME","None"),
                      help="The user name to look for")
    parser.add_option("-g", "--gpfs", default="all",
                      help="Enter the gpfs you want to search or 'all'")
    parser.add_option("-e", "--email", default="None",
                      help="The email address the report, if any, will be sent to")
    # type="int" so limits given on the command line compare as numbers
    parser.add_option("-l", "--memLimit", type="int", default=5000000,
                      help="The memory size in KB that will trigger a report")
    parser.add_option("-i", "--inodeLimit", type="int", default=50000,
                      help="The inode count that will trigger a report")
    parser.add_option("-d", "--day", default="None",
                      help="The day of the month you want to check")
    parser.add_option("-m", "--month", default="None",
                      help="The month of the year you want to check")
    parser.add_option("-y", "--year", default="None",
                      help="The year you want to check")

    (options, args) = parser.parse_args()

    myUse = memoryUsage(options.userName,options.gpfs,options.email,options.memLimit,
                        options.inodeLimit,options.month,options.day,options.year)

    myUse.getGPFSList()


    for each in myUse.gpfsList:
                myUse.makeFilePath(each)
                (memAmount,inodeAmount) = myUse.extractUserData()
                myUse.add(memAmount,inodeAmount)

    print "Your memory usage is %s KB and your inode usage is %s" % (
        myUse.memTotal,myUse.inodeTotal)
    print "Your memory limit is %s KB and your inode limit is %s" % (
        myUse.memLimit, myUse.inodeLimit)

    if myUse.memLimit < myUse.memTotal or myUse.inodeLimit < myUse.inodeTotal:
        print "You have exceeded your limit"
        myUse.sendEmail("%s memory/inode limit reached on gpfs " % myUse.userName)


 -----Original Message-----
From: 	Danny Yoo [mailto:dyoo at hkn.eecs.berkeley.edu] 
Sent:	Tuesday, May 02, 2006 1:32 PM
To:	Ertl, John
Cc:	tutor at python.org
Subject:	Re: [Tutor] question about run time



> I have been using python for some time...and occasionally I noticed 
> significant delay before the code would run but until now I have been 
> able to write it off to other things.  Now I have a short script that I 
> wrote to check some files and print out a few lines.
>
> I have noticed that usually the first time I fire it up in the morning 
> or after a long time of not running it, it takes 10-15 seconds to run 
> and the output to the screen is very slow...maybe 1 second per line. 
> If I run it soon after that it runs and the output is on the screen in 
> less than a second.  I would think this has to do with compiling but I 
> am not sure.  Any ideas how to speed this up?
>
> I am running python 2.4 on a RHE3.0 cluster.
                                ^^^^^^^^^^^^^^

Hi John,

One thing to check is whether the program is spending the majority of 
its time doing input and output (I/O bound), or if it's really doing heavy 
computations (CPU bound).  Knowing this might provide clues as to why 
you're seeing this kind of jerky performance.
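
A rough way to tell the two apart is to compare wall-clock time with the CPU
time the process actually used.  The sketch below is only an illustration:
measure(), cpu_heavy() and io_like() are made-up names, and time.sleep() is a
stand-in for waiting on a disk or a network file system.  It relies only on
time.time() and os.times().

```python
import os
import time

def measure(func, *args):
    """Call func(*args) once; return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.time()
    cpu_start = os.times()
    result = func(*args)
    cpu_end = os.times()
    wall = time.time() - wall_start
    # fields 0 and 1 of os.times() are this process's user and system CPU time
    cpu = (cpu_end[0] - cpu_start[0]) + (cpu_end[1] - cpu_start[1])
    return result, wall, cpu

def cpu_heavy(n):
    # hypothetical CPU-bound work: sum of squares in a plain loop
    total = 0
    for i in range(n):
        total += i * i
    return total

def io_like(seconds):
    # hypothetical stand-in for blocking on I/O: the process just waits
    time.sleep(seconds)
    return seconds

if __name__ == "__main__":
    for name, func, arg in [("cpu_heavy", cpu_heavy, 200000),
                            ("io_like", io_like, 0.2)]:
        result, wall, cpu = measure(func, arg)
        print("%s: wall=%.3fs cpu=%.3fs" % (name, wall, cpu))
```

If the wall time is much larger than the CPU time, the program is mostly
waiting, which points toward I/O.  If the two are close, the time is going
into computation.  Wrapping the call to extractUserData() this way would be
one place to start.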

Also, you may want to check with your cluster folks on the possible 
effects the cluster's architecture may have on program startup.  You're 
running on a slightly specialized platform, so I wouldn't be surprised if 
the cluster architecture is contributing something special.

Finally, if you want to share that script for people to comment on, that 
might help.


Good luck!

