[Tutor] Script Optimization (PIL)

Jay Dorsey python@jaydorsey.com
Sun Mar 30 13:55:01 2003


I have a thumbnail script I wrote, which is running as a CGI script. 
The script receives an image name, a height and width, and uses the 
Python Imaging Library to thumbnail the image.  This script could 
potentially be called ~300 times a minute to thumbnail images so I want 
it to be as fast as possible.  I'm not sure if multi-threading is an 
option -- what I've read on threading in Python is confusing, so I'm not 
sure if it's possible, or if it will work in this situation.  The script 
creates the thumbnails dynamically and doesn't save them at all. 
The source images are all available locally.

Currently I benchmark the script by calling a PHP page which loops 
through 200 image names and does a GET to retrieve and display the 
thumbnails.  This is similar to how the actual service will work when it 
is finished.
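
A minimal sketch of the same kind of timing loop in Python, for anyone who 
wants to reproduce the measurement without the PHP page.  The names 
benchmark and fetch are hypothetical; fetch stands for whatever performs 
one GET against the thumbnail script and is not part of the code in this 
message.

```python
import time

def benchmark(fetch, names):
    """Call fetch(name) once per name; return (elapsed_seconds, per_minute)."""
    start = time.time()
    for name in names:
        fetch(name)
    elapsed = time.time() - start
    # Guard against a zero reading on very fast runs or coarse clocks
    per_minute = len(names) / elapsed * 60 if elapsed > 0 else float("inf")
    return elapsed, per_minute
```

Passing a function that requests the thumb.py URL for each image name 
would give a figure comparable to the PHP loop, assuming the requests 
are made one after another.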

I have the script broken up into two files:

The first, thumb.py, imports some functions I created in the second 
file, thumblib.py.  I found that by putting the functions into 
thumblib.py and importing them, the script ran much quicker.

thumb.py
----------begin file----------
#!/usr/local/bin/python
from thumblib import createThumb, sendImage, getqs

sendImage(createThumb(getqs()))
----------end file----------


thumblib.py
----------begin file----------
#!/usr/local/bin/python
import Image
# Borrowed the following two lines from Fredrik Lundh's
# contribution on pythonware.com.  This imports only the modules
# that I need; in this case, the Jpeg plugin
import JpegImagePlugin
Image._initialized = 1
# import this for temporary file storage
from cStringIO import StringIO
# used to get the querystring
from cgi import FieldStorage

def getqs():
    """
    Reads in a querystring and assigns the querystring vars to
    variables
    """
    form = FieldStorage()
    i, w, h = form.getvalue('i'), int(form.getvalue('w')), \
            int(form.getvalue('h'))
    return i, w, h

def createThumb(vars):
    """
    Does the actual thumbnail
    """
    image, width, height = vars
    # in-memory file object that receives the JPEG data
    thumb = StringIO()
    im = Image.open("/usr/local/xitami/webpages/images/" + image)
    im.thumbnail((int(width), int(height)))
    # saving the image as progressive speeds up the script
    im.save(thumb, "JPEG", progressive=1)
    # getvalue() returns the whole buffer, so no seek(0) is needed
    return thumb.getvalue()

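def sendImage(image):
    """
    Produces the proper HTTP headers and the resultant thumbnails
    """
    # write directly to stdout: print would append a newline after the
    # image data, making the body one byte longer than Content-length
    import sys
    sys.stdout.write("Content-type: image/jpeg\nContent-length: %s\n\n"
                     % len(image))
    sys.stdout.write(image)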

----------end file----------

I've found a few little tricks, such as instead of saying:

x = 1
y = 2

use:

x, y = 1, 2
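
A rough way to check whether a micro-optimization like this is worth 
keeping is to time both forms with the standard timeit module.  This is 
only a sketch; the numbers depend on the interpreter and machine, and 
any difference is likely to be tiny next to the cost of decoding and 
resizing a JPEG.

```python
import timeit

# Time each form over the same number of repetitions
runs = 100000
separate = timeit.timeit("x = 1\ny = 2", number=runs)
combined = timeit.timeit("x, y = 1, 2", number=runs)

print("separate: %.4fs  combined: %.4fs" % (separate, combined))
```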

I've also used py_compile to compile both the thumb.py and thumblib.py 
files.
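
For reference, a minimal sketch of byte-compiling a module with 
py_compile.  The file thumblib_demo.py is a hypothetical stand-in 
written to a scratch directory, not the real thumblib.py.  Note that 
Python only loads cached bytecode for modules that are imported; a 
script started directly from its source file, as thumb.py is under CGI, 
is compiled again on every run.

```python
import os
import py_compile
import tempfile

# Hypothetical stand-in for thumblib.py, created in a scratch directory
workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "thumblib_demo.py")
with open(source_path, "w") as f:
    f.write("def answer():\n    return 42\n")

# Write the bytecode next to the source; doraise=True turns compile
# errors into exceptions instead of messages on stderr
compiled_path = source_path + "c"
py_compile.compile(source_path, cfile=compiled_path, doraise=True)
```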

Currently, I can thumbnail 200 640x480 JPEG images in about 35 seconds, 
which meets my requirement of 300 a minute, but I'm running on a local 
box and I know the speed will drop once network overhead is added.
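
The arithmetic behind those figures, as a small sketch.  It assumes the 
requests are handled strictly one at a time, which may not hold on the 
real server.

```python
images, seconds = 200, 35.0       # figures from the local benchmark
required_per_minute = 300

measured_per_minute = images / seconds * 60        # about 343
measured_per_image = seconds / images              # 0.175 s
budget_per_image = 60.0 / required_per_minute      # 0.2 s
headroom = budget_per_image - measured_per_image   # about 0.025 s

print("%.0f images/min, %.0f ms headroom per image"
      % (measured_per_minute, headroom * 1000))
```

Under that assumption, roughly 25 ms per request is left for network 
overhead before the rate falls below 300 a minute.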

Any tips/tricks/ideas would be appreciated; code examples will get you a 
beer next time I'm in town ;).  I want this to run as a web service 
(multiple servers will be hitting it), but I'm not sure what the best 
way to do this is.  I've thought about writing the JPEG to a tmp 
directory as a form of "caching", but if I can make this work fast 
enough without resorting to such trickery ;) I would rather not.  Would 
I be able to make this script into some sort of daemon/service to cut 
down on Python load time?
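
To make the two ideas concrete, here is one possible sketch of a 
long-running process that keeps the interpreter loaded and caches 
finished thumbnails on disk.  It is written for Python 3 and uses only 
the standard library (wsgiref); this is a swapped-in approach, not how 
the script above works.  All names here (DEFAULT_CACHE_DIR, cache_path, 
cached_thumb, make_app, serve) are hypothetical, and render stands for 
a function wrapping the PIL code, taking (name, width, height) and 
returning JPEG bytes.

```python
import hashlib
import os
import tempfile
from urllib.parse import parse_qs
from wsgiref.simple_server import make_server

# Hypothetical cache location under the system temp directory
DEFAULT_CACHE_DIR = os.path.join(tempfile.gettempdir(), "thumbcache")

def cache_path(cache_dir, name, width, height):
    """Map (name, width, height) to a file name that is safe on disk."""
    key = "%s:%d:%d" % (name, width, height)
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest + ".jpg")

def cached_thumb(cache_dir, name, width, height, render):
    """Return JPEG bytes, calling render() only on a cache miss."""
    path = cache_path(cache_dir, name, width, height)
    if os.path.exists(path):
        with open(path, "rb") as f:
            return f.read()
    data = render(name, width, height)
    os.makedirs(cache_dir, exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
    return data

def make_app(cache_dir, render):
    """Build a WSGI app that reads the same i, w, h query parameters."""
    def app(environ, start_response):
        qs = parse_qs(environ.get("QUERY_STRING", ""))
        try:
            name = qs["i"][0]
            width, height = int(qs["w"][0]), int(qs["h"][0])
        except (KeyError, ValueError):
            start_response("400 Bad Request",
                           [("Content-Type", "text/plain")])
            return [b"expected i, w and h\n"]
        body = cached_thumb(cache_dir, name, width, height, render)
        start_response("200 OK", [("Content-Type", "image/jpeg"),
                                  ("Content-Length", str(len(body)))])
        return [body]
    return app

def serve(render, port=8000, cache_dir=DEFAULT_CACHE_DIR):
    """Run until interrupted; the imports are paid for only once."""
    make_server("", port, make_app(cache_dir, render)).serve_forever()
```

This sketch never invalidates the cache when a source image changes and 
does not validate the image name, so both would need attention before 
real use.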

Thanks again,

-- 
Jay Dorsey
python at jay dorsey dot com