Request for tips on my first python script.

Sybren Stuvel sybrenUSE at YOURthirdtower.com.imagination
Fri Sep 8 07:41:19 EDT 2006


Lex Hider enlightened us with:
> Any tips on the code quality and use of python would be appreciated.
> I've got a feeling the overall structure is up the creek.

I'll post some remarks about the code ;-)

> HOME = os.path.expanduser("~")

I wouldn't use this. Just use os.environ['HOME']. In most cases it
turns out to be the same directory, but it adds more flexibility. If
someone wants your app to read/write to another directory, he/she can
simply change the HOME environment variable.

> # user configurable
> #maxChecksPerDay = 8
> #maxChecksPerDay = 12
> maxChecksPerDay = 24
> myTemp = '/tmp'
> #podDir = os.path.join(HOME, 'Audio/Podcasts')
> podDir = os.path.join(HOME, 'Podcasts')
> # end user configurable

A bit of nitpicking: place a blank line between the user configurable
part and the rest of the code.

> def exitFunc():
>     #f.close()
>     #log.close()
>     if boz:
>         print boz

Write function comments! Use the docstring to describe your function -
what does it do, and what does it return? Do this for all functions
you write.

> # render is used because feeds use a lot of html, not just plain text.
> def render(html):
>     if html:
>         html = re.sub('"', '\\"', html.encode('utf8'))

Use a raw string for the second argument to make it more readable:

    html = re.sub('"', r'\"', html.encode('utf8'))

>         command = 'echo "' + html + '" | lynx -dump -stdin -force_html'
>         os.system(command)

Use the subprocess module or the popen2 module to open lynx. That way,
you can feed the HTML to lynx directly, and you're not bound to the
maximum line length of the shell. It also removes the need to escape
quotes.

> def updateCache(feeds):
>     l = []

Use longer names than 'l'.

>     print "updating local xml cache..."
>     for feed in file(feeds, "r").read().split('\n'):
>         if not re.match('^http://', feed): # feedList ignores
>                                            # anything but urls

Here you say you only match URLs starting with "http://", but at the
start you claimed to only use URLs starting with "http". Be sure to
keep your documentation and your code in sync.

> def htmlTitle(mainTitle, subTitle):
>     s = '<HR>'
>     s += '<H2>' + mainTitle + '</H2>'
>     s += '<H3>' + subTitle + '</H3>'
>     return s

It might be easier on the eyes if you use:

    s = '<hr><h2>%s</h2><h3>%s</h3>' % (mainTitle, subTitle)

It might also be wise to use lower-case HTML tags to remain compatible
with XHTML.

> def downloadPod(url, dest):
>     kb = 2
>     success = 0
>     command = 'wget --continue -O "' + dest + '" "' + url + '"'
>     status  =  os.system(command)

Here you pass the arguments of the function to a system call. This
means that before the downloadPod function is called, the 'url' and
'dest' should have been escaped or cleared of unwanted characters.
This should really be documented.

Overall, your code needs to be commented and documented much better.
Think of other people reading the code, and explain _why_ you do
something, instead of explaining _what_ you're doing.

Sybren
-- 
Sybren Stüvel
Stüvel IT - http://www.stuvel.eu/



More information about the Python-list mailing list