Another newbie question
James T. Dennis
jadestar at idiom.com
Mon Jun 10 21:31:12 EDT 2002
SA <sarmstrong13 at mac.com> wrote:
> I am trying to find a way to list the contents of a directory into a list so
> that the contents of the list are then read into an html doc as hrefs.
> What I'm looking for is a way to ls a directory. Is there a way to access
> the shell command ls from within a python script and dumping the output into
> a list for further manipulation? Or does this have to be built from the
> ground up inside the python script?
> Thanks.
> SA
You could use popen():
    import os
    listing = os.popen('ls -l').readlines()
    for eachLine in listing:
        htmlify(eachLine[:-1])
... where "htmlify()" is a hypothetical function that you provide
to wrap each line in your preferred HTML tags. I'm assuming you
DON'T want the trailing linefeeds, so I'm trimming them using the
slice ([:-1] is sort of like perl's chop() function).
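One caveat with the slice, shown as a small sketch: [:-1] drops the last character whether or not it is a linefeed, while rstrip("\n") removes only trailing linefeeds. The sample strings below are made up for illustration; the last one deliberately has no trailing linefeed.

```python
# Hypothetical sample lines; the final one lacks a trailing newline.
lines = ["alpha\n", "beta\n", "gamma"]

# Slicing always removes exactly one character, newline or not.
chopped = [line[:-1] for line in lines]

# rstrip("\n") removes trailing newlines only and leaves other text alone.
stripped = [line.rstrip("\n") for line in lines]

print(chopped)
print(stripped)
```

For output that comes straight from readlines() on ls, every line normally ends in a linefeed, so either form should behave the same there.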
Of course this comes with the stock caveats about using popen()
--- the string you pass to popen() is passed to a shell, parsed
and executed by that shell and thus subject to many exploits
(you *DON'T* want to pass foreign/alien string variables to popen()
and you probably don't want to try "sanitizing" them yourself; you
probably can't account for all of the ways that an attacker could
trick your sanitizer into passing a "dirty" string to the shell).
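One way to sidestep the shell entirely is to hand the command over as a list of arguments. The sketch below uses the subprocess module, which was added to the standard library after this post was written, so treat it as a modern alternative rather than something the post describes. The helper names run_lines and ls_lines are hypothetical, and ls_lines assumes an ls binary on PATH.

```python
import subprocess

def run_lines(argv):
    """Run argv (a list of strings) without a shell; return its stdout lines."""
    # With a list and no shell=True, each element reaches the program as one
    # literal argument; nothing interprets ';', '$(...)' or backquotes.
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()

def ls_lines(directory):
    """Long listing of one directory; assumes an 'ls' binary on PATH."""
    # "--" ends option parsing, so a directory name that starts with "-"
    # is not mistaken for an ls option.
    return run_lines(["ls", "-l", "--", directory])
```

Because check=True is set, a non-zero exit status raises subprocess.CalledProcessError instead of silently returning partial output.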
It's probably much safer to use os.listdir() with os.stat(),
and some processing of the stat() info using the pwd and grp
modules (to look up user and group names given UIDs and GIDs) and
the time module to convert the st_mtime field into a readable
date/time stamp. You can write your own function to convert octal
st_mode values into the usual '-rwxr-x---' format, and you'd want
to use the stat module to help with that (especially to set the
"type" character: - for regular files, l for symlinks, s for UNIX
domain sockets, etc.). You'd also have to fuss with lstat vs stat
for symlinks.
In other words re-implementing a basic "ls -al" in Python is likely
to take about five or six modules and probably about 100 lines of
code. Supporting any significant number of ls options would add
quite a bit more code (and maybe require the glob and/or fnmatch
modules).
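A rough sketch of that approach follows, assuming a Unix system (pwd and grp are Unix-only) and a Python recent enough to have stat.filemode(), which builds the '-rwxr-x---' string so the hand-written converter is not needed. The function names are hypothetical, the column layout is only an approximation of ls -l, and unlike plain ls it does not hide dot files.

```python
import grp
import os
import pwd
import stat
import time

def owner_name(uid):
    """User name for a UID, or the number itself if there is no passwd entry."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)

def group_name(gid):
    """Group name for a GID, or the number itself if there is no group entry."""
    try:
        return grp.getgrgid(gid).gr_name
    except KeyError:
        return str(gid)

def long_listing(directory):
    """Return one 'ls -l'-like line per entry in directory, sorted by name."""
    lines = []
    for name in sorted(os.listdir(directory)):
        # lstat() describes a symlink itself instead of following it.
        info = os.lstat(os.path.join(directory, name))
        mtime = time.strftime("%Y-%m-%d %H:%M", time.localtime(info.st_mtime))
        lines.append("%s %3d %-8s %-8s %8d %s %s" % (
            stat.filemode(info.st_mode), info.st_nlink,
            owner_name(info.st_uid), group_name(info.st_gid),
            info.st_size, mtime, name))
    return lines
```

This uses a single date format for all files and does not show symlink targets, so it is a starting point, not a drop-in replacement for ls.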
On the one hand that's not really a lot of work (and you can take
shortcuts if you don't need the output to be "just like" ls' --
for example if you don't care to support non-regular files, don't
care about the file modes and/or don't care about ls' handling of
dates, where it shows one form for files modified within roughly the
last six months and another for older files). That might bring it down
to 50 lines of code, which might be considerably less than you'd spend
"sanitizing" data to be suitable for passing to popen().
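If only the names are needed for the hrefs the original question asks about, the shortcut can be much smaller still. This is a minimal sketch under that assumption; directory_links and its base_url parameter are hypothetical names, and the anchor markup is just one plausible choice.

```python
import html
import os
from urllib.parse import quote

def directory_links(directory, base_url=""):
    """Return one '<a href="...">name</a>' string per entry in directory."""
    links = []
    for name in sorted(os.listdir(directory)):
        # Percent-encode the name for the URL, then escape it for the
        # HTML attribute; escape the visible text separately.
        href = html.escape(base_url + quote(name), quote=True)
        links.append('<a href="%s">%s</a>' % (href, html.escape(name)))
    return links
```

No shell is involved, so odd characters in file names can at worst produce an odd-looking link, not an executed command.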
On the other hand, it is a lot more work than the two lines that it
takes to get the ls -l from popen().
A lot also depends on what else you're going to do and how robust
you want the app to be (how you want to do exception handling).
Using popen(), popen2(), popen3(), and popen4() provides only limited
exception and error handling. In fact you might have to use the
popen2.Popen3() or Popen4() *classes* in the popen2 module in order to
get exit code values from your ls commands. That's assuming that you want
to gracefully handle some of the errors that you know *could* happen
under the circumstances that exist on your system(s) when your script is
running.
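For comparison, later Python versions expose the exit code through the subprocess module, which took over the role of the popen2 classes. The sketch below is an assumption-laden illustration, not the approach described above: run_checked and ls_or_none are hypothetical names, and ls_or_none assumes an ls binary on PATH.

```python
import subprocess
import sys

def run_checked(argv):
    """Run argv without a shell; return (exit_code, stdout_lines, stderr_text)."""
    result = subprocess.run(argv, capture_output=True, text=True)
    return result.returncode, result.stdout.splitlines(), result.stderr

def ls_or_none(directory):
    """Return the 'ls -l' lines for directory, or None if ls fails."""
    try:
        code, lines, errors = run_checked(["ls", "-l", "--", directory])
    except FileNotFoundError:
        # There is no 'ls' binary on PATH at all.
        return None
    if code != 0:
        # ls ran but reported a problem, e.g. a missing directory.
        print("ls exited with %d: %s" % (code, errors.strip()), file=sys.stderr)
        return None
    return lines
```

Returning None is only one way to signal failure; raising an exception may suit a larger application better.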
[As usual, writing something that works for optimal input and
conditions is reasonably straightforward; accounting for degenerate
input and handling exceptional conditions is more challenging.]