Another newbie question

Mon Jun 10 21:31:12 EDT 2002

SA <sarmstrong13 at mac.com> wrote:

> I am trying to find a way to list the contents of a directory into a list so
> that the contents of the list are then read into an html doc as hrefs.

> What I'm looking for is a way to ls a directory. Is there a way to access
> the shell command ls from within a python script and dumping the output into
> a list for further manipulation? Or does this have to be built from the
> ground up inside the python script?

> Thanks.
> SA

  You could use popen():

	import os
	listing = os.popen('ls -l').readlines()
	for eachLine in listing:
		htmlify(eachLine[:-1])

  ... where "htmlify()" is a hypothetical function that you provide
  to wrap each line in your preferred HTML tags.  I'm assuming you
  DON'T want the trailing linefeeds, so I'm trimming them using the
  slice ([:-1] is sort of like perl's chop() function).
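
  For illustration only, here is one way the hypothetical htmlify()
  might look.  The choice of <li> as the wrapping tag and the helper
  name htmlify_lines() are my assumptions, not anything from the
  original question; the html module is part of the standard library
  in current Python versions.

```python
import html


def htmlify(line, tag="li"):
    """Wrap one line of text in an HTML element (tag is an assumed default)."""
    # html.escape() turns &, <, > and quotes into entities so that an odd
    # file name cannot break the surrounding markup.
    return "<%s>%s</%s>" % (tag, html.escape(line), tag)


def htmlify_lines(lines):
    """Apply htmlify() to each line after dropping its trailing newline."""
    # rstrip("\n") removes only a trailing newline, so a final line that
    # lacks one keeps its last character (the [:-1] slice would eat it).
    return [htmlify(line.rstrip("\n")) for line in lines]
```

  The list returned by readlines() could be passed straight to
  htmlify_lines().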

  Of course this comes with the stock caveats about using popen():
  the string you pass to popen() is handed to a shell, then parsed
  and executed by that shell, and is thus subject to many exploits.
  You *DON'T* want to pass foreign/alien string variables to popen(),
  and you probably don't want to try "sanitizing" them yourself; you
  probably can't account for all of the ways that an attacker could
  trick your sanitizer into passing a "dirty" string to the shell.
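
  If you do want to run ls, one way to sidestep the shell entirely is
  to pass the command as a list of arguments.  The sketch below
  assumes a Python recent enough to have subprocess.run() and an ls
  on the PATH; the function name ls_lines() is made up for this
  example.

```python
import subprocess


def ls_lines(directory):
    """Return the lines printed by `ls -l` for one directory, without a shell."""
    # Passing a list (with shell=False, the default) hands each item to ls
    # as a separate argument, so no shell ever parses the directory name.
    # The "--" tells ls that what follows is a path even if it starts with "-".
    result = subprocess.run(
        ["ls", "-l", "--", directory],
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode the output to str
        check=True,               # raise CalledProcessError on a non-zero exit
    )
    return result.stdout.splitlines()
```

  A directory name such as "x; echo pwned" is then just a strange
  name to ls, not a second command.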

  It's probably much safer to use os.listdir() with os.stat(), 
  and some processing of the stat() info using the pwd and grp 
  modules (to look up user and group names given UIDs and GIDs) and
  the time module to convert the st_mtime field into a readable 
  date/time stamp.  You can write your own function to convert octal
  st_mode values into the usual '-rwxr-x---' format, and you'd want
  to use the stat module to help with that, especially to set the
  "type" character (- for regular files, l for symlinks, s for UNIX
  domain sockets, etc.).  You'd also have to fuss with lstat vs. stat
  for symlinks.
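
  A rough sketch of that approach follows.  It leans on
  stat.filemode(), which newer Python versions provide, rather than a
  hand-written mode converter; the column layout, the fixed date
  format and the function names are my own simplifications and will
  not match ls output exactly.  The pwd and grp modules are
  Unix-only.

```python
import grp
import os
import pwd
import stat
import time


def owner_name(uid):
    """Look up a user name, falling back to the numeric UID."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)


def group_name(gid):
    """Look up a group name, falling back to the numeric GID."""
    try:
        return grp.getgrgid(gid).gr_name
    except KeyError:
        return str(gid)


def long_listing(directory):
    """Return one 'ls -l'-like line per directory entry, sorted by name."""
    lines = []
    # os.listdir() includes dot files but never "." or "..".
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        info = os.lstat(path)  # lstat describes a symlink itself, not its target
        mode = stat.filemode(info.st_mode)  # e.g. '-rwxr-x---'
        mtime = time.strftime("%Y-%m-%d %H:%M", time.localtime(info.st_mtime))
        lines.append("%s %3d %-8s %-8s %8d %s %s" % (
            mode, info.st_nlink, owner_name(info.st_uid),
            group_name(info.st_gid), info.st_size, mtime, name))
    return lines
```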

  In other words re-implementing a basic "ls -al" in Python is likely
  to take about five or six modules and probably about 100 lines of 
  code.  Supporting any significant number of ls options would add 
  quite a bit more code (and maybe require the glob and/or fnmatch
  modules).

  On the one hand that's not really a lot of work, and you can take
  shortcuts if you don't need the output to be "just like" ls': for
  example, if you don't care to support non-regular files, don't
  care about the file modes, and/or don't care about ls' handling of
  dates (it shows one form for files modified within roughly the last
  six months and another for older files).  That might bring it down
  to 50 lines of code, which might be considerably less than you'd
  spend "sanitizing" data to be suitable for passing to popen().
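
  If all you need is the names turned into hrefs, as in the original
  question, the shortcut version can be very small.  This is only a
  sketch: the function name directory_links() and the base_url
  parameter are invented for the example, and it assumes the files
  are served under that base URL.

```python
import html
import os
from urllib.parse import quote


def directory_links(directory, base_url=""):
    """Return one <a href="..."> element per entry in the directory."""
    links = []
    for name in sorted(os.listdir(directory)):
        # quote() percent-encodes characters that are not safe in a URL path;
        # html.escape() protects the markup from &, <, > and quotes.
        href = base_url + quote(name)
        links.append('<a href="%s">%s</a>' % (html.escape(href), html.escape(name)))
    return links
```

  Joining the result with "<br>\n" or wrapping each item in <li>
  gives something ready to drop into a page.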

  On the other hand, it is a lot more work than the two lines that it
  takes to get the ls -l output from popen().

  A lot also depends on what else you're going to do and how robust
  you want the app to be (how you want to do exception handling).
  Using popen(), popen2(), popen3(), and popen4() provides limited
  exception and error handling.  In fact you might have to use the
  popen2.Popen3() or Popen4() *classes* in the popen2 module in order to
  get exit code values from your ls commands.  That's assuming that you want
  to gracefully handle some of the errors that you know *could* happen
  under the circumstances that exist on your system(s) when your script is 
  running.
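
  For readers on a current Python, where the popen2 module is no
  longer available, the subprocess module covers the same ground.
  The sketch below shows one way to get at the exit code and the
  error text; the function name run_ls() and the shape of its return
  value are assumptions made for the example.

```python
import subprocess


def run_ls(args):
    """Run ls with a list of arguments.

    Returns (exit_code, stdout_lines, stderr_text); exit_code is None
    if ls could not be started at all.
    """
    try:
        proc = subprocess.run(
            ["ls"] + list(args),
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,  # decode both streams to str
        )
    except OSError as exc:
        # Raised when the ls program itself cannot be executed,
        # for example because it is not on the PATH.
        return None, [], str(exc)
    return proc.returncode, proc.stdout.splitlines(), proc.stderr
```

  A non-zero exit code together with the stderr text lets the caller
  decide whether to report a missing directory, retry, or give up.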

  [As usual, writing something that works for optimal input and
  conditions is reasonably straightforward; accounting for degenerate
  input and handling exceptional conditions is more challenging.]