[Tutor] grrrr!

Rich Krauter rmkrauter at yahoo.com
Thu Feb 12 20:53:34 EST 2004


On Thu, 2004-02-12 at 19:08, Christopher Spears wrote:
> I'm writing some code that examines a directory and
> creates two lists.  One is a list of all the files in
> the directory excluding other directories and links
> along with the size of the files.  The result should
> look something like this:
> [[['openunit4.pyw', 48L], ['printlongestline.pyw',
> 214L]...]]
> 
> I have succeeded in making this work.  The next list
> should just contain the largest file in the directory
> and look like this:
> 
> [[huge_file, 247L]]
> 
> Here is the code:
> 
> def get_size(n,dir):
>     import os
>     files_size = []
>     biggest = []
>     files = os.listdir(dir)
>     files = filter(lambda x:not os.path.isdir(x) and  
>             not os.path.islink(x),files) 
> 
>     for f in range(len(files)):
>         files_size = files_size + [[files[f],os.path.getsize(files[f])]]
> 
>     for x in range(len(files_size)):
>         s = files_size[x][1]
>         for b in range(len(files_size)):
>             if s > files_size[b][1]:
>                 biggest = biggest + [files_size[x]]
>         
> 
>     print files_size
>     print biggest
> 
> I think I need some sort of second condition in the if
> statement, but I have problems comparing s to
> biggest[0][1].  I always get an index out of range
> error.
> 
> Suggestions, anyone?
> 
> 
Hi Chris,
Here's what I tried. I used a dict because I find that easier than using
two parallel lists when I want to keep two sequences in sync.
By using the file sizes as the keys of the dict, I can get at the
largest or smallest file(s) just by sorting the keys, without doing any
special tests. Looks kinda long but it's mostly comments.
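
To show the dict idea in isolation before the full function, here is a
small sketch on made-up data. The file names and sizes are invented for
illustration, and it is written for a current Python 3 interpreter
rather than the Python of this thread.

```python
# Made-up (name, size) pairs; these names and sizes are for illustration only.
sample = [("a.txt", 48), ("b.txt", 214), ("c.txt", 48), ("d.txt", 247)]

files_of_size = {}
for name, size in sample:
    # setdefault returns the list already stored under `size`,
    # or stores and returns a new empty list the first time.
    files_of_size.setdefault(size, []).append(name)

largest = max(files_of_size)  # the biggest key is the biggest size
print(files_of_size[largest], largest)
print(files_of_size[48], 48)  # two files share this size
```

Files of equal size end up in the same list, so nothing is lost when
sizes collide.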

import os

def get_size(dir):
    files_of_size = {}
    biggest_files = []

    # os.listdir returns bare names, so join each one to dir
    # before testing it; otherwise the tests look in the
    # current working directory instead
    paths = [os.path.join(dir, f) for f in os.listdir(dir)]
    files = filter(lambda x: not os.path.isdir(x)
                   and not os.path.islink(x), paths)

    for f in files:
        # make dict with file sizes as keys and
        # lists of filenames as values
        # I use lists of files in case two or more
        # files have the same size
        files_of_size.setdefault(os.path.getsize(f), []).append(
            os.path.basename(f))

    sizes = list(files_of_size.keys())
    sizes.sort()
    sizes.reverse()
    for s in sizes:
        print("%s %s" % (files_of_size[s], s))
        # make a list of files and sizes, in descending order
        # of file size
        biggest_files.append([files_of_size[s], s])
    return biggest_files

if __name__ == '__main__':
    bf = get_size('/tmp')
    print(bf)
    if bf:
        # bf is empty when the directory holds no plain files
        biggest = bf[0]
        print(biggest)

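If you only need the single largest file, a running maximum might be
enough. The index error most likely comes from biggest starting out as
an empty list, so biggest[0][1] has nothing to look up on the first
pass. Below is a minimal sketch of that approach, written for Python 3.
The name largest_file is made up for this example and is not from your
code; starting from None instead of an empty list is my assumption about
what the second condition needs to handle.

```python
import os

def largest_file(directory):
    """Return [name, size] for the largest plain file, or None if there is none."""
    biggest = None  # nothing seen yet, so there is nothing to index into
    for name in os.listdir(directory):
        # listdir gives bare names; join them to the directory first
        path = os.path.join(directory, name)
        if os.path.isdir(path) or os.path.islink(path):
            continue
        size = os.path.getsize(path)
        # the "biggest is None" check is the extra condition:
        # it guards the comparison until a first file has been stored
        if biggest is None or size > biggest[1]:
            biggest = [name, size]
    return biggest

if __name__ == "__main__":
    print(largest_file("."))
```

When two files tie for largest, this keeps whichever one os.listdir
happened to return first.
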
Hope it helps.
Rich


