emulating du with os.walk

Gerrit gerrit at nl.linux.org
Tue Sep 28 03:44:21 EDT 2004


"Martin v. Löwis" wrote:
> Kirk Job-Sluder wrote:
> >There should be an easy way to get around this, or perhaps I'm better
> >off just parsing the output of du.
> 
> I suggest that you don't use os.path.walk, but write a recursive
> function yourself. You should find that the entire problem can
> be solved in 12 lines of Python code.

There are some nasty little problems which make it difficult.

First, what do you do with hardlinks? Suppose a/a, a/b and a/c each
contain a hardlink to the same 100 MiB file. Directory a/ then occupies
only 100 MiB on disk, but a naive script will report 300 MiB.
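
One common way around the hardlink problem is to remember which
(device, inode) pairs have already been counted. The following is a
minimal sketch in present-day Python 3, not the code from my script;
the function name total_size is made up for illustration.

```python
import os

def total_size(root):
    """Sum file sizes under root, counting each inode only once."""
    seen = set()   # (device, inode) pairs that were already counted
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue   # another hardlink to a file we already counted
            seen.add(key)
            total += st.st_size
    return total
```

With the a/a, a/b, a/c layout above, this counts the shared file once.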

Most of the time, you'll want to stay in one filesystem, so the script
must not descend into mount points below the starting directory.
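
A sketch of how that could look with os.walk, assuming that comparing
st_dev against the starting directory is good enough to detect a
different filesystem. The name walk_one_filesystem is hypothetical.

```python
import os

def walk_one_filesystem(root):
    """Like os.walk(root), but skip directories on other devices."""
    root_dev = os.lstat(root).st_dev
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune the list in place: os.walk only descends into the
        # names that are still in dirnames when it resumes.
        dirnames[:] = [
            d for d in dirnames
            if os.lstat(os.path.join(dirpath, d)).st_dev == root_dev
        ]
        yield dirpath, dirnames, filenames
```

The in-place slice assignment matters; rebinding dirnames to a new list
would leave os.walk's own list untouched.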

You don't want to get stuck in recursive symlinks. If a/b is a symlink
to a/, you quickly get into an infinite loop.
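
os.walk does not follow symlinks unless asked to, but a hand-written
recursive function that only tests os.path.isdir() will. A minimal
sketch of such a function that checks for links first; file_bytes is
an illustrative name and it ignores the other problems in this list.

```python
import os

def file_bytes(path):
    """Recursively sum file sizes without ever following a symlink."""
    if os.path.islink(path):
        return 0   # do not follow: a/b -> a/ would recurse forever
    if os.path.isdir(path):
        return sum(file_bytes(os.path.join(path, name))
                   for name in os.listdir(path))
    return os.path.getsize(path)
```

The islink() test has to come before isdir(), because isdir() reports
on the link's target.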

Directories have a size too.
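
To include them, the directory entry itself can be passed to lstat
along with the files. A rough sketch, with the made-up name
size_including_dirs; what st_size means for a directory depends on the
filesystem.

```python
import os

def size_including_dirs(root):
    """Sum st_size of every visited directory and every file in it."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        total += os.lstat(dirpath).st_size   # the directory itself
        for name in filenames:
            total += os.lstat(os.path.join(dirpath, name)).st_size
    return total
```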

What do we do with files we can't read?
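
One possible answer is to keep going and report the failures at the
end. By default os.walk silently skips directories it cannot list; its
onerror argument receives the OSError instead. A sketch under that
assumption, with the hypothetical name size_with_errors:

```python
import os

def size_with_errors(root):
    """Return (total_bytes, errors) instead of stopping on failure."""
    errors = []
    total = 0
    # onerror is called with the OSError when a directory can't be listed.
    for dirpath, dirnames, filenames in os.walk(root, onerror=errors.append):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError as exc:
                errors.append(exc)
    return total, errors
```

The caller can then decide whether to print the errors, as du does, or
to ignore them.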

In /proc, even stranger subtleties exist which I don't understand -
ENOENT although listed by listdir() and that sort of thing.
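
One plausible cause, though I have not verified it, is that an entry
disappears between the listdir() call and the later stat call, for
example when a process exits. Whatever the cause, a script can treat
ENOENT as "gone, skip it" and still let other errors through. A small
sketch; lstat_size_or_none is an invented helper name.

```python
import errno
import os

def lstat_size_or_none(path):
    """Return the size of path, or None if it no longer exists."""
    try:
        return os.lstat(path).st_size
    except OSError as exc:
        if exc.errno == errno.ENOENT:
            return None   # was listed, but is gone by now
        raise             # anything else is a real problem
```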

Together with more options, human-readable file sizes and documentation,
it took me ~200 LOC; the script is at
http://topjaklont.student.utwente.nl/creaties/dkus.py

Note that du doesn't solve these problems either.

yours,
Gerrit.

-- 
Weather in Twenthe, Netherlands 28/09 08:55:
	15.0°C mist overcast wind 4.0 m/s SW (57 m above NAP)
-- 
In the councils of government, we must guard against the acquisition of
unwarranted influence, whether sought or unsought, by the
military-industrial complex. The potential for the disastrous rise of
misplaced power exists and will persist.
    -Dwight David Eisenhower, January 17, 1961


