Max files in unix folder from PIL process

Ivan Van Laningham ivanlan at pauahtun.org
Tue Mar 29 08:55:40 EST 2005


Hi All--

Rowdy wrote:
> 
> FreeDB (CD database) stores one file per CD in one directory per
> category.  The "misc" category/directory on my FreeBSD 5.3 system
> currently contains around 481,571 small files.  The "rock"
> directory/category contains 449,208 files.
> 
> As some have said, ls is *very* slow on these directories, but otherwise
> there don't seem to be any problems.
>  

I assume you're all using Linux.  The GNU version of ls does two things
that slow it down.  First, the System V and BSD versions were pretty
much identical, in that they processed the argv array in whatever order
the shell passed it in.  The GNU version re-orders the argv array and
stuffs all the arguments into a queue.  No big deal if you're just doing
ls, but for ls <multiple directory names> it can slow things down when
argv is long and/or the ls is recursive/deep.

The other thing it does differently from SysV/BSD ls is that it provides
for default options in an environment variable.  If those env settings
say to always use color, that will slow directory processing _way_
down, just as the -F option does.  That's because the color and -F
options _require_ a stat() on each and every file in the directory.
Standard ls with no options (or old SysV/BSD ls, which had no default
options) works nearly as fast as os.listdir() in Python, because it
doesn't require a stat().
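
If you want to see the difference for yourself, here is a rough sketch
in Python.  It is only an illustration, not how any ls is actually
implemented: names_only(), names_with_stat() and time_call() are names I
made up, and the 5000-file scratch directory is an arbitrary size.  One
function just reads the directory; the other also does an lstat() on
every entry, which is roughly the extra work that color or -F forces.

```python
import os
import tempfile
import time


def names_only(path):
    """Read the directory and return the entry names; no per-file stat."""
    return os.listdir(path)


def names_with_stat(path):
    """Read the directory, then lstat() every entry (as -F or color must)."""
    result = []
    for name in os.listdir(path):
        st = os.lstat(os.path.join(path, name))
        result.append((name, st.st_mode))
    return result


def time_call(func, path):
    """Return the wall-clock seconds taken by one call of func(path)."""
    start = time.perf_counter()
    func(path)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical scratch directory; raise the count to mimic a big one.
    with tempfile.TemporaryDirectory() as d:
        for i in range(5000):
            open(os.path.join(d, "f%05d" % i), "w").close()
        print("listdir only:   %.4f s" % time_call(names_only, d))
        print("listdir + stat: %.4f s" % time_call(names_with_stat, d))
```

The actual numbers will depend on your filesystem and on how much of the
directory is already cached, so run it a few times before drawing
conclusions.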

The only thing faster, from a shell user's viewpoint, is 'echo *'.  That
may not be much help;-)

Metta,
Ivan
----------------------------------------------
Ivan Van Laningham
God N Locomotive Works
http://www.andi-holmes.com/
http://www.foretec.com/python/workshops/1998-11/proceedings.html
Army Signal Corps:  Cu Chi, Class of '70
Author:  Teach Yourself Python in 24 Hours


