os.stat bug?

Dan Stromberg drsalists at gmail.com
Tue Mar 22 19:37:47 EDT 2011


On Mon, Mar 21, 2011 at 1:32 AM, Laszlo Nagy <gandalf at shopzeus.com> wrote:

>
>  Hi All,
>
> I have a Python program that goes up to 100% CPU. Just like this (top):
>
>  PID USERNAME       THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
> 80212 user1           2  44    0 70520K 16212K select  1   0:30 100.00% /usr/local/bin/python process_updates_ss_od.py -l 10
>
> I have added extra logs and it turns out that there are two threads. One
> thread is calling "time.sleep()" and the other is calling "os.stat".
> (Actually it is calling os.path.isfile, but I hunted down the last link in
> the chain.) The most interesting thing is that the process is in "SELECT"
> state. As far as I know, CPU load should be 0% because "select" state should
> block program execution until the I/O completes.
>
> I must also tell you that the os.stat call is taking long because this
> system has about 7 million files on a slow disk. It would be normal for an
> os.stat call to return after 10 seconds. I have no problem with that. But I
> think that the 100% CPU is not acceptable. I guess that the code is running
> in kernel mode. I think this because I can send a KILL signal to it and the
> state changes to the following:
>
>
>  PID USERNAME       THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
> 80212 user1           2  44    0 70520K 15256K STOP    5   1:27 100.00% /usr/local/bin/python process_updates_ss_od.py -l 10
>
> So the state of the process changes to "STOP", but the program does not
> stop until the os.stat call returns (sometimes after 30 seconds).
>
> Could it be a problem with the operating system? Is it possible that an
> os.stat call requires 100% CPU power from the OS? Or is it a problem with
> the Python implementation?
>
> (Unfortunately I cannot give you an example program. Giving an example
> would require giving you a slow I/O device with millions of files on it.)
>
> OS version: FreeBSD 8.1-STABLE amd64
> Python version: 2.6.6
>
> Thanks,
>
>   Laszlo


1) Run it under the "time" command to break down the CPU use, as in the
example below (from Ubuntu, but your results should look similar).
"user" is time spent in userspace: the Python interpreter running your
code, including the Python standard library and the C library.  "sys" is
time spent in the kernel on behalf of your process.  "real" is wall-clock
time.  If the kernel time is high, look into using a database, or a
filesystem (or filesystem tuning) that handles large directories well.  If
the userspace time is high, scrutinize the code in more detail:

$ time python -c 'for i in xrange(100000): pass'
cmd started 2011 Tue Mar 22 04:26:20 PM

real    0m0.132s
user    0m0.012s
sys     0m0.004s
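
If wrapping the whole program in "time" is awkward, roughly the same
breakdown can be taken from inside Python with os.times().  This is only a
sketch: time_stat_calls is a name I made up, the path list is a stand-in for
your real files, and os.times() counts CPU for the whole process, so the
sleeping thread is included too.

```python
import os
import time


def time_stat_calls(paths):
    """Return (real, user, sys) seconds spent calling os.stat on each path."""
    wall_start = time.time()
    cpu_start = os.times()  # index 0 is user CPU time, index 1 is system CPU time
    for path in paths:
        try:
            os.stat(path)
        except OSError:
            pass  # a missing file still costs a syscall, which is what we measure
    cpu_end = os.times()
    wall_end = time.time()
    return (wall_end - wall_start,
            cpu_end[0] - cpu_start[0],
            cpu_end[1] - cpu_start[1])


if __name__ == "__main__":
    # Stand-in workload: stat the current directory many times.
    real, user, system = time_stat_calls([os.curdir] * 100000)
    print("real %.3fs  user %.3fs  sys %.3fs" % (real, user, system))
```

A large "sys" figure relative to "user" would point at the kernel and
filesystem rather than at the Python code.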

2) Does FreeBSD's top command have an option to report on the threads of a
process individually?  Some versions of top do, but I'm not confident all of
them do.

3) Does the code run on PyPy?  If it does, it might be a lot faster.  The
difference can be pretty dramatic sometimes.

4) Profile it with something like profile or cProfile.  This should tell you
which part of the code is consuming the most time, and the results are
sometimes surprising.
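
A minimal sketch of the cProfile approach follows, written for Python 3.
The scan function is a hypothetical stand-in for your real workload, and
profile_call is just a helper name I chose.  Keep in mind that cProfile only
records the thread it is enabled in, so it would have to wrap the function
that the busy thread actually runs.

```python
import cProfile
import io
import os
import pstats


def scan(paths):
    """Hypothetical stand-in workload: count the paths that are regular files."""
    return sum(1 for p in paths if os.path.isfile(p))


def profile_call(func, *args):
    """Run func(*args) under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    # Sort by cumulative time and show the ten most expensive entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buf.getvalue()


if __name__ == "__main__":
    count, report = profile_call(scan, [os.curdir] * 10000)
    print(report)
```

If os.stat dominates the cumulative column, the time is going into the
filesystem calls rather than into your own logic.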

5) If the process is staying in select state, then it's probably making
heavy use of the select syscall - conceivably more use of select than stat.
If there's a select in the code with a small timeout inside a loop, you
might check that over.
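
To illustrate what I mean in 5), here is a small sketch of how a select loop
with a tiny timeout turns into a busy loop.  It is not taken from your
program: count_select_wakeups is a made-up helper, and the socket pair is
only there so that select has something to wait on that never becomes
readable.

```python
import select
import socket
import time


def count_select_wakeups(timeout, duration=0.2):
    """Count how often a select() loop wakes up within `duration` seconds."""
    # Nothing is ever written to the pair, so every select() call times out.
    reader, writer = socket.socketpair()
    try:
        wakeups = 0
        deadline = time.time() + duration
        while time.time() < deadline:
            select.select([reader], [], [], timeout)
            wakeups += 1
        return wakeups
    finally:
        reader.close()
        writer.close()


if __name__ == "__main__":
    print("timeout 0.0:  %d wakeups" % count_select_wakeups(0.0))
    print("timeout 0.05: %d wakeups" % count_select_wakeups(0.05))
```

With a zero timeout the loop never blocks, so it spins and burns CPU even
though top may still catch the process in select state.  A longer timeout
lets the process actually sleep between wakeups.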