Adding Priority Scheduling feature to the subprocess

TimeHorse TimeHorse at gmail.com
Mon Feb 25 07:50:06 EST 2008


On Feb 22, 4:30 am, Nick Craig-Wood <n... at craig-wood.com> wrote:
> Interestingly enough this was changed in recent linux kernels.
> Process levels in linux kernels are logarithmic now, whereas before
> they weren't (but I wouldn't like to say exactly what!).

Wow!  That's a VERY good point.  I ran a similar test on Windows with
the 'start' command, which is similar to nice except that you specify
the Priority Class by name, e.g.

start /REALTIME python.exe bench1.py
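
For what it's worth, here's a rough, untested sketch of how the same
launch could be done from Python itself rather than through 'start',
using Popen's Windows-only creationflags argument; the hex values are
the standard Win32 priority-class flags, defined here by hand rather
than taken from subprocess:

import subprocess

# Standard Win32 priority-class flags (not defined by subprocess itself).
REALTIME_PRIORITY_CLASS     = 0x00000100
HIGH_PRIORITY_CLASS         = 0x00000080
ABOVE_NORMAL_PRIORITY_CLASS = 0x00008000
NORMAL_PRIORITY_CLASS       = 0x00000020
BELOW_NORMAL_PRIORITY_CLASS = 0x00004000
IDLE_PRIORITY_CLASS         = 0x00000040

# Roughly equivalent to: start /REALTIME python.exe bench1.py
proc = subprocess.Popen(["python.exe", "bench1.py"],
                        creationflags=REALTIME_PRIORITY_CLASS)
proc.wait()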

Now, in a standard operating system you'd expect some variance between
runs, and I did find that.  So I wrote a script to compute the Mode
(but not the Standard Deviation, as I didn't have time for it) for each
Priority Class, choosing a class at random on each run and accumulating
the running value for each one.  Now, when I read the results, I really
wished I'd computed the Chi**2 to get the Standard Deviation, because
the results all appeared very close to one another, as if the Priority
Class had very little overall effect.  In fact, I would be willing to
guess that, say, NORMAL and ABOVENORMAL lie within one Standard
Deviation of each other!
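
Just to illustrate the bookkeeping I had in mind (this is not the
script I actually ran, and bench1.py is a stand-in name), something
along these lines collects per-class timings and reports a sample
Standard Deviation alongside the mean:

import math, random, subprocess, time

CLASSES = ["/LOW", "/BELOWNORMAL", "/NORMAL",
           "/ABOVENORMAL", "/HIGH", "/REALTIME"]
timings = dict((c, []) for c in CLASSES)

for run in range(60):
    cls = random.choice(CLASSES)            # pick a Priority Class at random
    t0 = time.time()
    # /WAIT keeps 'start' from returning before the benchmark finishes.
    subprocess.call("start /WAIT %s python.exe bench1.py" % cls, shell=True)
    timings[cls].append(time.time() - t0)

for cls, ts in sorted(timings.items()):
    if len(ts) < 2:
        continue                            # too few samples for a deviation
    mean = sum(ts) / len(ts)
    sd = math.sqrt(sum((t - mean) ** 2 for t in ts) / (len(ts) - 1))
    print("%-13s mean=%.2fs  sd=%.2fs  (n=%d)" % (cls, mean, sd, len(ts)))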

That having been said, the tests all ran in about 10 seconds, so it may
be that the process was too simple to show any statistically meaningful
difference.  I know, for instance, that running ffmpeg as NORMAL or
REALTIME makes a sizable difference.

So, I concede the "Unified Priority" may indeed be dead in the water,
but I am thinking of giving it one last go with the following
suggestion:

0.0 == Zero-Page (Windows, e.g. 0) / +20 (Unix)
1.0 == Normal (Foreground) Priority (Windows, e.g. 9) / 0 (Unix)
MAX_PRIORITY == Realtime / Time Critical (Windows, e.g. 31) / -20 (Unix)

With the value of MAX_PRIORITY TBD.  Now, 0.0 would still represent
(relatively) 0% CPU usage, but 1.0 would now represent 100% of
'Normal' priority.  I would still map 0.0 - 1.0 linearly onto the
scale corresponding to the given operating system (0 - 9 on Windows;
+20 - 0 on Unix), but higher priorities would correspond to values > 1.0.
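
Concretely, the below-normal half of that mapping might look something
like this (just a sketch; below_normal is a made-up helper name, not a
proposed API):

import os

def below_normal(p):
    """Map 0.0 <= p <= 1.0 onto the native below-normal range."""
    if os.name == 'nt':
        return int(p * 9 + .5)          # Windows: 0 (Zero-Page) .. 9 (Normal)
    return 20 - int(p * 20 + .5)        # Unix: nice +20 (lowest) .. 0 (normal)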

The idea here is that most users will only want to lower priority, not
raise it, so this makes lowering pretty intuitive.  As for the linear
mapping, I would leave a note in the documentation that although the
scale is "linear", the operating system may not actually behave
linearly, and that the user should consult the documentation for their
OS to determine the specific behavior.  This is similar to the
documentation of the file time-stamps in os.stat, whose granularity
differs based on OS.  Most users, I should think, would just want to
make their spawn "slower" and use the scale to determine "how much" in
a relative fashion, rather than expecting hard-and-fast numbers for how
much the process is actually slowed down.

Higher than Normal priorities may, OTOH, be a bit harder to deal with.
It strikes me that maybe the best approach is to make MAX_PRIORITY
operating-system dependent, specifically 31 - 9 + 1.0 = +23.0 for
Windows and 0 - (-20) + 1.0 = +21.0 for Unix.  This way, again, the
priorities map linearly, and in this case 1:1.  I think most users
would either choose a "High Priority" relative to MAX_PRIORITY or just
choose a small increment above 1.0 to add a little boost.

Of course, the 2 biggest problems with this approach are, IMHO: a) the
< Normal scale is a percentage while the > Normal scale is additive.
However, there is no "simple" definition of MAX_PRIORITY, so I think
using the OS's own definition is natural.  b) This use of the priority
scale may be confusing to Unix users, since 1.0 now represents
"Normal" and +21.0, not -20, represents Max Priority.  However, the
particular value of MAX_PRIORITY would be irrelevant to the definitions
of setPriority and getPriority, since each would, in my proposal,
compute for p > 1.0:

Windows: 9 + int((p - 1) / (MAX_PRIORITY - 1) * 22 + .5)
Unix: -int((p - 1) / (MAX_PRIORITY - 1) * 20 + .5)
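
In code, those two formulas (again just a sketch, with above_normal and
MAX_PRIORITY defined right here rather than taken from any existing
module) would come out to roughly:

import os

if os.name == 'nt':
    MAX_PRIORITY = 31 - 9 + 1.0         # +23.0 on Windows
else:
    MAX_PRIORITY = 0 - (-20) + 1.0      # +21.0 on Unix

def above_normal(p):
    """Map 1.0 < p <= MAX_PRIORITY onto the native above-normal range."""
    if os.name == 'nt':
        return 9 + int((p - 1) / (MAX_PRIORITY - 1) * 22 + .5)   # 9 .. 31
    return -int((p - 1) / (MAX_PRIORITY - 1) * 20 + .5)          # 0 .. -20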

Anyway, that's how I'd propose to do the nitty-gritty.  But, more than
anything, I think the subprocess 'priority' methods should use a
priority scheme that is easy to explain.  To that end, I propose:

1.0 represents normal priority, 100%.  Any priority less than 1.0
represents a below-normal priority, down to 0.0, the lowest possible
priority or 0%.  Any priority above 1.0 represents an above-normal
priority, with MAX_PRIORITY being the highest priority level available
on a given OS.
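
From the user's point of view it would boil down to something like this
(purely hypothetical; setPriority doesn't exist anywhere today, and
subprocess.MAX_PRIORITY stands for the constant proposed above):

import subprocess

p = subprocess.Popen(["ffmpeg", "-i", "in.avi", "out.mpg"])
p.setPriority(0.5)                      # half of Normal
p.setPriority(1.0)                      # back to Normal, 100%
p.setPriority(subprocess.MAX_PRIORITY)  # the highest this OS offers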

Granted, that's not much simpler than "0 is normal, -20 is highest and
+20 is lowest", except insofar as it avoids the non-intuitive convention
of a lower priority number representing a higher priority.  Certainly,
we could conform all systems to the +20.0 to -20.0 floating-point scale,
but I prefer not to bias the methods toward one OS, and I honestly feel
a percentage is more intuitive.

So, how does that sound to people?  Is that more palatable?

Thanks again for all the input!

Jeffrey.
