Bitshifts and "And" vs Floor-division and Modular

Paul Rubin no.email at nospam.invalid
Fri Sep 7 00:32:27 EDT 2012


rusi <rustompmody at gmail.com> writes:
> On an 8086/8088 a MUL (multiply) instruction was of the order of 100
> clocks ...  On most modern processors (after the pentium) the
> difference has mostly vanished.  I can't find a good data sheet to
> quote though

See http://www.agner.org/optimize/ :

    4. Instruction tables: Lists of instruction latencies, throughputs
    and micro-operation breakdowns for Intel, AMD and VIA CPUs

Multiplication is now fast, but DIV is still generally much slower.
There are ways to build fast parallel dividers, but I think nobody
bothers, both because of the chip area they take and because one can
often optimize division out of algorithms, replacing most of it with
multiplication.
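
As an illustration of replacing division with multiplication, here is a
minimal sketch of the usual trick for dividing by a constant: multiply
by a precomputed scaled reciprocal and shift.  The names div10, MAGIC
and SHIFT are made up for this example, and the constant is only valid
for unsigned 32-bit inputs.  It is written in Python just to show the
arithmetic; in CPython it will not be faster than plain n // 10.

```python
import random

# Scaled reciprocal of 10: ceil(2**35 / 10).  Assumed input range is
# 0 <= n < 2**32; outside that range the result is not guaranteed.
MAGIC = 0xCCCCCCCD
SHIFT = 35


def div10(n):
    """Return n // 10 for 0 <= n < 2**32 using only multiply and shift."""
    if not 0 <= n < 2**32:
        raise ValueError("n must fit in an unsigned 32-bit integer")
    return (n * MAGIC) >> SHIFT


# Spot-check against real floor division on edge cases and random inputs.
samples = [0, 9, 10, 99, 2**32 - 1]
samples += [random.randrange(2**32) for _ in range(1000)]
for n in samples:
    assert div10(n) == n // 10
```

Compilers commonly emit something of this shape in machine code when
the divisor is known at compile time, so the slow DIV instruction never
runs.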

Worrying about this sort of micro-optimization in CPython is almost
always misplaced, since the interpreter overhead generally swamps any
slowness of the machine arithmetic.
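
If you want to check this on your own machine, something like the
following rough sketch with the standard timeit module will do.  The
helper names (time_expr, compare, PAIRS) and the test value of x are
arbitrary choices for this example, and the timings will vary by
machine and Python version, so none are shown here.

```python
import timeit

# Each pair computes the same value for Python ints (including negative
# ones, since >> and // both round toward negative infinity).
PAIRS = [("x >> 3", "x // 8"), ("x & 7", "x % 8")]


def time_expr(expr, number=200_000):
    """Return the best-of-3 time in seconds for evaluating expr `number` times."""
    # x is set up as a variable so the expression is not constant-folded.
    return min(timeit.repeat(expr, setup="x = 123456789",
                             number=number, repeat=3))


def compare(number=200_000):
    """Time every expression in PAIRS and return {expression: seconds}."""
    results = {}
    for bit_op, arith_op in PAIRS:
        results[bit_op] = time_expr(bit_op, number)
        results[arith_op] = time_expr(arith_op, number)
    return results


if __name__ == "__main__":
    for expr, secs in compare().items():
        print(f"{expr:8s} {secs:.4f} s")
```

Any difference between the two forms measured this way includes the
cost of bytecode dispatch and integer object handling, not just the
underlying machine instruction.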


