Compiling

"Martin v. Löwis" martin at v.loewis.de
Sun Feb 5 13:51:39 EST 2006


Ravi Teja wrote:
> This is a standard response to a rather frequent question here. But I
> am not sure I ever understood. Scheme / Lisp are about as dynamic as
> Python. Yet they have quite efficient native compilers. Ex: Bigloo
> Scheme.

You might be missing two details here:
1. those compilers are less efficient than you might think
2. Scheme/Lisp are indeed less dynamic than Python

Consider the following Scheme example:

(module a)

(define (add a b)
  (+ a b))

(print (add 3.0 4))

Bigloo generates the following code (which I have cleaned up
quite a bit):

/* toplevel-init */
obj_t BGl_toplevelzd2initzd2zzaz00()
{
  AN_OBJECT;
  obj_t BgL_arg1078z00_10;
  BgL_arg1078z00_10 = BGl_addz00zzaz00(((double)3.0), ((long)4));
  obj_t BgL_list1080z00_11;
  BgL_list1080z00_11 = MAKE_PAIR(BgL_arg1078z00_10, BNIL);
  return BGl_printz00zz__r4_output_6_10_3z00(BgL_list1080z00_11);
}

/* add */
obj_t BGl_addz00zzaz00(double BgL_az00_1, long BgL_bz00_2)
{
  AN_OBJECT;
  obj_t BgL_list1084z00_12;
  obj_t BgL_arg1087z00_13;
  BgL_arg1087z00_13 = MAKE_PAIR(BINT(BgL_bz00_2), BNIL);
  BgL_list1084z00_12 = MAKE_PAIR(DOUBLE_TO_REAL(BgL_az00_1),
				 BgL_arg1087z00_13);
  return BGl_zb2zb2zz__r4_numbers_6_5z00(BgL_list1084z00_12);
}


You can see several things from that:
1. The compiler was not able to/did not chose to implement
   the add operation using native C. Instead, it allocates
   two cons cells, and one object for the double; it then
   calls a generic implementation of +.
2. The compiler *did* infer that the procedure add is
   always called with (double long). This is because I
   didn't export it. If I exported the function, it
   would become

obj_t BGl_zc3anonymousza31077ze3z83zzaz00(obj_t BgL_envz00_16, obj_t
BgL_az00_17, obj_t BgL_bz00_18)
{
  AN_OBJECT;
  obj_t BgL_az00_8;obj_t BgL_bz00_9;
  BgL_az00_8 = BgL_az00_17;
  BgL_bz00_9 = BgL_bz00_18;
  obj_t BgL_list1079z00_11;
  obj_t BgL_arg1081z00_12;
  BgL_arg1081z00_12 = MAKE_PAIR(BgL_bz00_9, BNIL);
  BgL_list1079z00_11 = MAKE_PAIR(BgL_az00_8, BgL_arg1081z00_12);
  return BGl_zb2zb2zz__r4_numbers_6_5z00(BgL_list1079z00_11);
}

  In this case, all parameters are of type obj_t (which is a pointer
  type).

Furthermore, looking at the actual implementation of 2+, it is
defined as

(define (2+ x y)
   (2op + x y))

and then 2op is defined as

(define-macro (2op op x y)
   (let ((opfx (symbol-append op 'fx))
	 (opfl (symbol-append op 'fl))
	 (opelong (symbol-append op 'elong))
	 (opllong (symbol-append op 'llong)))
      `(cond
	  ((fixnum? ,x)
	   (cond
	      ((fixnum? ,y)
	       (,opfx ,x ,y))
	      ((flonum? ,y)
	       (,opfl (fixnum->flonum ,x) ,y))
	      ((elong? ,y)
	       (,opelong (fixnum->elong ,x) ,y))
	      ((llong? y)
	       (,opllong (fixnum->llong ,x) ,y))
	      (else
	       (error ,op "not a number" ,y))))
        ; more cases enumerating all possible
        ; combinations of fixnum, flonum, elong, and llong

Now, compare this with Python:

- In the general case of the add definition, the code Bigloo
  generates is roughly equivalent to the sequence of function calls
  the Python interpreter performs. For Python,

def add(a,b):
   return a+b

  translates into

  2           0 LOAD_FAST                0 (a)
              3 LOAD_FAST                1 (b)
              6 BINARY_ADD
              7 RETURN_VALUE
              8 LOAD_CONST               0 (None)
             11 RETURN_VALUE

  The entire work is done in BINARY_ADD: It allocates a tuple with
  the two arguments, then dispatches to the actual __add__
  implementation. Compared to the code Bigloo generates, this might
  be more efficient (a single memory allocation instead of two).

- the approach of collecting all implementations of + in a single
  place of Bigloo cannot be transfered to Python. Python's
  implementation of + is dynamically extensible.

- the approach of directly calling add() in the toplevel init
  cannot be applied to Python, either. The meaning of the name
  "add" can change between the time of definition and the actual
  call. Therefore, the simple function call must go through a
  table lookup.

So while it would be possible to apply the same strategy to
Python, it likely wouldn't gain any performance increase over
the interpreter.

Regards,
Martin



More information about the Python-list mailing list