How does Ruby compare to Python?? How good is DESIGN of Rubycompared to Python?

Thu Feb 26 10:24:55 EST 2004

On Thu, 26 Feb 2004 02:48:30 GMT, Joe Mason <joe at notcharles.ca> wrote:

>In article <t5fq30l29nidjh1tsksm8jlld03ff3av5o at 4ax.com>, Stephen Horne wrote:
>> I don't know how widespread it is among very high level languages, but
>> there is a reason (in the not-that-high-level sense) that C++, and
>> other 3GLs with classes hacked on don't. I'm assuming object Pascals
>> and Basics (esp Delphi and VB) count here, though don't sue me if I'm
>> wrong - I'm not familiar with object support in either.
>
>I didn't think Pascals and Basics supported function pointers at all,
>but I haven't used them since high school, so maybe I just didn't
>encounter them at the time.

I'd be surprised if Delphi and VB don't have function pointers.
Without them, it would be impossible to use Windows API calls that
require callback functions for instance. Whether they are in the
Pascal and Basic standards, though, is an entirely different thing.

>> Anyway, from this perspective, the bound method is an interesting
>> in-between concept. I've sometimes wondered if full currying support
>> may have been a better choice, but then a bound method is much more
>> efficient simply because it's simpler - not to mention avoiding some
>> syntactic difficulties I ignored above.
>
>I've always found the performance differences between functional and imperative
>languages fascinating (well, ever since I found out about it) - on the
>one hand, pure functional languages can prove facts about the code
>mathematically, so in theory the compiler can optimize much more away.
>But on the other hand, supporting all the extra function state they need
>is very costly.

Hmmm - I think you may be confusing the functional vs imperative
distinction with something else. Supporting a functional style does
not, in itself, require any special baggage.

Python carries some extra baggage in each object. However, that
baggage has nothing to do with functional support - it is needed for
dynamic typing, run-time binding, reference counting etc. In the case
of dynamic typing and run-time binding, for instance, Python objects
use a system similar to that used in C++ for run-time binding (ie
virtual methods) and run-time type identification - each object has a
pointer to a block of data describing its type and how to use it (a
virtual table in C++, the type object in Python).

Haskell is a good pure-functional language to use as a comparison.
Haskell has dynamic typing, but you can restrict types in a static way
when you choose. Haskell therefore only includes dynamic typing
'baggage' when it is necessary.

In Python, we pay a bit in runtime efficiency for scripting-language
flexibility - for dynamic typing, garbage collection, run-time symbol
tables etc - and we get pretty good value, or else we'd be using some
other language. But I can't really think of a functional-style feature
of Python that results in extra baggage for Python objects.

Perhaps first class functions? No - the necessary 'overhead' in
comparison with, say, C function pointers is already present in order
to support dynamic typing.

Imperative languages can have run-time efficiency advantages, of
course. One is that a human programmer usually knows whether in-place
modification or replacement of data is appropriate, and tells an
imperitive language which to use. In a functional language, the
compiler has to work this out for itself as an optimisation, and it
can't always be sure.

For instance, it can be quite hard for programmers with imperitive
blinkers on to understand how Haskell can handle sorting. Many have
claimed that it can't - and immediately been proved wrong by someone
presenting the code that does it. But to imperitive-biassed eyes, that
code may look a lot like a joke.

To illustrate, a literal Python translation of a Haskell quicksort
would be something like...

def quicksort (p) :
  if len(p_Data) > 1 :
    l_Pivot = p [0]

    return   quicksort ([i for i in p [1:] if i <  l_Pivot])
           + [l_Pivot]
           + quicksort ([i for i in p [1:] if i >= l_Pivot])

  else :
    return p_Data

In Python, of course, this would be bad code - it creates huge numbers
of small sorted sequences only to discard them almost instantly. The
average performance is still the same old O(n long n) and the worst
case still O(n^2), but the problem is the scaling constants involved
in all the unnecessary work.

The key difference between Python and Haskell here is that a good
Haskell compiler should be able to optimise out all the throwaway
intermediate lists, and should even be able to determine for itself
whether an in-place modification of the original list is valid.
Python, of course, does not - it is the programmers responsibility to
handle sorting in a more efficient way.

My guess would be that Haskells method might fall down when, for
instance, the quicksort function is held in a separately compiled
library. In that case, the compilers analysis of the quicksort cannot
determine whether the initial unsorted list is used again after the
sort, so it can't know whether it is safe to overwrite the unsorted
list using an in-place method or not. Though even then, the library
could hold variants for both cases. I don't know enough about Haskell
to say, really.

One possibility where both Python and Haskell might have related
run-time issues is in cache friendliness. In Pythons case, this
results from the fact that all variables are bound using references,
all values are passed using references, etc. If you access a C
array-of-simple-values/structs sequentially, the way that RAM caching
and virtual memory operate on modern machines tends to mean that most
accesses are to fast cache memory because the data has been brought in
already along with earlier items. In Python, that doesn't tend to work
out because the actual values within the list are not normally
adjacent in memory - only the references are adjacent. This happens in
Haskell for much the same reason, though a Haskell compiler should
have a much better chance of optimising.

But even with cache friendliness, this isn't 'baggage from functional
features' in any sense. In fact, it often happens in C++ too - if you
are not certain of the concrete type of an object (you only know that
it is a subtype of a particular base class) you have to pass and store
a pointer rather than the object itself because you can't know at
compile time how much space the object takes up in memory, so you
can't reserve the space for it statically. So if you need (perhaps in
a drawing program) an array of shape objects, where each shape may be
an arbitrary subtype of the main shape base class, you actually have
to implement that as an array of pointers to shape objects. And the
shape objects are not likely to be at adjacent locations in memory, so
the drawing loop will make a much higher proportion of accesses to
relatively slow memory.

BTW - the reason this came to mind is because I'm currently working on
a Python extension module which supports a dynamically constructed
C-like struct (which can hold both Python objects and simple C-like
values - various sizes of integers and floats, plus fixed-length
null-padded string buffers using either 8- or 16-bit characters) and
can use them as both keys and values in a range of containers. For
records with no Python objects at all, I may add some
fixed-record-size file access stuff - but I have concerns about
portability and things ending with 'endian' etc. I've covered some of
the ground using the Boost/Python libraries before, but am now
discovering that the Python/C APIs aren't so very hard after all ;-)

Anyway, the real costs of using functional languages don't occur at
run-time. They occur at compile-time, as deep optimisations are very
far from trivial. I can't imagine turning off optimisations in a C++
compiler just to improve build speed, but in the Haskell world there
are compilers considered appropriate for development and prototyping
(minimal optimisations) and compilers considered appropriate for the
final release. I don't have the Haskell experience to know, but when
programmers make the effort to use a completely different compiler
just to reduce build times, I figure the difference must be pretty
substantial for large programs.

>Of course, hybrid languages like Python and Ruby have the worst of both
>worlds - side effects everywhere AND extra function baggage to pass
>around.

Not true. You do get quite high run-time costs in Python (and Perl,
and ...), but you get a lot in return - very good value IMO. And the
run-time costs in Python don't derive from functional features.

-- 
Steve Horne

steve at ninereeds dot fsnet dot co dot uk