OFF-TOPIC: Why Lisp is not my favorite programming language

Stephen Horne steve at ninereeds.fsnet.co.uk
Tue Apr 6 22:52:14 EDT 2004


On 6 Apr 2004 13:58:45 -0700, nish20 at netzero.net (Mike Nishizawa)
wrote:

>mintiSPAMBLOCK at yahoo.com (Minti) wrote in message news:<e87fc4b0.0403221114.254b7ee5 at posting.google.com>...

>> However, when someone says that the code might be 31.?
>> times slower compared to C, that kind of scares me. Could you
>> assert such figures? I know speed is not the one and only goal, but
>> I'm just wondering if these figures are correct.

It wouldn't surprise me. LISP is normally interpreted - maybe always,
though I'm not certain of that. The interpreter would be written in C.
And, depending on the implementation, the built-in list operations may
well be much more basic than those in Python, for instance, meaning
that more of the relatively sophisticated operations will be handled
using interpreted library code.

Python itself is at least 10 times slower than C for many things
(though the library often provides efficient algorithms that are
actually faster than anything you'd write for yourself even in C).
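For example, a sort hand-written in interpreted Python is easily
beaten by the built-in sort, which runs as C code. A quick sketch
(the timing numbers will of course vary by machine):

```python
import random
import timeit

def insertion_sort(seq):
    """A hand-written O(n^2) sort, executed entirely as interpreted bytecode."""
    result = list(seq)
    for i in range(1, len(result)):
        key = result[i]
        j = i - 1
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = key
    return result

data = [random.randint(0, 10_000) for _ in range(2_000)]

# Both produce the same ordering...
assert insertion_sort(data) == sorted(data)

# ...but the built-in sort, implemented in C, is typically orders of
# magnitude faster on the same input.
hand = timeit.timeit(lambda: insertion_sort(data), number=3)
builtin = timeit.timeit(lambda: sorted(data), number=3)
print(f"hand-written: {hand:.3f}s  built-in: {builtin:.3f}s")
```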

An implementation that has more of the standard operations built in,
or which uses just-in-time compilation etc., may well run much faster.

>I would ask what the C program is doing.  If it beats LISP at its
>own game, which is list processing, then that would be different.

It is always possible to beat any interpreted language for speed using
C if you are willing to put in the time and effort. 90% or more of
interpreters were written in C themselves, and by writing directly in
C you are avoiding the need for any interpreting at run-time.

>  I don't
>think I would choose to write an enterprise application in it, but I
>think it's as good a choice as anything for any type of parsing
>applications.  And to the smart guy who feels like he can take a stab
>at me for calling lisp a parsing language... consider:  a parser is,
>by definition, something that analyzes or separates (input, for
>example) into more easily processed components.  Let's take a
>document, for instance.  If I wanted to parse the document and
>separate it into words, I would write a parser to do so. What is a
>document but a list of words separated by spaces... an especially good
>application for a LISt Processing language.

Parsing isn't that heavy in list processing. Neither is handling a
large document.

In parsing, you are mostly dealing with one symbol at a time. The way
you access that symbol - whether you read it directly from a file or
input stream as needed, or keep it in a memory buffer - is the least
significant thing going on in the parser, and the work done by most
practical parsers at run-time (rather than when the parser tables are
generated from the grammar) is not that complex. In most cases, then,
the language is pretty much immaterial. The table generation for an
LR-style parser (basically what a tool like yacc does), on the other
hand, needs to do a lot of set operations on potentially quite large
sets, plus some fairly time-consuming searches and closures. That's
something I wouldn't want to do in Lisp, purely for speed reasons. I
have done it in Python, and the result wasn't exactly practical -
though it probably could have been with more work and some C extension
optimisations.
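The "one symbol at a time" point can be sketched with a minimal
tokenizer in Python (the names here are illustrative, not from any
real parser library) - the per-character dispatch is where the work
is, regardless of what container holds the input:

```python
# A minimal state-machine tokenizer: it looks at one character at a
# time, so the per-symbol dispatch, not the data structure holding the
# input, dominates the run-time work.
def tokenize(text):
    tokens = []
    current = []            # characters of the token being built
    for ch in text:         # one symbol at a time
        if ch.isalnum():
            current.append(ch)
        else:
            if current:
                tokens.append("".join(current))
                current = []
            if not ch.isspace():
                tokens.append(ch)   # punctuation is its own token
    if current:
        tokens.append("".join(current))
    return tokens

print(tokenize("f(x, y2) + 1"))
# ['f', '(', 'x', ',', 'y2', ')', '+', '1']
```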

With large documents, while you can see the document as being a
(potentially huge) list of characters or words, you wouldn't want to
use a simple flat list data structure, again for performance reasons.
You wouldn't use a linked list because it would take too much time to
iterate through, not to mention the potentially very large storage
overhead - the two pointers of a doubly-linked list take 8 bytes, more
than the average word, and that's before counting malloc's
round-up-to-a-multiple-of-x-bytes overhead. You wouldn't use a simple
resizable array because of the time wasted on insertions and
deletions, shifting the tail end backward and forward in memory.
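That insertion cost is easy to demonstrate in Python, whose list type
is exactly such a resizable array (the sizes here are arbitrary, just
for illustration):

```python
import timeit

# Inserting at the front of a flat resizable array (Python's list)
# shifts the whole tail on every call, so it is O(n) per insertion;
# appending at the end is amortised O(1).
def build_by_front_insert(n):
    items = []
    for i in range(n):
        items.insert(0, i)   # shifts every existing element
    return items

def build_by_append(n):
    items = []
    for i in range(n):
        items.append(i)      # no shifting
    return items

n = 20_000
front = timeit.timeit(lambda: build_by_front_insert(n), number=1)
back = timeit.timeit(lambda: build_by_append(n), number=1)
print(f"front inserts: {front:.3f}s  appends: {back:.3f}s")
```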

I've never done this for a mutable document (I have for an immutable
document, but you can get away with a lot in that case) but I figure
that I'd create a kind of database table of 'chunks', along with a few
indexes into that table (using either tree or hash based methods). And
style and other non-text information would probably mostly be
separated out into other tables, with their own indexes. That is, I would
learn a trick or two from the relational database guys, who have been
handling large sets of data reliably and efficiently for a long time.
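A toy sketch of that chunk-table-plus-indexes idea in Python (the
Document class and its field names are hypothetical, purely for
illustration):

```python
# A toy version of the 'table of chunks plus indexes' idea: one table
# of chunks keyed by id, plus a hash-based word index into that table.
class Document:
    def __init__(self):
        self.chunks = {}      # chunk_id -> text  (the 'table')
        self.word_index = {}  # word -> set of chunk_ids (the index)
        self._next_id = 0

    def add_chunk(self, text):
        chunk_id = self._next_id
        self._next_id += 1
        self.chunks[chunk_id] = text
        for word in text.split():
            self.word_index.setdefault(word, set()).add(chunk_id)
        return chunk_id

    def find(self, word):
        """Return the text of every chunk containing `word`, in chunk order."""
        return [self.chunks[cid]
                for cid in sorted(self.word_index.get(word, ()))]

doc = Document()
doc.add_chunk("the quick brown fox")
doc.add_chunk("the lazy dog")
print(doc.find("the"))   # both chunks
print(doc.find("fox"))   # just the first
```

Style or other non-text information would go in further tables keyed
by the same chunk ids, which is the relational-database trick in
miniature.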

Actually, this in-memory relational database thing is quite a useful
technique. In particular, it helps to reinforce the idea of keeping
your main data in an easily accessible place, and not hidden deeply in
the private internals of various classes - so when you need to do
something you didn't anticipate before, the data you need to access is
easily available with maybe just an extra method or two on your
datastore object. Data hiding should only be done with care IMO.

Anyway, again this isn't a very big issue in terms of language
selection. Python might be a pretty good choice, though, simply
because there is both a convenient data-table type (list of tuples)
and a convenient index type (dictionary) built-in and ready to go.
Though I suspect a C extension might be needed to get good performance
for a few operations with very large documents on slow machines.
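That built-in combination - list of tuples as the data table,
dictionary as the index - looks something like this (the row layout is
made up for illustration):

```python
# List of tuples as the data table, dictionary as the hash index.
rows = [
    (0, "Introduction", 1),
    (1, "Methods", 5),
    (2, "Results", 12),
]   # (section_id, title, start_page) - illustrative fields

# Build a hash index from title to row position in the table.
by_title = {title: i for i, (sid, title, page) in enumerate(rows)}

def lookup(title):
    """O(1) average-case lookup via the dictionary index."""
    return rows[by_title[title]]

print(lookup("Methods"))   # (1, 'Methods', 5)
```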

C++ is also in the running with the STL containers - harder to use, a
little more flexible, and hard to say on speed. Python may actually be
faster, as hashing is normally faster than red-black trees for
handling indexes, but then hashes don't support efficient in-order
iteration, so it depends on how the data is accessed as much as
anything.
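A rough sketch of that trade-off in Python - the hash index gives fast
lookup but no ordering, so in-order iteration has to be paid for
separately (the sorted key list here is just a stand-in for the
ordering a tree-based container maintains for free):

```python
import bisect

# A dictionary gives O(1) average lookup but no sorted order; to walk
# the index in key order you either sort on demand (O(n log n) per
# walk) or maintain a separate sorted key list, as below.
index = {}
sorted_keys = []   # kept in order, standing in for a tree's ordering

def insert(key, value):
    if key not in index:
        bisect.insort(sorted_keys, key)   # O(n) insert, but ordered walks are cheap
    index[key] = value

for k, v in [("cherry", 3), ("apple", 1), ("banana", 2)]:
    insert(k, v)

print(index["banana"])                       # fast hash lookup
print([(k, index[k]) for k in sorted_keys])  # in-order iteration
```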


-- 
Steve Horne

steve at ninereeds dot fsnet dot co dot uk


