A vision for Parrot

Benjamin Goldberg goldbb2 at earthlink.net
Tue Nov 12 16:13:40 EST 2002


Donal K. Fellows wrote:
> 
> Benjamin Goldberg wrote:
> > Considering that Jcl and Jython exist, it seems like a reasonable
> > goal
> 
> (JCL is something else.  I'd rather not remember it thankyouverymuch.)

Erm... that's the old IBM Job Control Language?  You mean this one?

   http://www.tuxedo.org/~esr/jargon/html/entry/JCL.html

Bleh, forget I mentioned it. :)  'Twas a horrible typo; I meant Jacl. :)

> > would be to make an interpreter which turns Java's .class files into
> > Parrot .pasm files.  Once that tool exists, one could simply
> > translate Jcl and Jython into parrot... there would be no need to
> > re-implement them.
> >
> > And one day, in the distant future, there will be a Perl6
> > decompiler, which will turn Parrot bytecode into Perl6.  Then we'll
> > be able to convert the translated Jython and Jcl into Perl6 :)
> 
> $10 says that only ever happens with a performance hit.  The problem
> is that not all bytecodes are created equal.  (And Jacl is an
> implementation of Tcl in Java more in the way that the usual form of
> Tcl is an implementation of the language in C.

So Jacl still converts Tcl into, well, Tcl bytecodes, even though it's
doing so in Java?  Blech.

Hmm, is there a way of making tcl dump the tcl-bytecodes to a file?

If so, one could probably make an attempt to translate those bytecodes
into parrot.  (And ignore Jacl).

> The fact that Java uses bytecodes is pretty much just a distraction
> here.  We also have another way of integrating Tcl with Java that
> keeps Tcl implemented in C, but which integrates almost identically
> with the Java language.)
> 
> > > This sort of thing tends to make me suspicious that this is little
> > > more than a pipe-dream, well, at least as far as Tcl's concerned.
> > > (I don't know the other languages nearly well enough to comment
> > > in a useful way.)
> [...]
> > Assuming you thoroughly understand Tcl's bytecodes, why not take a
> > look at Parrot, and see whether the set of bytecodes that parrot
> > supports is sufficient to do everything that Tcl's bytecodes do?
> 
> I know a bit about Tcl bytecodes, and a key factor about them is that
> they are very tightly targeted towards implementing Tcl.
> 
> Hmm.  A quick scan through the documentation doesn't really raise my
> hopes.  Or even leave me with a deep enough understanding of what's
> going on; is there any deeper description than
> http://www.parrotcode.org/docs/parrot_assembly.pod.html
> about?  (OK, Tcl's bytecodes need documentation too, but I've already
> gone to the effort to understand those as part of my maintenance and
> development duties.  I've just not got enough hours in the day.) 
> Unfortunately, the bits that I'm most interested in seem to be the
> bits with least info (isn't that always the way with complex software
> systems?)
> 
> First impressions: what is meant by "string" anyway?  Character
> sequence?  Byte sequence?  UTF-8 sequence?  ISO 8859-1 sequence?  [FX:
> Reads docs]  Oh, they carry about what their encoding is with them? 
> That must make working with them fun.  How does it handle things like
> the blecherous monstrosities[*] used for system encodings in the Far
> East?

Having read http://www.parrotcode.org/docs/strings.pod.html only just
now myself, it's possible I could be wrong on this, but...

Each string's encoding can be one of native, utf8, utf16, utf32, or
foreign.  So those "blecherous monstrosities" will either be converted
to one of the utf formats, or else have their own string vtable.

For now, they will probably be converted... strings.pod.html says this
at the bottom:
   Foreign Encodings

   Fill this in later; if anyone wants to implement
   new encodings at this stage they must be mad.
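
To make the "strings carry their encoding with them" idea concrete, here
is a minimal sketch in Python.  It is not Parrot's API: the names
TaggedString, transcode and text are made up for illustration, and
Python's codec machinery stands in for whatever Parrot's string vtables
end up doing.  Shift-JIS is used only as one example of a Far East
system encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedString:
    """Hypothetical string value: raw bytes plus the name of their encoding."""

    data: bytes
    encoding: str

    def transcode(self, target: str) -> "TaggedString":
        # Decode with the encoding the string carries, re-encode in the target.
        return TaggedString(self.data.decode(self.encoding).encode(target), target)

    def text(self) -> str:
        # Abstract character sequence, independent of the byte layout.
        return self.data.decode(self.encoding)


# A legacy-encoded string gets converted to a UTF form on the way in.
legacy = TaggedString("日本語".encode("shift_jis"), "shift_jis")
converted = legacy.transcode("utf-8")
```

The bytes differ after conversion, but the characters are the same, so
code downstream only has to cope with the UTF forms.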

> On a quite separate point, is there a strncmp() equivalent?  That
> would make implementing Tcl much easier...

You mean, for testing the first n characters of two strings for
equality?  There isn't one that I know of, but one could always be added;
furthermore, it supposedly will be possible to make lightweight strings
which are substrings of other strings, without any copying involved. 
You could make your strncmp be a wrapper around making a substring of
the first n characters of each of your two strings, and comparing those
substrings.
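
Roughly what I have in mind, sketched in Python rather than Parrot
assembly.  The function name strncmp is borrowed from C, and Python's
memoryview plays the part of the no-copy substring; none of this is an
existing Parrot op.  Unlike C's strncmp it does not stop at a NUL byte;
a string that runs out first simply sorts earlier.

```python
def strncmp(a: bytes, b: bytes, n: int) -> int:
    """Compare at most the first n bytes of a and b.

    Returns a negative number, zero, or a positive number, C-style.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    # Slicing a memoryview creates a view on the same buffer, not a copy.
    head_a = memoryview(a)[:n]
    head_b = memoryview(b)[:n]
    for x, y in zip(head_a, head_b):
        if x != y:
            return -1 if x < y else 1
    # All shared positions matched; the shorter prefix (if any) sorts first.
    return (len(head_a) > len(head_b)) - (len(head_a) < len(head_b))
```

So strncmp(b"string first", b"string last", 6) compares only "string"
against "string" and reports them equal, without copying either one.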

> More generally, Tcl would need to use PMCs throughout.

Why?  (Not an objection, but I don't know much about Tcl's bytecode.)

> The problem is that Tcl's value semantics (copy-on-write) do not line
> up well with that which Parrot seems to use (object-based)

Parrot will do copy-on-write.

Furthermore, Parrot may implement some strings as ropes, so that the
amount that needs to be copied will be even smaller.
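
For anyone unsure what copy-on-write buys you here, a minimal sketch in
Python.  The class CowString and its methods are hypothetical names of
mine, not anything in Parrot or Tcl; it only shows the general
technique: copies share one buffer, and the real copy is put off until
somebody writes.  (Ropes are not shown.)

```python
class CowString:
    """Hypothetical copy-on-write string: copies share storage until written."""

    def __init__(self, text: str = "") -> None:
        self._buf = list(text)   # backing storage, possibly shared
        self._owners = [1]       # one-element list used as a shared owner count

    def copy(self) -> "CowString":
        clone = CowString.__new__(CowString)
        clone._buf = self._buf           # share the storage, copy nothing
        clone._owners = self._owners
        self._owners[0] += 1
        return clone

    def append(self, more: str) -> None:
        if self._owners[0] > 1:          # another value still sees this buffer
            self._owners[0] -= 1
            self._buf = list(self._buf)  # the deferred copy happens only now
            self._owners = [1]
        self._buf.extend(more)

    def shares_storage_with(self, other: "CowString") -> bool:
        return self._buf is other._buf

    def __str__(self) -> str:
        return "".join(self._buf)


a = CowString("hello")
b = a.copy()            # cheap: no characters are copied yet
shared_before = a.shares_storage_with(b)
b.append(" world")      # b gets a private buffer; a is left untouched
```

From the outside this looks like plain value semantics, which is what
Tcl wants, while the cost of copying is only paid on modification.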

> and which, IIRC from the discussions at the time when Parrot was being
> created, are closely based on those used in Perl even if not precisely
> coincident.

Perl is likely never going to implement strings as ropes.  It does now
have copy-on-write, though this is a recent development.

Perl5.6+ has two internal encodings for strings -- bytes and utf8. 
Parrot not only allows native, utf8, utf16, and utf32, but it also
allows any kind of user-defined encoding one might want.  I doubt that
perl5 will ever do this.

> Hence we'd be unable to use ground values except as part of the
> implementation of higher-level concepts.  That'll impact badly on
> performance.
> 
> It's at this point that I feel a round of "Stuff it. I'll stick to
> implementing in C." coming on.  I've been quietly watching Parrot for
> a while now, and I still don't think that implementing Tcl in it is
> really a winning proposition.
> I'd love someone to prove me wrong, but proof is building a Tcl
> interpreter in or on top of Parrot and running the Tcl test suite on
> it (and getting a decent proportion of the tests passing.)

Parrot does everything in two steps -- compile, then run.  Most likely,
it will have a compiler which converts Tcl bytecode to Parrot bytecode.

Whether or not Parrot will ever translate from Tcl source to Parrot
bytecode is another question entirely.

Thinking a bit more, particularly about how Tcl often needs to interpret
strings at runtime, I realize that no non-trivial Tcl program can work
without having a string-to-bytecode compiler.  Needless to say, this
poses a problem.
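
To illustrate the point with Python standing in for Tcl (the helper
name eval_script is made up): when a program builds a script as a
string while it runs, nothing could have compiled that script ahead of
time, so the compiler has to be available at runtime.

```python
def eval_script(source: str, variables: dict) -> dict:
    """Compile a script that only exists as a runtime string, then run it."""
    code = compile(source, "<runtime string>", "exec")  # source -> bytecode
    exec(code, {}, variables)                           # run the bytecode
    return variables


# The script text is assembled at runtime, much as a Tcl program might
# build a command string and hand it to eval.
name = "total"
script = name + " = " + " + ".join(str(i) for i in range(1, 5))
result = eval_script(script, {})
```

Here the text "total = 1 + 2 + 3 + 4" does not exist until the program
is already running, so a bytecode-to-bytecode translator alone would
never get to see it.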

> BTW, how does Parrot handle calls to foreign code?  The docs I've seen
> are on the hazy side, and integration with existing C, C++ and FORTRAN
> monoliths is (alas) all too important, particularly in commercial
> development.

Although I don't know *how* it will handle foreign code, I do know that
it *will* handle foreign code, and have a better interface than Perl5's
cruddy XS extension language.

-- 
my $n = 2; print +(split //, 'e,4c3H r ktulrnsJ2tPaeh'
."\n1oa! er")[map $n = ($n * 24 + 30) % 31, (42) x 26]


