[DOC-SIG] Comparing SGML DTDs

Michael McLay mclay@smtp.erols.com
Wed, 12 Nov 1997 14:31:50 -0500


This bounced on my first try.  Sorry if it is a repeat.

Several additional messages on the subject have arrived since I started
looking at TIM.  Seems like we need to define the requirements before
we can pick between LaTeX, TIM, XML, FRAME or any other approach to
generating documentation.  Is TIM too simple?  Is XML too new?  Is SGML
too complex?  Is a proprietary tool detrimental to contributions of
documentation?  Don't all these questions require a set of requirements
against which they can be evaluated?

Andrew Kuchling writes:
> The language reference is already in FrameMaker, but document
> formats shouldn't be multiplied.  If GvR rules SGML/XML out, that's
> that, and we have to consider the other options: just add some LaTeX
> macros, or use TIM (which is built on top of Texinfo, a format which I
> quite like), or invent some pod-like format.

Since the benevolent dictator and Bill Janssen both suggest TIM, why
don't we take a closer look at it and discuss what switching to TIM
would fail to support?  In reviewing the TIM manual page at
ftp://ftp.parc.xerox.com/pub/ilu/2.0a11/manual-html/manual_21.html
several features make TIM look like a good option:


  1) Concise syntax that is easy to integrate with Python examples
  2) TIM works
  3) TIM was written in Python:-) (only about 820 lines of code)
  4) It looks like a markup that would be much easier to convert to
     XML than LaTeX.  (My guess is that XML will eventually become the
     standard for WYSIWYG editors so the ugly tagging issue will go away.)
  5) Restricted set of tags, which makes it fairly easy to learn to use

Downside:

  1) Heavy dependence on external programs which may not be on every platform
	MAKEINFO = '/usr/bin/makeinfo'
	TEX = '/usr/bin/tex'
	TEXINDEX = '/usr/bin/texindex'
	DVIPS = '/usr/bin/dvips'

  2) May require some work to get the reference manual indexing
     working with the new tools.
  3) Restricted set of tags, which makes it fairly hard to extend
     (except by using macros.)
  4) Mixes macro language with markup.  Is this really a problem?
     The TIM macros seem to primarily be used to declare context names
     which are then translatable to generic typographic codes.  This
     should make it easier to move the tagged text to meaningful XML
     tags.
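
How much the first downside matters on a given machine can be checked
directly.  A minimal sketch in Python 3, assuming the helper programs
are looked up on PATH rather than at the hard-coded /usr/bin locations
above.  The function name missing_tools is made up for illustration and
is not part of TIM:

```python
import shutil

# Program names taken from the hard-coded paths listed above.
TIM_TOOLS = ["makeinfo", "tex", "texindex", "dvips"]

def missing_tools(tools=TIM_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH (hypothetical helper)."""
    return [name for name in tools if which(name) is None]

if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("not found on PATH:", ", ".join(missing))
    else:
        print("all TIM helper programs found")
```

The lookup function is a parameter only so the check can be exercised
without the tools installed.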

Defining domain-specific markup commands isn't documented.  The
documentation says it is [TBD].  I grepped for usage and found the
following.  Looks pretty simple to use.
   ilu-macros.tim:@timmacro var                   code
   ilu-macros.tim:@timmacro metavar               var
   ilu-macros.tim:@timmacro C                     code
   ilu-macros.tim:@timmacro C++                   code
   ilu-macros.tim:@timmacro command               code
   ilu-macros.tim:@timmacro constant              code
   ilu-macros.tim:@timmacro codeexample           example
   ilu-macros.tim:@timmacro dfn                   i
   ilu-macros.tim:@timmacro cl                    code
   ilu-macros.tim:@timmacro class                 code
   ilu-macros.tim:@timmacro exception             code
   ilu-macros.tim:@timmacro fn                    code
   ilu-macros.tim:@timmacro interface             code
   ilu-macros.tim:@timmacro java                  code
   ilu-macros.tim:@timmacro isl                   code
   ilu-macros.tim:@timmacro kwd                   code
   ilu-macros.tim:@timmacro language              asis
   ilu-macros.tim:@timmacro m3                    code
   ilu-macros.tim:@timmacro macro                 code
   ilu-macros.tim:@timmacro message               code
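
Assuming each line reads as "new macro name, then the existing command
it is defined as" (a guess, since the feature is undocumented), the
declarations can be loaded into a table and followed down to a base
command.  A sketch in Python 3; parse_timmacros and resolve are made-up
names, not anything from the TIM source:

```python
# A few lines copied from the grep output above.
GREP_OUTPUT = """\
ilu-macros.tim:@timmacro var                   code
ilu-macros.tim:@timmacro metavar               var
ilu-macros.tim:@timmacro codeexample           example
ilu-macros.tim:@timmacro dfn                   i
ilu-macros.tim:@timmacro language              asis
"""

def parse_timmacros(text):
    """Map each declared macro name to the command it is defined as."""
    table = {}
    for line in text.splitlines():
        # Drop the "filename:" prefix that grep adds, if there is one.
        _, _, decl = line.rpartition(":")
        fields = decl.split()
        if len(fields) == 3 and fields[0] == "@timmacro":
            table[fields[1]] = fields[2]
    return table

def resolve(name, table):
    """Follow definitions until reaching a name that is not itself a macro."""
    seen = set()
    while name in table and name not in seen:   # 'seen' guards against cycles
        seen.add(name)
        name = table[name]
    return name

macros = parse_timmacros(GREP_OUTPUT)
print(resolve("metavar", macros))   # metavar -> var -> code
```

Under that reading, metavar is defined in terms of var, which is in turn
defined in terms of code.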

Would TIM make a good starting point?  If so, should it be modernized
to use re instead of regex and then developed into a more
full-featured markup language for Python? 
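
To give a feel for what an re-based version might look like, here is a
sketch in Python 3 that rewrites inline @name{text} markup as XML-style
elements.  It is not taken from the TIM source.  The pattern handles
only un-nested braces, ignores the @@, @{ and @} escapes, and skips
macro names such as C++ that are not legal XML element names.  The
function name tim_inline_to_xml is made up:

```python
import re
from xml.sax.saxutils import escape

# @name{text}, where text contains no braces or '@' (nesting is not handled).
INLINE = re.compile(r"@([A-Za-z][A-Za-z0-9]*)\{([^{}@]*)\}")

def tim_inline_to_xml(line):
    """Turn @name{text} into <name>text</name>, escaping everything else."""
    out = []
    pos = 0
    for m in INLINE.finditer(line):
        out.append(escape(line[pos:m.start()]))      # plain text before the match
        name, text = m.group(1), m.group(2)
        out.append("<%s>%s</%s>" % (name, escape(text), name))
        pos = m.end()
    out.append(escape(line[pos:]))                   # plain text after the last match
    return "".join(out)

print(tim_inline_to_xml("an exception @isl{DivideByZero}, of type @type{REAL}"))
```

Whether a one-to-one mapping from macro names to element names is what
we want is a separate question for the requirements discussion.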

An example of a TIM file is attached.  The example is a snippet from
the ILU Python Tutorial.  Looks pretty readable to me.

@setfilename ilu-tutorial.info
@settitle Using ILU with Python:  A Tutorial
@finalout
@c $Id: tutpython.tim,v 1.8 1996/03/19 04:11:10 janssen Exp $
@ifclear largerdoc
@titlepage
@title Using ILU with Python:  A Tutorial
@author Bill Janssen @code{<janssen@@parc.xerox.com>}
@sp
Formatted @today{}.
@sp
Copyright @copyright{} 1995 Xerox Corporation@*
All Rights Reserved.
@end titlepage
@ifinfo
@node Top, ,(dir),(dir)
@top Using ILU with Python
@end ifinfo
@end ifclear

@syncodeindex pg cp

@section Introduction

This tutorial will show how to use the @system{ILU} system with the programming language @language{Python},
both as a way of developing software libraries, and as a way
of building distributed systems.
In an extended example, we'll build an @system{ILU} module that implements a simple
four-function calculator, capable of addition, subtraction,
multiplication, and division.  It will signal an error if
the user attempts to divide by zero.  The example demonstrates
how to specify the interface for the module; how to implement the module in @language{Python};
how to use that implementation as a simple library; how to provide the module as a remote service;
how to write a client of that remote service; and how to use subtyping to extend an object type
and provide different versions of a module.  We'll also demonstrate how to use @language{OMG IDL}
with @system{ILU}, and discuss the notion of network garbage collection.

Each of the programs and files referenced in this tutorial is available
as a complete program
in a separate appendix to this document; parts of programs are quoted
in the text of the tutorial.

@page
@section Specifying the Interface

Our first task is to specify more exactly what it is we're trying
to provide.  A typical four-function calculator lets a user enter
a value, then press an operation key, either +, -, /, or *,
then enter another number, then press = to actually have
the operation happen.  There's usually a CLEAR button to press
to reset the state of the calculator.  We want to provide something like
that.

We'll recast this a bit more formally as the @dfn{interface}
of our module; that is, the way the module will
appear to clients of its functionality.  The interface
typically describes a number of function calls which can be
made into the module, listing their arguments and return types,
and describing their effects.  @system{ILU} uses @dfn{object-oriented}
interfaces, in which the functions in the interface are grouped
into sets, each of which applies to an @dfn{object type}.  These
functions are called @dfn{methods}.

For example, we can think of the calculator as an object type,
with several methods:  Add, Subtract, Multiply, Divide, Clear, etc.
@system{ILU} provides a standard notation to write this down with,
called @dfn{ISL} (which stands for ``Interface Specification Language'').
@language{ISL} is a declarative language which can be processed
by computer programs.  It allows you to define object types (with methods),
other non-object types, exceptions, and constants.

The interface for our calculator would be written in ISL as:
@codeexample
INTERFACE Tutorial;

EXCEPTION DivideByZero;

TYPE Calculator = OBJECT
  METHODS
    SetValue (v : REAL),
    GetValue () : REAL,
    Add (v : REAL),
    Subtract (v : REAL),
    Multiply (v : REAL),
    Divide (v : REAL) RAISES DivideByZero END
  END;
@end codeexample
This defines an interface @isl{Tutorial}, an exception @isl{DivideByZero},
and an object type @isl{Calculator}.  Let's consider these one by one.

The interface, @isl{Tutorial}, is a way of grouping a number of type
and exception definitions.  This is important to prevent collisions
between names defined by one group and names defined by another group.
For example, suppose two different people had defined two different
object types, with different methods, but both called @isl{Calculator}!
It would be impossible to tell which calculator was meant.  By
defining the @isl{Calculator} object type within the scope of the
@isl{Tutorial} interface, this confusion can be avoided.

The exception, @isl{DivideByZero}, is a formal name for a particular
kind of error, division by zero.  Exceptions in @system{ILU} can specify
an @dfn{exception-value type}, as well, which means that real errors
of that kind have a value of the exception-value type associated with them.
This allows the error to contain useful information about why it might
have come about.  However, @isl{DivideByZero} is a simple exception,
and has no exception-value type defined.  We should note that the full
name of this exception is @isl{Tutorial.DivideByZero}, but for this
tutorial we'll simply call our exceptions and types by their short name.

The object type, @isl{Calculator} (again, really @isl{Tutorial.Calculator}),
is a set of six methods.  Two of those methods, @isl{SetValue} and
@isl{GetValue}, allow us to enter a number into the calculator object,
and ``read'' the number.  Note that @isl{SetValue} takes a single
argument, @metavar{v}, of type @type{REAL}.  @type{REAL} is a
built-in @language{ISL} type, denoting a 64-bit floating point number.
Built-in @language{ISL} types are things like @type{INTEGER} (32-bit
signed integer), @type{BYTE} (8-bit unsigned byte), and @type{CHARACTER}
(16-bit Unicode character).  Other more complicated types are
built up from these simple types using @language{ISL} @dfn{type constructors},
such as @isl{SEQUENCE OF}, @isl{RECORD}, or @isl{ARRAY OF}.

Note also that @isl{SetValue} does not return a value,
and neither do @isl{Add}, @isl{Subtract}, @isl{Multiply},
or @isl{Divide}.  Rather,
when you want to see what the current value of the calculator
is, you must call @isl{GetValue}, a method which has no arguments,
but which returns a @type{REAL} value, which is the value of the
calculator object.  This is an arbitrary decision on our part;
we could have written the interface differently, say as
@codeexample
TYPE NotOurCalculator = OBJECT
  METHODS
    SetValue () : REAL,
    Add (v : REAL) : REAL,
    Subtract (v : REAL) : REAL,
    Multiply (v : REAL) : REAL,
    Divide (v : REAL) : REAL RAISES DivideByZero END
  END;
@end codeexample
@noindent
-- but we didn't.

Our list of methods on @type{Calculator} is bracketed by the two
keywords @isl{METHODS} and @isl{END}, and the elements are separated
from each other by commas.  This is pretty standard in @language{ISL}:
elements of a list are separated by commas; the keyword @isl{END}
is used when an explicit list-end marker is needed (but not when it's
not necessary, as in the list of arguments to a method); the list often
begins with some keyword, like @isl{METHODS}.
The @dfn{raises clause} (the list of exceptions which a method
might raise) of the method @isl{Divide} provides another example
of a list, this time with only one member, introduced by the keyword
@isl{RAISES}.

Another standard
feature of @language{ISL} is separating a name, like @isl{v},
from a type, like @type{REAL}, with a colon character.  For example,
constants are defined with syntax like
@codeexample
CONSTANT Zero : INTEGER = 0;
@end codeexample
@noindent
Definitions, of interface, types, constants, and exceptions, are
terminated with a semicolon.

We should expand our interface a bit by adding more documentation
on what our methods actually do.  We can do this with the @dfn{docstring}
feature of @language{ISL}, which allows the user to add arbitrary
text to object type definitions and method definitions.  Using
this, we can write
@codeexample
INTERFACE Tutorial;

EXCEPTION DivideByZero
  "this error is signalled if the client of the Calculator calls
the Divide method with a value of 0";

TYPE Calculator = OBJECT
  COLLECTIBLE
  DOCUMENTATION "4-function calculator"
  METHODS
    SetValue (v : REAL) "Set the value of the calculator to `v'",
    GetValue () : REAL  "Return the value of the calculator",
    Add (v : REAL)      "Adds `v' to the calculator's value",
    Subtract (v : REAL) "Subtracts `v' from the calculator's value",
    Multiply (v : REAL) "Multiplies the calculator's value by `v'",
    Divide (v : REAL) RAISES DivideByZero END
      "Divides the calculator's value by `v'"
  END;
@end codeexample
@noindent
Note that we can use the @isl{DOCUMENTATION} keyword on object types
to add documentation about the object type, and can simply add documentation
strings to the end of exception and method definitions.  These docstrings
are passed on to the @language{Python} docstring system, so that they are available
at runtime from @language{Python}.  Documentation
strings cannot currently be used for non-object types.

@system{ILU} provides a program, @program{islscan}, which can be used
to check the syntax of an @language{ISL} specification.  @program{islscan}
parses the specification and summarizes it to standard output:
@transcript
% @userinput{islscan Tutorial.isl}
Interface "Tutorial", imports "ilu"
  @{defined on line 1
   of file /tmp/tutorial/Tutorial.isl (Fri Jan 27 09:41:12 1995)@}

Types:
  real                       @{<built-in>, referenced on 10 11 12 13 14 15@}

Classes:
  Calculator                 @{defined on line 17@}
    methods:
      SetValue (v : real);                          @{defined 10, id 1@}
        "Set the value of the calculator to `v'"
      GetValue () : real;                           @{defined 11, id 2@}
        "Return the value of the calculator"
      Add (v : real);                               @{defined 12, id 3@}
        "Adds `v' to the calculator's value"
      Subtract (v : real);                          @{defined 13, id 4@}
        "Subtracts `v' from the calculator's value"
      Multiply (v : real);                          @{defined 14, id 5@}
        "Multiplies the calculator's value by `v'"
      Divide (v : real) @{DivideByZero@};             @{defined 16, id 6@}
        "Divides the calculator's value by `v'"
    documentation:
      "4-function calculator"
    unique id:  ilu:cigqcW09P1FF98gYVOhf5XxGf15

Exceptions:
  DivideByZero               @{defined on line 5, refs 15@}
%
@end transcript

@noindent
@program{islscan} simply lists the types defined in the interface, separating
out object types (which it calls ``classes''), the exceptions, and
the constants.  Note that for the @type{Calculator} object type,
it also lists something called its @dfn{unique id}.  This is a 160-bit
number (expressed in base 64) that @system{ILU} assigns automatically
to every type, as a way of distinguishing them.  While
it might be interesting to know that it exists (:-),
the @system{ILU} user never has to know what it is; @program{islscan}
supplies it for the convenience of the @system{ILU} implementors, who
sometimes do have to know it.
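
As a quick check on those numbers (this is only arithmetic, not a
description of how @system{ILU} computes the id): base 64 carries 6 bits
per character, so 160 bits need 27 characters, which matches the length
of the id printed above.  A small sketch in @language{Python}:

```python
import math

# Copied from the islscan output above, without the "ilu:" prefix.
unique_id = "cigqcW09P1FF98gYVOhf5XxGf15"

bits_per_char = 6                                # 2**6 == 64
chars_needed = math.ceil(160 / bits_per_char)    # characters needed for 160 bits

print(chars_needed, len(unique_id))
```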
@page
@section Implementing the True Module

After we've defined an interface, we then need to supply an implementation
of our module.  Implementations can be done in any language supported by
@system{ILU}.  Which language you choose often depends on what sort
of operations have to be performed in implementing the specific functions
of the module.  Different languages have specific advantages and disadvantages
in different areas.  Another consideration is whether you wish to use the
implementation mainly as a library, in which case it should probably be done
in the same language as the rest of your applications, or mainly as
a remote service, in which case the specific implementation language
is less important.

We'll demonstrate an implementation of the @type{Calculator}
object type in @language{Python}, which is one of the most capable
of all the @system{ILU}-supported languages.  This is just a matter
of defining a @language{Python} class, corresponding to the @type{Tutorial.Calculator} type.  Before we do that,
though, we'll explain how the names and signatures of the @language{Python} functions
are arrived at.

@subsection What the Interface Looks Like in Python

For every programming language
supported by @system{ILU}, there is a standard @dfn{mapping} defined
from @language{ISL} to that programming language.  This mapping defines
what @language{ISL} type names, exception names, method names,
and so on look like
in that programming language.

The mapping for @language{Python} is straightforward.  For type names,
such as @isl{Tutorial.Calculator}, the @language{Python} name
of the @language{ISL} type @isl{Interface.Name}
is @Python{Interface.Name}, with any hyphens replaced by underscores.  That is, the name of the interface in @language{ISL}
becomes the name of the module in @language{Python}.
So the name of our @type{Calculator} type in @language{Python}
would be @Python{Tutorial.Calculator}, which is really the name of a @language{Python} class.
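
The renaming rule is simple enough to sketch in a few lines of
@language{Python}.  This only illustrates the rule as stated above; it
is not the code the stubber uses, the function name
@Python{python_name} is made up, and the hyphenated name in the second
call is a hypothetical example:

```python
def python_name(isl_name):
    """Apply the stated rule: keep the dotted name, replace hyphens with underscores."""
    return isl_name.replace("-", "_")

print(python_name("Tutorial.Calculator"))        # Tutorial.Calculator
print(python_name("My-Interface.Object-Type"))   # My_Interface.Object_Type
```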

The @language{Python} mapping for a method name such as @isl{SetValue}
is the method name, with any hyphens replaced by underscores.
The return type of this @language{Python} method is whatever is specified
in the @language{ISL} specification for the method, or @Python{None} if
no type is specified.  The arguments for the @language{Python} method are the
same as specified in the @language{ISL}; their types are the
@language{Python} types corresponding to the @language{ISL} types, @emph{except}
that one extra argument is added to the beginning of each @language{Python}
version of an @language{ISL} method; it is an @dfn{instance} of the object type
on which the method is defined.  An instance is simply a value of that
type.  Thus the @language{Python} method corresponding
to our @language{ISL} @isl{SetValue} would have the prototype signature
@codeexample
   def SetValue (self, v):
@end codeexample
@noindent

Similarly, the signatures for the other methods, in @language{Python}, are
@codeexample
   def GetValue (self):

   def Add (self, v):

   def Subtract (self, v):

   def Multiply (self, v):

   def Divide (self, v):
@end codeexample
@noindent
Note that even though the @isl{Divide} method can raise an exception,
the signature looks like those of the other methods.  This is because
the normal @language{Python} exception signalling mechanism is used to
signal exceptions back to the caller.
The mapping of exception names is similar to the mapping used for types.
So the exception @isl{Tutorial.DivideByZero}
would also have the name @Python{Tutorial.DivideByZero}, in @language{Python}.

One way to see what all the @language{Python} names for an interface
look like is to run the program @program{python-stubber}.  This program
reads an @language{ISL} file, and generates the necessary @language{Python}
code to support that interface in @language{Python}.  One of the files
generated is @file{@metavar{Interface}.py}, which contains the definitions
of all the @language{Python} types for that interface.
@transcript
% @userinput{python-stubber Tutorial.isl}
client stubs for interface "Tutorial" to Tutorial.py ...
server stubs for interface "Tutorial" to Tutorial__skel.py ...
%
@end transcript
@page
@subsection Building the Implementation

To provide an implementation of our interface, we @dfn{subclass} the
generated @language{Python} class for our @class{Calculator} class:

@codeexample
# CalculatorImpl.py

import Tutorial, Tutorial__skel

class Calculator (Tutorial__skel.Calculator):

        def __init__ (self):
                self.the_value = 0.0

        def SetValue (self, v):
                self.the_value = v

        def GetValue (self):
                return self.the_value

        def Add (self, v):
                self.the_value = self.the_value + v

        def Subtract (self, v):
                self.the_value = self.the_value - v

        def Multiply (self, v):
                self.the_value = self.the_value * v

        def Divide (self, v):
                try:
                        self.the_value = self.the_value / v
                except ZeroDivisionError:
                        raise Tutorial.DivideByZero
@end codeexample

Each instance of a @Python{CalculatorImpl.Calculator} object
inherits from @Python{Tutorial__skel.Calculator}, which in turn
inherits from @Python{Tutorial.Calculator}.  Each has an instance
variable called @Python{the_value}, which maintains a running total
of the `accumulator' for that instance.  We can create an instance
of a @isl{Tutorial.Calculator} object by simply calling @Python{CalculatorImpl.Calculator()}.

@page
So, a very simple program to use the @isl{Tutorial} module might be
the following:

@codeexample
# simple1.py, a simple program that demonstrates the use of the
#  Tutorial true module as a library.
#
# run this with the command "python simple1.py NUMBER [NUMBER...]"
#

import Tutorial, CalculatorImpl, string, sys

# A simple program:
#  1)  make an instance of Tutorial.Calculator
#  2)  add all the arguments by invoking the Add method
#  3)  print the resultant value.

def main (argv):

        c = CalculatorImpl.Calculator()
        if not c:
                sys.stderr.write("Couldn't create calculator\n")
                sys.exit(1)

        # clear the calculator before using it

        c.SetValue (0.0)

        # now loop over the arguments, adding each in turn

        for arg in argv[1:]:
                v = string.atof(arg)
                c.Add (v)

        # and print the result

        print "the sum is", c.GetValue()
        sys.exit(0)

main(sys.argv)
@end codeexample

@noindent
This program would be run as follows:
@transcript
% @userinput{python simple1.py 34.9 45.23111 12}
the sum is 92.13111
%
@end transcript

@noindent
This is a completely self-contained use of the @isl{Tutorial}
implementation; when a method is called, it is the true method
that is invoked.  The use of @system{ILU} in this program adds
some overhead in terms of included code, but has almost
the same performance as a version of this program that does not
use @system{ILU}.

