source code size metric: Python and modern C++

Thu Dec 5 10:23:58 EST 2002

In article <mailman.1039080494.23599.python-list at python.org>,
 Brian Quinlan  <brian at sweetapp.com> wrote:

>I disagree with the need for type descriptions being present when
>describing an interface. The Python documentation often does not
>formally present the argument and return types but people seem to get
>along fine.

Actually, much of the Python documentation _does_ explicitly mention
argument types. Most cases where it doesn't are where the arguments
are strings or numbers, and the type is clear from the context.

Here's string's join method, for example...

join(seq)
    Return a string which is the concatenation of the strings in the
    sequence seq. The separator between elements is the string
    providing this method.

and here's os.utime...

utime(path, times)
    Set the access and modified times of the file specified by path.
    If times is None, then the file's access and modified times are
    set to the current time. Otherwise, times must be a 2-tuple of
    numbers, of the form (atime, mtime) which is used to set the
    access and modified times, respectively.
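
As a small illustration of that description, here is a sketch written for
Python 3 (it is not part of the quoted documentation). The helper name
demo_utime and the timestamp values are made up for the example. It sets the
times with a 2-tuple of numbers, resets them with None, and then passes a
string, which the interpreter rejects with a TypeError.

```python
import os
import tempfile


def demo_utime():
    """Set a file's times with a 2-tuple, then try a value of the wrong type."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # times as a 2-tuple of numbers: (atime, mtime)
        os.utime(path, (1000000000, 1000000001))
        mtime = int(os.stat(path).st_mtime)

        # times=None sets both times to the current time
        os.utime(path, None)

        # Anything that is neither None nor a 2-tuple is refused
        try:
            os.utime(path, "yesterday")
        except TypeError:
            rejected = True
        else:
            rejected = False
        return mtime, rejected
    finally:
        os.remove(path)
```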

Of course, the documentation doesn't _formally_ present the types,
because there is no formal type description syntax for Python.

>Also, IDL alone is not sufficient because it need not provide a semantic
>description and any constraints more fine than type e.g.

Of course. My point is that robust code needs to check its arguments
to make sure they meet its requirements. If the system does some of
that checking for you, you have gained something.

>> I'm not convinced that that kind of ability is so useful when you
>> think about real functions in real applications, of the level suitable
>> for use in an RPC system. I think it's useful for low-level functions
>> like adding things, and generic algorithms like sorting, but much less
>> useful for higher level things.
>
>Why would dynamic typing not be useful in an RPC system if it is useful
>in other applications? Or is dynamic typing not useful in general?

As I said, I think dynamic typing is very useful for small, regularly
called functions that can do something sensible with differing
argument types; I think it is less useful for "business logic" level
functions that are called by foreign code. In RPC systems, you are
often passing large complex data structures around. Being able to
specify the types statically saves you having to write a huge amount
of checking code. For situations where dynamic typing is required,
most RPC systems have an Any type.

[...]
>Assuming that you are correct and I really want type checking in my
>Python RPC server, I can accomplish that in two ways:
>
>1.  I can just add an assert statement e.g.
>    assert isinstance(x, int) and isinstance(y, float)

What if the types are significantly more complex?  For example, x is a
structure with members foo and bar; foo is a sequence of floating
point values; bar is either a string or a sequence of strings. Now
you're going to have to write several lines of code to check the type,
every place you are using the type (or write a function and call it
everywhere). Wouldn't it be nice if I could specify it in a concise
way and have the system write the checking code...?
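
To make the cost concrete, here is a minimal sketch of the hand-written
check for the structure just described, in Python 3. It assumes x arrives as
a dict with keys foo and bar and that "sequence" means a list or tuple; the
names check_x and is_string_sequence are hypothetical.

```python
# Hypothetical hand-written checker for the structure described above.
# The dict representation and all function names are assumptions.

def is_string_sequence(value):
    """True if value is a list or tuple whose items are all strings."""
    return isinstance(value, (list, tuple)) and all(
        isinstance(item, str) for item in value)


def check_x(x):
    """Raise TypeError unless x has the shape described in the text."""
    if not isinstance(x, dict):
        raise TypeError("x must be a structure (dict)")
    missing = {"foo", "bar"} - set(x)
    if missing:
        raise TypeError("x is missing members: %s" % sorted(missing))

    # foo: a sequence of floating point values
    foo = x["foo"]
    if not isinstance(foo, (list, tuple)) or not all(
            isinstance(item, float) for item in foo):
        raise TypeError("foo must be a sequence of floats")

    # bar: either a string or a sequence of strings
    bar = x["bar"]
    if not (isinstance(bar, str) or is_string_sequence(bar)):
        raise TypeError("bar must be a string or a sequence of strings")
```

That is about twenty lines for one modest type, and every function that
accepts the type has to remember to call it.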

>2.  For the newer XML-based web services, I can use any XML validation
>    scheme that I want (e.g. DTD, Schema, Relax NG). And my clients can 
>    use that validation system as well (if they want).

How is an XML validation scheme different from an interface definition
language, except for being significantly harder to read and write?

>But notice that Python gives you the flexibility without forcing you to
>jump through hoops. The worst case in Python forces you to do no more work
>than in C++.

Python with XML-RPC forces you to do more type checking work than
Python with CORBA, or C++ with CORBA.

>But can't you convince yourself that Python has an advantage in this
>area though a simple thought experiment:
>
>Imagine writing a simple XML-RPC client that searches for Python
>articles on the O'Reilly Meerkat network and prints their titles. Now
>imagine doing the same using an IDL-based RPC interface in a statically
>typed language. Do you really think that the IDL technique would
>actually get you to working code faster?

Yes, I do think an IDL technique would get me to working code faster.

Look at the documentation for getItems:

"""
Signature: array meerkat.getItems(struct)

Returns an array of structs of RSS items given a recipe struct. The
getItems method makes full use of Meerkat's powerful recipes (read:
query parameters). The following is a complete list of parameters from
which to build a recipe:

    * Search Criteria
          o channel - (int) a channel's numeric ID
          o category - (int) a category's numeric ID
          o item - (int) a particular item's numeric ID (retrieve a
            specific item)
          o search - (string) /MySQL regular expression/
...
"""

That looks suspiciously like an informal representation of a set of
interface types to me. With a formal set, the RPC runtime would check
that the argument to a channel criterion really was an integer, and so
on.

>For fun, I actually did the Python XML-RPC version:

For fun, I tried to break it...

>from xmlrpclib import Server
>s = Server('http://www.oreillynet.com/meerkat/xml-rpc/server.php')
>for i in s.meerkat.getItems({'search' : '[Pp]ython'}): print i['title']

s.meerkat.getItems(5)
  raises a Fault.

s.meerkat.getItems({'search' : -5})
  returns results as if I had used the string "-5".

s.meerkat.getItems({'search' : ['[Pp]ython']})
  returns a load of results that have nothing to do with Python.

I think that last one is particularly insidious, since I gave it
something that looks almost right, but it silently gives the wrong
results.
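
For comparison, here is a rough sketch of the kind of check a formal
interface definition would let a runtime do before the call ever left the
client. It is written in Python 3. The four parameter names and types come
from the documentation excerpt quoted above; the table RECIPE_TYPES and the
function check_recipe are hypothetical names, and parameters outside the
quoted excerpt are passed through unchecked because the excerpt does not
list them.

```python
# Types for the recipe parameters listed in the quoted getItems excerpt.
# RECIPE_TYPES and check_recipe are illustrative names, not Meerkat's API.
RECIPE_TYPES = {
    "channel": int,
    "category": int,
    "item": int,
    "search": str,
}


def check_recipe(recipe):
    """Return recipe unchanged, or raise TypeError on a mistyped parameter."""
    if not isinstance(recipe, dict):
        raise TypeError(
            "recipe must be a struct (dict), got %s" % type(recipe).__name__)
    for name, value in recipe.items():
        expected = RECIPE_TYPES.get(name)
        if expected is None:
            # Not in the quoted excerpt, so nothing is known about its type.
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise TypeError("%s must be %s, got %s" % (
                name, expected.__name__, type(value).__name__))
    return recipe
```

With such a check in front of the call, all three of my attempts above
(the bare 5, the -5, and the one-element list) would fail with a TypeError
on the client, rather than only the first one failing.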

Cheers,

Duncan.

-- 
 -- Duncan Grisby         --
  -- duncan at grisby.org     --
   -- http://www.grisby.org --