source code size metric: Python and modern C++

Thu Dec 5 14:27:42 EST 2002

Duncan Grisby wrote:
> >I disagree with the need for type descriptions being present when
> >describing an interface. The Python documentation often does not
> >formally present the argument and return types but people seem to get
> >along fine.
> 
> Actually, much of the Python documentation _does_ explicitly mention
> argument types. Most cases where it doesn't are where the arguments
> are strings or numbers, and the type is clear from the context.
> 
> Here's string's join method, for example...

I said that the documentation "...often does not...". Listing two
examples does not provide any counter-evidence (unless there were only
two routines in the standard library).

> Of course. My point is that robust code needs to check its arguments
> to make sure they meet its requirements. 

I disagree. By that standard you could not consider the Python standard
library robust, because it almost never checks its arguments (note once
again that listing two counter-examples does not constitute proof of any
kind).

> If the system does some of that checking for you, you have 
> gained something.

If you actually want the checking, and that checking doesn't cost you
anything (which it does), then I agree.

> Like I say, I think dynamic typing is very useful for small regularly
> called functions that can do something sensible with differing
> argument types; I think it is less useful for "business logic" level
> functions that are called by foreign code.

I've written XML-RPC servers that process different types from the same
method. 
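
A rough sketch of the idea, using Python 3's standard xmlrpc.client only
to marshal the arguments (no server is started). The method name
count_chars and its behaviour are invented for illustration; they are
not taken from any real service.

```python
import xmlrpc.client


def count_chars(value):
    """Hypothetical RPC method: accepts one string or a list of strings."""
    if isinstance(value, str):
        return len(value)
    # Anything else is treated as a sequence of strings.
    return sum(len(item) for item in value)


def call_via_xmlrpc(func, *args):
    """Round-trip the arguments through XML-RPC marshalling, then call func."""
    request = xmlrpc.client.dumps(args, methodname=func.__name__)
    params, _method = xmlrpc.client.loads(request)
    return func(*params)


# The same method handles a string and an array of strings.
single = call_via_xmlrpc(count_chars, "python")
several = call_via_xmlrpc(count_chars, ["py", "thon"])
```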

> In RPC systems, you are
> often passing large complex data structures around. Being able to
> specify the types statically saves you having to write a huge amount
> of checking code. For situations where dynamic typing is required,
> most RPC systems have an Any type.

I have written commercial RPC systems in Python that didn't do any type
checking. They seem to work ok.

> What if the types are significantly more complex?  For example, x is a
> structure with members foo and bar; foo is a sequence of floating
> point values; bar is either a string or a sequence of strings. 

What happens if the strings have to begin with an uppercase letter, the
floats must satisfy 0 <= x < 1, etc.? You are arguing for checking at
exactly the level of granularity that your IDL has. I find that an
interesting coincidence :-)
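
To make that concrete, here is a minimal sketch of the structure from
the quoted example (foo: sequence of floats; bar: string or sequence of
strings) with the two value constraints mentioned above. The function
name check_x and the sample data are assumptions for illustration. The
sample value has exactly the declared types and still violates both
constraints.

```python
def check_x(x):
    """Return a list of problems with x; an empty list means x is acceptable."""
    problems = []
    for value in x["foo"]:
        if not 0 <= value < 1:
            problems.append(f"foo value out of range: {value!r}")
    bar = x["bar"]
    # bar may be a single string or a sequence of strings.
    names = [bar] if isinstance(bar, str) else list(bar)
    for name in names:
        if not name[:1].isupper():
            problems.append(f"bar entry not capitalised: {name!r}")
    return problems


# Every member has the "right" type, yet two values are unacceptable.
well_typed_but_wrong = {"foo": [0.25, 1.5], "bar": ["Alpha", "beta"]}
problems = check_x(well_typed_but_wrong)
```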

> you're going to have to write several lines of code to check the type,
> every place you are using the type (or write a function and call it
> everywhere). Wouldn't it be nice if I could specify it in a concise
> way and have the system write the checking code...?

Actually, I usually don't bother. If you pass in types that don't make
sense then you will get an exception at the point where the type
actually causes an error.
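
A small illustration of that behaviour, with a made-up function
(total_length is hypothetical, not code from any system discussed here).
Nothing is checked up front; the bad argument surfaces as a TypeError at
the line that actually needs a sized object.

```python
def total_length(titles):
    """Sum the lengths of the given titles; no up-front type checks."""
    total = 0
    for title in titles:
        total += len(title)  # an item without a length fails here
    return total


ok = total_length(["Python", "XML-RPC"])

try:
    total_length(["Python", 42])
except TypeError as exc:
    # The exception is raised where the integer is used as a string.
    failure = str(exc)
```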

> How is an XML validation scheme different from an interface definition
> language, except for being significantly harder to read and write?

XML validation schemes can do more validation than IDL. Some of them are
Turing complete so you can validate every aspect of the interface.

> Python with XML-RPC forces you to do more type checking work than
> Python with CORBA, or C++ with CORBA.

Maybe if you decide to do the type checking. I don't believe that I ever
have.

> >But can't you convince yourself that Python has an advantage in this
> >area through a simple thought experiment:
> >
> >Imagine writing a simple XML-RPC client that searches for Python
> >articles on the O'Reilly Meerkat network and prints their titles. Now
> >imagine doing the same using an IDL-based RPC interface in a
> >statically typed language. Do you really think that the IDL
> >technique would
> >actually get you to working code faster.
> 
> Yes, I do think an IDL technique would get me to working code faster.

OK, I'll send you a stub IDL. You will have 30 seconds to write a
working Java/C++ program. You wanna try it?

> The following is a complete list of parameters from
> which to build a recipe:
> 
>     * Search Criteria
>           o channel - (int) a channel's numeric ID
>           o category - (int) a category's numeric ID
>           o item - (int) a particular item's numeric ID (retrieve a
>             specific item)
>           o search - (string) /MySQL regular expression/
> ...
> 
> That looks suspiciously like an informal representation of a set of
> interface types to me. With a formal set, the RPC runtime would check
> that the argument to a channel criterion really was an integer, and so
> on.

How would you represent that structure in IDL where every possible
combination of arguments is acceptable? Would you use a default value to
indicate that the field is not in use? Then, in this case, I'd have to
initialize 8 fields. Very annoying since my entire Python program was 1
LOC. 
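
A sketch of the difference, using only the four criteria visible in the
quoted list (the rest are elided there and are left out here). The
helper make_recipe and the None placeholder are assumptions for
illustration; this is not the original one-line program.

```python
def make_recipe(**criteria):
    """Build a recipe holding only the criteria the caller supplied."""
    return dict(criteria)


# Dynamic version: name the one field that matters and nothing else.
recipe = make_recipe(search="Python")

# Fixed-record version: every declared field needs some value,
# even when it only means "not in use".
FIELDS = ("channel", "category", "item", "search")
padded = {name: None for name in FIELDS}
padded["search"] = "Python"
```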

> >For fun, I actually did the Python XML-RPC version:
> 
> For fun, I tried to break it...

You didn't break my code, you wrote incorrect code.

To write incorrect IDL-based code would be trivial: I'd just write an
incorrect regular expression (which I bet would be a more common mistake
in practice than getting the type wrong).
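
For instance, in this sketch the search argument passes any string type
check and is still wrong. Python's re module stands in for the server's
MySQL regular expression engine here, which is an assumption; the two
syntaxes differ in detail, but an unbalanced parenthesis is an error in
both.

```python
import re

search = "Python("  # a perfectly good string, so a type check accepts it

type_ok = isinstance(search, str)

try:
    re.compile(search)
    pattern_ok = True
except re.error:
    # The value has the right type but is not a valid pattern.
    pattern_ok = False
```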

Cheers,
Brian