[getopt-sig] More about commands on the command line

Fri, 8 Mar 2002 11:59:19 +0100 (CET)

Hello all,

At least some people are not very happy with my experiment to provide a more
generic approach to option processing. I don't really understand why;
apparently there is a clash in goals or in the approach to the problem.

Everybody who has read my reasoning and/or my code will have seen that I tend
to take orthogonality to the limit, and then try to take another step. Also, I
tend to split everything into pieces that are as small as possible, each with a
single well-defined function. There are a couple of reasons for doing that.
- Having a number of orthogonal pieces that the user can compose in any way he
  wants gives power to the user. He is able to use the pieces in ways we cannot
  imagine.
- Option processing may seem simple, but there are a number of issues
  intertwined with each other. By making as many (orthogonal) pieces as
  possible, the issues get separated, and become concrete and understandable.
So, having lots of orthogonal pieces is good for users (it gives them power),
and good for us (we get better understanding of the issues involved, and their
relations with each other).
In a sense, it is the basic (destructive) scientific research approach. Take
everything apart into as many pieces as possible, and see what you end up with.
I think we made some progress in understanding what we want, while doing that.

With better understanding often come new and better approaches for handling
things, and/or new and better approaches for e.g. command lines.
Looking at the wild variety of command-lines for all programs, I'd say there is
not much fundamental understanding of what is good or bad, and why.
I suspect that 99.9% of the programs choose something 'because such-and-such
program also does it', or 'because that is what my option processing package
assumes', not because they know it is the best approach.
Wouldn't it be great if we could gain enough knowledge to give proper arguments
why certain approaches are really wrong?
(We can then avoid falling into that trap, and make something better.)

Trying to handle commands and options in the same way is just another such
experiment. I am still very happy with the results. Things that seemed
impossible a few weeks ago can now be handled with my generic solution without
major headaches. (See below as well.)

Ripping the option processing process apart into as many pieces as possible DOES
NOT MEAN that the SIG should adopt all the pieces, put them in the standard
option processing package for Python, and give them to the users (I'd like that
of course, but that is a different matter).
I do not even consider that a feasible solution, because of the wide range of
Pythonists. Some are newbies, and we really don't want them to use the nuts and
bolts and assemble their own option processing. They need a pre-assembled
package. On the other hand, we have professional programmers who need to cope
with very non-standard requirements in very non-standard environments (e.g.
various places that deliver options, or different options that should write
their results in different ways in different places).
I think we should try to be capable of handling as much as possible, without
losing the 'lower-end' users.

I consider my experiments as a way of gaining knowledge about the option
processing problem, so that we can weigh the pros and cons well, rather than
blindly adopting some standard because it just seems nice (or because
'everybody does it') without knowing the consequences and the alternatives
(e.g. what does option processing look like if we do want to be able to handle
'cvs commit'-like command lines).
We can ask ourselves questions like what is nice, what is not, and why is that?

OK, on to my answer to Matthias's challenging question:

On Mon, 4 Mar 2002, Matthias Urlichs wrote:

> so frankly I don't see the point of your code..?

It makes options and commands more equal citizens. Except for the special
treatment of for example -spam (which may be magically interpreted as '-s -p -a
-m'), commands and options have equal status.

It is true that you can parse cvs-like command lines with multiple instances of
a parser, but it is a work-around rather than a proper solution. I mean, you fix
the problem by writing a solution around the limitations of the option
processing package.
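
For reference, the work-around looks roughly like the sketch below, here
written with the standard getopt module. The option letters and the
per-command table are made up for illustration and are not the real cvs
options. The first parse stops at the first non-option word, and a second
parse is then run over the remaining words.

```python
import getopt


def parse_cvs_like(argv):
    # Pass 1: global options. getopt.getopt stops at the first non-option.
    global_opts, rest = getopt.getopt(argv, "vq")
    if not rest:
        raise SystemExit("missing command")
    command, tail = rest[0], rest[1:]

    # Pass 2: a second parse over the tail of the line, using the option
    # set that belongs to the command (hypothetical table).
    command_shortopts = {"commit": "m:", "update": "d"}
    command_opts, operands = getopt.getopt(tail, command_shortopts[command])
    return global_opts, command, command_opts, operands


print(parse_cvs_like(["-v", "commit", "-m", "fix typo", "spam.py"]))
# ([('-v', '')], 'commit', [('-m', 'fix typo')], ['spam.py'])
```

Note that the knowledge of where the command sits lives outside both parsers,
in the glue code between the two passes.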

The main reason for pursuing a `real' solution is that I have learned that code
that relies on work-arounds tends to have some basic assumption that isn't true,
at least not in all cases. A solution that can really cope with the situation
does not have that assumption, and is thus a more generic solution to the problem.

Comparing the work-around with my more generic solution:

* With the work-around, command-line arguments near the end of the line are
  parsed/copied more than once. Not something you'd really want to have.
  We are lucky that Python shares (string) data, otherwise it would have been a
  potentially costly work-around.

* The work-around happens to work for cvs-like cases, but it is not a general
  solution for the much larger set of command lines that we may have to deal
  with. Note that this is common for work-arounds. That it works here is not a
  property of the option processing package; it is sheer luck that the case is
  close enough to the assumptions of the package that it is still possible to
  program around it.
  (Below are a number of cases which are truly hopeless even with the
  work-around.)

* Everything that you can do now with the command line can still be done (i.e.
  I don't throw anything away).

* With the equal status of commands and options, I can have commands that act
  as options, like 'cvs verbose commit'. Maybe this is not normal now, but can
  anybody give me a good reason why 'verbose' is bad, and '--verbose' or '-v'
  is not? At least, 'verbose commit' looks more intuitive and less technical
  to me, which may be a plus for non-computer-experts (so far, I cannot give
  such people a good reason why we need to write a '-' in front of options
  rather than write it as in my example).

* With the equal status of commands and options, I can have options that act as
  commands, like 'rpm -q' and 'rpm -i'. This even works with command lines like
  'rpm -qp mypackage.rpm'.
  I'd like to see you handle such cases with the current option processing
  packages.

* With the equal status of commands and options, I can have optional dashes,
  like in 'tar xzf myfile.tgz'. Not pretty and not recommended, but it fits in
  my solution without major headaches.
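
As a rough illustration of the equal-status idea in the points above (this is
not the code from my experiment; every name in it is hypothetical), all words
can be looked up in one table, whether or not they start with a dash. For
brevity, cvs-like and rpm-like entries share a single table here.

```python
def process(words, table):
    """Look up every word in one table; a leading dash has no special meaning."""
    state = {"verbose": False, "command": None, "operands": []}
    for word in words:
        action = table.get(word)
        if action is None:
            state["operands"].append(word)   # not recognized: collect it
        else:
            action(state)
    return state


def set_verbose(state):
    state["verbose"] = True


def set_command(name):
    def action(state):
        state["command"] = name
    return action


TABLE = {
    # a command acting as an option, next to its dashed spellings
    "verbose": set_verbose, "--verbose": set_verbose, "-v": set_verbose,
    "commit": set_command("commit"),
    # options acting as commands, as in 'rpm -q' and 'rpm -i'
    "-q": set_command("query"),
    "-i": set_command("install"),
}

print(process(["verbose", "commit", "spam.py"], TABLE))
# {'verbose': True, 'command': 'commit', 'operands': ['spam.py']}
print(process(["-q", "mypackage.rpm"], TABLE))
# {'verbose': False, 'command': 'query', 'operands': ['mypackage.rpm']}
```

Bundled words such as '-qp' or 'xzf' would need an extra expansion step
before the lookup; that step is left out of this sketch.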

Also, I learned a few things:

* '--' is not necessarily part of a parser, i.e. it can be factored out, and be
  treated as a command or option (whatever you like).

* With my generic solution, the only difference that remains between options
  and commands is the magic involved in decoding stuff like '-spam'. I find
  this strange. Why is there no such magic with commands?
  It seems that we make some assumption with options that for some reason does
  not exist in the context of commands.
  Interesting questions are thus:
  - Can we factor the magic out of the parser, and treat it as a separate
    entity?
  - Is there a similar piece of magic that works for commands?
    (An answer to this question may give a new way of specifying commands more
    efficiently / more compactly.)

* I made the step from 'options and commands' to 'pieces of text'. Getting rid
  of this (in my current view) artificial separation simplifies and
  generalizes the problem. The step may seem insignificant, but for me it
  changed the way of thinking about what we aim to do.
  That change may give rise to finding new and better approaches that are not
  available when one is blocked by the assumption that options and commands are
  two different things that need to be treated differently/separately.

* The current option processing packages fit nicely in this framework if
  fetching words from the command line is separated from finding a matching
  option, and collecting non-options is separated as well.
  These are not major new requirements; we already established that separating
  fetching and recognizing is beneficial.
  I didn't check, but I suspect that collecting commands is already separated
  as well.
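
The separation of fetching, recognizing and collecting, and the factoring out
of '--', can be sketched as follows. This is a minimal sketch under my own
assumptions: the names fetch, recognize, run and the handler signature are
hypothetical, and it is not how any existing package is structured.

```python
def fetch(argv):
    """Fetching: hand out the command-line words one at a time."""
    return iter(argv)


def recognize(word, table):
    """Recognizing: find the handler for a piece of text, or None."""
    return table.get(word)


def run(argv, table):
    words = fetch(argv)
    settings, leftovers = {}, []
    for word in words:
        handler = recognize(word, table)
        if handler is None:
            leftovers.append(word)          # collecting non-options
        else:
            handler(words, settings, leftovers)
    return settings, leftovers


def verbose(words, settings, leftovers):
    settings["verbose"] = True


def output(words, settings, leftovers):
    # Takes the following word as its argument (None if there is none).
    settings["output"] = next(words, None)


def double_dash(words, settings, leftovers):
    # '--' is just another table entry: it claims all remaining words.
    leftovers.extend(words)


TABLE = {"-v": verbose, "-o": output, "--": double_dash}
print(run(["-v", "-o", "out.txt", "--", "-v", "file"], TABLE))
# ({'verbose': True, 'output': 'out.txt'}, ['-v', 'file'])
```

Because '--' lives in the table rather than in the loop, a program that does
not want it can simply leave the entry out.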

I consider it advantageous to have the more generic solution. I learned a few
things, and I have more power to do things the way I want rather than being
constrained by the option processing package.
Sooner or later, somebody will need that power.

Enough option processing for today. I should do some work on an experiment
environment or on hardware IO, rather than processing options :-)

I hope to have made clear that I haven't yet reached the point where I consider
everything 'understood', although the number of obscure points is getting
smaller. I think there is still progress in the understanding. I thought that
sharing the experiments was nice, but apparently not everybody shares that opinion.

The discussion of what should and should not be part of the option processing
package is a separate discussion to me. I can imagine that my generic approach
looks very wild, and seems to be wildly outside what is considered 'option
processing'. On the other hand, there does seem to be a need for something
stronger than what e.g. Optik delivers by default. That 'something stronger' is
currently in the form of a work-around, which happens to function for some
cases (like cvs). It does not handle all cases, and neither is there any hope
that it ever will in its current form.

That may or may not be bad, depending on the aim of the option processing that
we envision for Python.

Albert
-- 
Constructing a computer program is like writing a painting