From a.t.hofkamp@tue.nl Mon Mar 4 13:48:20 2002
From: a.t.hofkamp@tue.nl (A.T. Hofkamp)
Date: Mon, 4 Mar 2002 14:48:20 +0100 (CET)
Subject: [getopt-sig] commands on the command line
Message-ID:

This message is in MIME format. The first part should be readable text,
while the remaining parts are likely unreadable without MIME-aware tools.
Send mail to mime@docserver.cac.washington.edu for more info.

---1685886401-826736995-1015249700=:3699
Content-Type: TEXT/PLAIN; charset=US-ASCII

Hello all,

I've been thinking about commands on the command line. For example in cvs:

    cvs [cvs-options] command [command-options-and-arguments]

Currently, we don't consider commands. However, that didn't feel right to
me....

The basic idea I had is that a command is not very different from an
option. Both are single words that we must recognize. The only difference
between options and commands is that an occurrence of the latter may cause
a change in which options and commands we recognize. That is, it should be
possible to modify the recognition of options and commands while
processing.

I didn't see any fundamental obstacles to this approach. The longer I
thought about it, the better it became, so it was worth an experiment,
which turned out very nicely if I may say so:

As an experiment, I have a list of options and a list of commands that we
'currently' recognize in a class Parser. The parser gets options and
arguments from a Generator class, similar to the argparser code (I didn't
have that code nearby, so I wrote my own version). The parser offers words
from the generator to the 'current' options and commands using an
ismatch() method.

* Options can retrieve the option argument through a 'getargument()'
  method. They should also store the match and so on, as argtools, Optik,
  and other option processing packages currently do. (Since I wasn't
  interested in that part, my options just print '--spam detected'.)
* Commands can use 'setopts()' and 'setcmds()' of the parser to modify the
  recognized options and commands, respectively. (There is no fundamental
  reason why an option could not modify options and/or commands, but I
  haven't yet seen any need for that.)

See also the attached code. The example understands

    opts.py [-v] [--verbose] [-oFILE] [--output=FILE] commit [-c]
    opts.py [-v] [--verbose] [-oFILE] [--output=FILE] status [--brief]
    opts.py [-v] [--verbose] [-oFILE] [--output=FILE] version

although it only prints recognized options.

Leveling the playing field to recognize any piece of text is quite an
improvement in my opinion. The ability to modify the recognized commands
and options is extremely powerful; it should cover most of our needs.
Also, it is another step toward making the option-processing package more
modular, which is always good (ok, I'll settle for 'almost always' :-) ).

For example, the "--" which used to be a bit special is trivial now. Just
consider it a command that throws away all options. (Actually, that is
what the hard-coded version in the parser object is doing now.)

No doubt there are other nice features that I have forgotten now, but they
will pop up in the discussion.
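
The design described above can be sketched roughly as follows. This is a
minimal Python 3 illustration, not the attached opts.py: the class and
method names Generator, Parser, ismatch(), getargument(), setopts() and
setcmds() come from the description, while the 'seen' list, the parse()
helper and the simplified Generator (no '-xyz' or '--opt=value' splitting)
are assumptions made only for this example.

```python
class Generator:
    """Hands out command-line words one at a time (simplified)."""
    def __init__(self, words):
        self.words = list(words)

    def getword(self):
        # Next word, or None when the command line is exhausted.
        return self.words.pop(0) if self.words else None

    def getargument(self):
        # An option asks for its argument: here simply the next word.
        if not self.words:
            raise ValueError("no argument available")
        return self.words.pop(0)


class Option:
    def __init__(self, names, hasarg=False):
        self.names = names.split()
        self.hasarg = hasarg

    def ismatch(self, word, parser):
        if word not in self.names:
            return False
        value = parser.getargument() if self.hasarg else None
        parser.seen.append((word, value))  # 'seen' is a hypothetical store
        return True


class Command:
    """A word that, once matched, swaps the recognized options/commands."""
    def __init__(self, name, newopts, newcmds):
        self.name, self.newopts, self.newcmds = name, newopts, newcmds

    def ismatch(self, word, parser):
        if word != self.name:
            return False
        parser.seen.append((word, None))
        parser.setopts(self.newopts)
        parser.setcmds(self.newcmds)
        return True


class Parser:
    def __init__(self, generator, opts, cmds):
        self.generator, self.opts, self.cmds = generator, opts, cmds
        self.seen = []  # (word, argument) pairs in order of appearance

    def getargument(self):
        return self.generator.getargument()

    def setopts(self, opts):
        self.opts = opts

    def setcmds(self, cmds):
        self.cmds = cmds

    def process(self):
        word = self.generator.getword()
        while word is not None:
            # Snapshot the lists: a matching command may replace them.
            candidates = list(self.opts) + list(self.cmds)
            if not any(c.ismatch(word, self) for c in candidates):
                raise ValueError("unrecognized argument %r" % word)
            word = self.generator.getword()
        return self.seen


def parse(argv):
    generic = [Option("-v --verbose"), Option("-o --output", hasarg=True)]
    cmds = [Command("commit", [Option("-c")], []),
            Command("status", [Option("--brief")], []),
            Command("version", [], [])]
    return Parser(Generator(argv), generic, cmds).process()
```

With these definitions, parse(["-v", "-o", "out.txt", "commit", "-c"])
accepts every word, while parse(["commit", "--brief"]) raises ValueError,
because matching 'commit' replaced the recognized options with the commit
ones.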
TTFN, Albert -- Constructing a computer program is like writing a painting ---1685886401-826736995-1015249700=:3699 Content-Type: TEXT/PLAIN; charset=US-ASCII; name="opts.py" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: Content-Disposition: attachment; filename="opts.py" IyEvdXNyL2Jpbi9weXRob24NCiMNCmltcG9ydCBzdHJpbmcsc3lzDQoNCmNs YXNzIEdlbmVyYXRvcjoNCiAgICBkZWYgX19pbml0X18oc2VsZixjbWRsaW5l KToNCglzZWxmLmNtZGxpbmU9Y21kbGluZQ0KCXNlbGYuYXJndW1lbnQ9Tm9u ZQkjICE9Tm9uZTogTmV4dCB0aGluZyBpcyBhbiBvcHRpb24gYXJndW1lbnQN CglzZWxmLm90aGVyc2hvcnRvcHRzPU5vbmUgIyB3aXRoICcteHl6JywgdGhp cyBiZWNvbWVzICd5eicNCg0KICAgICMgQ2FsbCBiYWNrIGZyb20gb3B0aW9u LXJlY29nbml0aW9uIGNvZGUgdG8gZ2V0IG9wdGlvbi1hcmd1bWVudA0KICAg IGRlZiBnZXRhcmd1bWVudChzZWxmKToNCgkiIiJHZXQgdGhlIGFyZ3VtZW50 IG9mIGFuIG9wdGlvbiIiIg0KCWlmIHNlbGYuYXJndW1lbnQhPU5vbmU6DQoJ ICAgIHg9c2VsZi5hcmd1bWVudA0KCSAgICBzZWxmLmFyZ3VtZW50PU5vbmUN CgkgICAgcmV0dXJuIHgNCglpZiBzZWxmLm90aGVyc2hvcnRvcHRzOg0KCSAg ICB4PXNlbGYub3RoZXJzaG9ydG9wdHMNCgkgICAgc2VsZi5vdGhlcnNob3J0 b3B0cz1Ob25lDQoJICAgIHJldHVybiB4DQoJaWYgc2VsZi5jbWRsaW5lPT1b XToNCgkgICAgcmFpc2UgIk5vIGFyZ3VtZW50IGF2YWlsYWJsZSINCgl4ID0g c2VsZi5jbWRsaW5lWzBdDQoJc2VsZi5jbWRsaW5lPXNlbGYuY21kbGluZVsx Ol0NCglyZXR1cm4geA0KDQogICAgZGVmIGdldHdvcmQoc2VsZik6DQoJIiIi R2V0IHRoZSBuZXh0IHdvcmQiIiINCglpZiBzZWxmLmFyZ3VtZW50IT1Ob25l Og0KCSAgICByYWlzZSAiQXJndW1lbnQgbm90IHByb2Nlc3NlZCINCglpZiBz ZWxmLm90aGVyc2hvcnRvcHRzIT1Ob25lOg0KCSAgICBpZiBsZW4oc2VsZi5v dGhlcnNob3J0b3B0cyk8MjoNCiAgICAJCXg9Jy0nK3NlbGYub3RoZXJzaG9y dG9wdHNbMF0NCgkJc2VsZi5vdGhlcnNob3J0b3B0cz1Ob25lDQoJCXJldHVy biB4DQoJICAgIHg9Jy0nK3NlbGYub3RoZXJzaG9ydG9wdHNbMF0NCgkgICAg c2VsZi5vdGhlcnNob3J0b3B0cz1zZWxmLm90aGVyc2hvcnRvcHRzWzE6XQ0K CSAgICByZXR1cm4geA0KCWlmIHNlbGYuY21kbGluZT09W106DQoJICAgIHJl dHVybiBOb25lCQkjIE5vIHdvcmRzIGxlZnQNCgkgICANCgl3b3JkID0gc2Vs Zi5jbWRsaW5lWzBdDQoJc2VsZi5jbWRsaW5lPXNlbGYuY21kbGluZVsxOl0N Cg0KCWlmIHdvcmRbOjJdID09ICctLSc6CSMgd29yZCA9ICctLXh5eicgb3Ig 
Jy0teHl6PXFyc3QnDQoJICAgIGVxID0gc3RyaW5nLmZpbmQod29yZCwnPScp DQoJICAgIGlmIGVxPDA6CQkjICc9JyBub3QgZm91bmQNCgkJcmV0dXJuIHdv cmQNCgkgICAgZWxzZToNCgkJc2VsZi5hcmd1bWVudCA9IHdvcmRbZXErMTpd DQoJCXJldHVybiB3b3JkWzplcV0NCglpZiB3b3JkWzoxXSA9PSAnLSc6CSMg d29yZCA9ICcteHl6Jw0KCSAgICBpZiBsZW4od29yZCk9PTI6DQoJCXJldHVy biB3b3JkDQoJICAgIHNlbGYub3RoZXJzaG9ydG9wdHM9d29yZFsyOl0NCgkg ICAgcmV0dXJuIHdvcmRbOjJdDQoJIyB3b3JkIGlzIG5laXRoZXIgc2hvcnQg b3B0IG5vciBsb25nIG9wdA0KCXJldHVybiB3b3JkDQoJICAgIA0KDQojDQoj IENsYXNzZXMgZm9yIG1hdGNoaW5nIHRleHQuDQojIFRoZSBvbmx5IGV4dGVy bmFsbHkgY2FsbGVkIG1ldGhvZCBpcyAnc2VsZi5pc21hdGNoKCknDQojIFRv IHNlcGVyYXRlIHJlY29nbml0aW9uIGFuZCBhY3Rpb24sIHRoZSAnc2VsZi5l eGVjdXRlKCknDQojIGlzIGNhbGxlZCBmcm9tIHRoZSAnc2VsZi5pc21hdGNo KCknIG1ldGhvZCBhZnRlcg0KIyBtYXRjaGluZyB0aGUgdGV4dC4NCiMNCiMg UmV0dXJuIHZhbHVlcyBmcm9tIGlzbWF0Y2goKToNCiMgMCA9IG5vIG1hdGNo DQojID4wICBtYXRjaGVkDQojDQoNCiMNCiMgQmFzZSBjbGFzcyBmb3IgYW4g b3B0aW9uDQojIE5vdCBhIHRydWUgYmFzZSBjbGFzcywgc2luY2UgZXhlY3V0 ZSgpIHNob3VsZCBiZSBlbXB0eSB0aGVuDQojDQpjbGFzcyBPcHRpb246DQog ICAgZGVmIF9faW5pdF9fKHNlbGYsbm0saGFzYXJnPTApOg0KCXNlbGYubmFt ZXM9c3RyaW5nLnNwbGl0KG5tKQ0KCXNlbGYuaGFzYXJnPWhhc2FyZw0KICAg IGRlZiBpc21hdGNoKHNlbGYsd29yZCxwYXJzZXIpOg0KCWlmIHdvcmQgbm90 IGluIHNlbGYubmFtZXM6DQoJICAgIHJldHVybiAwDQogICAgICAgIGlmIHNl bGYuaGFzYXJnOg0KCSAgICBhcmcgPSBwYXJzZXIuZ2V0YXJndW1lbnQoKQ0K CSAgICBzZWxmLmV4ZWN1dGUod29yZCsiICIrYXJnLHBhcnNlcikNCgkgICAg cmV0dXJuIDINCglzZWxmLmV4ZWN1dGUod29yZCxwYXJzZXIpDQoJcmV0dXJu IDENCiAgICBkZWYgZXhlY3V0ZShzZWxmLHRleHQscGFyc2VyKToNCglwcmlu dCB0ZXh0KyIgZGV0ZWN0ZWQiDQoNCiMgQmFzZSBjbGFzcyBmb3IgYSAnY29t bWFuZCcNCiMgSXQgcmVjb2duaXplcyBpdHMgb3duIG5hbWUsIGFuZCBjYWxs cyBzZWxmLmV4ZWN1dGUoKQ0KY2xhc3MgQ29tbWFuZDoNCiAgICBkZWYgX19p bml0X18oc2VsZixubSk6DQoJc2VsZi5uYW1lcz1ubQ0KICAgIGRlZiBpc21h dGNoKHNlbGYsd29yZCxwYXJzZXIpOg0KCWlmIHNlbGYubmFtZXM9PXdvcmQ6 DQoJICAgIHNlbGYuZXhlY3V0ZSh3b3JkLHBhcnNlcikNCgkgICAgcmV0dXJu IDENCglyZXR1cm4gMA0KICAgIGRlZiBleGVjdXRlKHNlbGYsd29yZCxwYXJz 
ZXIpOg0KCXBhc3MNCg0KIyBTd2l0Y2ggcmVjb2duaXplZCBvcHRpb25zIGFu ZCBjb21tYW5kcyBpZiAnbmFtZScgZm91bmQNCmNsYXNzIENvbW1hbmRTd2l0 Y2goQ29tbWFuZCk6DQogICAgZGVmIF9faW5pdF9fKHNlbGYsbmFtZSxuZXdv cHRzPU5vbmUsbmV3Y21kcz1Ob25lKToNCglzZWxmLm5hbWVzPW5hbWUNCglz ZWxmLm5ld29wdHM9bmV3b3B0cw0KCXNlbGYubmV3Y21kcz1uZXdjbWRzDQoJ c2VsZi5tYXRjaGVkPTANCiAgICBkZWYgZXhlY3V0ZShzZWxmLHdvcmQscGFy c2VyKToNCglzZWxmLm1hdGNoZWQ9MQ0KCWlmIHNlbGYubmV3b3B0cyE9Tm9u ZToNCgkgICAgcGFyc2VyLnNldG9wdHMoc2VsZi5uZXdvcHRzKQ0KCWlmIHNl bGYubmV3Y21kcyE9Tm9uZToNCgkgICAgcGFyc2VyLnNldGNtZHMoc2VsZi5u ZXdjbWRzKQ0KDQojIENvbGxlY3QgZXZlcnl0aGluZyB0aGF0IGlzIG9mZmVy ZWQNCmNsYXNzIENvbW1hbmRDb2xsZWN0KENvbW1hbmQpOg0KICAgIGRlZiBf X2luaXRfXyhzZWxmKToNCglDb21tYW5kLl9faW5pdF9fKHNlbGYsJycpDQoJ c2VsZi5saXN0PVtdDQogICAgZGVmIGlzbWF0Y2goc2VsZixhcmcpOg0KCXNl bGYuZXhlY3V0ZShhcmcpDQoJcmV0dXJuIDENCiAgICBkZWYgZXhlY3V0ZShz ZWxmLGFyZyk6DQoJc2VsZi5saXN0LmFwcGVuZChhcmcpDQoNCg0KY2xhc3Mg UGFyc2VyOg0KICAgIGRlZiBfX2luaXRfXyhzZWxmLGdlbmVyYXRvcixvcHRz PU5vbmUsY21kcz1Ob25lKToNCglzZWxmLmdlbmVyYXRvciA9IGdlbmVyYXRv cg0KCXNlbGYub3B0cz1vcHRzDQoJc2VsZi5jbWRzPWNtZHMNCg0KICAgICMg Q2FsbGVkIGZyb20gT3B0aW9uIGNsYXNzOiBwYXNzIHRocm91Z2gNCiAgICBk ZWYgZ2V0YXJndW1lbnQoc2VsZik6DQoJcmV0dXJuIHNlbGYuZ2VuZXJhdG9y LmdldGFyZ3VtZW50KCkNCg0KICAgICMgQ2hhbmdlIHRoZSBjb25maWd1cmF0 aW9uIG9mIHRoZSBwYXJzZXINCiAgICBkZWYgc2V0b3B0cyhzZWxmLG9wdHMp Og0KCXNlbGYub3B0cz1vcHRzDQogICAgZGVmIHNldGNtZHMoc2VsZixjbWRz KToNCglzZWxmLmNtZHM9Y21kcw0KDQogICAgZGVmIHByb2Nlc3NfY21kbGlu ZShzZWxmKToNCiAgICAgICAgd29yZCA9IHNlbGYuZ2VuZXJhdG9yLmdldHdv cmQoKQ0KCXdoaWxlIHdvcmQhPU5vbmU6DQoJICAgIHNlbGYuX3Byb2Nlc3Nf d29yZCh3b3JkKQ0KCSAgICB3b3JkID0gc2VsZi5nZW5lcmF0b3IuZ2V0d29y ZCgpDQoNCiAgICBkZWYgX3Byb2Nlc3Nfd29yZChzZWxmLHdvcmQpOg0KCSIi IlRyeSB0byBmaW5kIGEgbWF0Y2ggZm9yICd3b3JkJyIiIg0KDQoJI3ByaW50 ICJcbi4uLiBXT1JEOiAiK3dvcmQNCglpZiB3b3JkID09ICctLSc6ICAgICMg Jy0tJyBtZWFucyAnbm8gb3B0cycNCgkgICAgc2VsZi5vcHRzPU5vbmUNCgkg ICAgcmV0dXJuDQogICAgICAgIGlmIHNlbGYub3B0cyE9Tm9uZToNCiAgICAJ 
ICAgIGZvciBvcHQgaW4gc2VsZi5vcHRzOg0KCQkjcHJpbnQgIi4uLiBUcnlp bmcgdG8gbWF0Y2ggYWdhaW5zdCBvcHRpb24gJyIrc3RyKG9wdC5uYW1lcykr IiciDQoJCWlmIG9wdC5pc21hdGNoKHdvcmQsc2VsZik6DQoJCSAgICByZXR1 cm4NCglpZiBzZWxmLmNtZHMhPU5vbmU6DQoJICAgIGZvciBjbWQgaW4gc2Vs Zi5jbWRzOg0KCQkjcHJpbnQgIi4uLiBUcnlpbmcgdG8gbWF0Y2ggYWdhaW5z dCBjbWQgJyIrY21kLm5hbWVzKyInIg0KCQlpZiBjbWQuaXNtYXRjaCh3b3Jk LHNlbGYpOg0KCQkgICAgcmV0dXJuDQoJcmFpc2UgIlVucmVjb2duaXplZCBh cmd1bWVudCAnIit3b3JkKyInIg0KDQpnZW5lcmljX29wdHMgPSBbIE9wdGlv bignLXYgLS12ZXJib3NlJywwKSwgT3B0aW9uKCctbyAtLW91dHB1dCcsMSkg XQkNCmNvbW1pdF9vcHRzID0gWyBPcHRpb24oJy1jJykgXQ0Kc3RhdHVzX29w dHMgPSBbIE9wdGlvbignLS1icmllZicpIF0NCmNvbGxlY3RfY21kID0gWyBD b21tYW5kQ29sbGVjdCgpIF0NCmNtZHMgPSBbIENvbW1hbmRTd2l0Y2goJ2Nv bW1pdCcsY29tbWl0X29wdHMsY29sbGVjdF9jbWQpLA0KICAgICAgICAgQ29t bWFuZFN3aXRjaCgnc3RhdHVzJyxzdGF0dXNfb3B0cyxjb2xsZWN0X2NtZCks DQoJIENvbW1hbmRTd2l0Y2goJ3ZlcnNpb24nLFtdLFtdKV0NCg0KaWYgX19u YW1lX18gPT0gJ19fbWFpbl9fJzoNCiAgICBnZW5lcmF0b3IgPSBHZW5lcmF0 b3Ioc3lzLmFyZ3ZbMTpdKQ0KICAgIHBhcnNlciA9IFBhcnNlcihnZW5lcmF0 b3IsZ2VuZXJpY19vcHRzLGNtZHMpDQogICAgcGFyc2VyLnByb2Nlc3NfY21k bGluZSgpDQogICAgc3lzLmV4aXQoMCkNCg== ---1685886401-826736995-1015249700=:3699-- From smurf@noris.de Mon Mar 4 14:24:17 2002 From: smurf@noris.de (Matthias Urlichs) Date: Mon, 4 Mar 2002 15:24:17 +0100 Subject: [getopt-sig] commands on the command line In-Reply-To: ; from a.t.hofkamp@tue.nl on Mon, Mar 04, 2002 at 02:48:20PM +0100 References: Message-ID: <20020304152417.I14835@noris.de> Hi, A.T. Hofkamp: > I've been thinking about commands on the command line. For example in cvs: > > cvs [cvs-options] command [command-options-and-arguments] > > Currently, we don't consider commands. However, that didn't feel right > to me.... 
>

Personally, I fail to have a problem with

- parse global options with one instance of an option parser
- interpret the first non-option as command (assuming that options and
  arguments are Not To Be Interspersed)
- parse command's options with another, different parser instance

so frankly I don't see the point of your code..?

--
Matthias Urlichs | noris network AG | http://smurf.noris.de/
--
Don't believe everything you hear or anything you say.

From gward@python.net Thu Mar 7 17:30:44 2002
From: gward@python.net (Greg Ward)
Date: Thu, 7 Mar 2002 12:30:44 -0500
Subject: [getopt-sig] commands on the command line
In-Reply-To: <20020304152417.I14835@noris.de>
References: <20020304152417.I14835@noris.de>
Message-ID: <20020307173044.GA1733@gerg.ca>

On 04 March 2002, Matthias Urlichs said:
> Personally, I fail to have a problem with
>
> - parse global options with one instance of an option parser
> - interpret the first non-option as command (assuming that options and
>   arguments are Not To Be Interspersed)
> - parse command's options with another, different parser instance
>
> so frankly I don't see the point of your code..?

I completely agree with Matthias. In fact, I posted pseudo-code for how to
do precisely this with Optik a few weeks ago. Furthermore, this is a
pretty arcane special case; the only obvious examples I can think of
offhand are CVS and the Distutils. If I can't rewrite the Distutils'
command-line processing with whatever comes out of this SIG, then this
will have been a big fat waste of time, but I certainly don't think Optik
(or whatever) needs to directly accommodate this somewhat unusual syntax.

Greg
--
Greg Ward - Linux weenie gward@python.net
http://starship.python.net/~gward/
What happens if you touch these two wires tog--

From a.t.hofkamp@tue.nl Fri Mar 8 10:59:19 2002
From: a.t.hofkamp@tue.nl (A.T. Hofkamp)
Date: Fri, 8 Mar 2002 11:59:19 +0100 (CET)
Subject: [getopt-sig] More about commands on the command line
In-Reply-To: <20020304152417.I14835@noris.de>
Message-ID:

Hello all,

At least some people are not very happy with my experiment to provide a
more generic approach to option processing. I don't really understand why;
apparently there is a clash in goals or in the approach to the problem.

Everybody who has read my reasoning and/or my code will have seen that I
tend to take orthogonality to the limit, and then try to take another
step. Also, I tend to split everything into pieces that are as small as
possible, each with a single well-defined function. There are a couple of
reasons for doing that.

- Having a number of orthogonal pieces that the user can compose in any
  way he wants gives power to the user. He is able to use the pieces in
  ways we cannot imagine.
- Option processing may seem simple, but there are a number of issues
  intertwined with each other. By making as many (orthogonal) pieces as
  possible, the issues get separated, and become concrete and
  understandable.

So, having lots of orthogonal pieces is good for users (it gives them
power), and good for us (we get a better understanding of the issues
involved, and their relations with each other).

In a sense, it is the basic (destructive) scientific research approach.
Take everything apart into as many pieces as possible, and see what you
end up with. I think we made some progress in understanding what we want
while doing that. With the better understanding often come new and better
approaches for handling things, and/or new and better approaches for e.g.
command lines.

Looking at the wild variety of command-lines for all programs, I'd say
there is not much fundamental understanding of what is good or bad, and
why.
I suspect that 99.9% of the programs choose something 'because that and
that program also do it', or 'because that is what my option processing
package assumes', not because they know it is the best approach. Wouldn't
it be great if we could gain enough knowledge to have proper arguments for
why certain approaches are really wrong? (We can then avoid falling into
that trap, and make something better.)

Trying to handle commands and options in the same way is just another such
experiment. I am still very happy with the results. Things that seemed
impossible a few weeks ago can now be handled with my generic solution
without major headaches. (See below as well.)

Ripping the option processing process apart into as many pieces as
possible DOES NOT MEAN that the SIG should adopt all the pieces, put them
in the standard option processing package for Python, and give them to the
users (I'd like that of course, but that is a different matter).

I consider that not even feasible as a solution, because of the wide range
of Pythonists. Some are newbies, and we really don't want them to use the
nuts and bolts and assemble their own option processing. They need a
pre-assembled package. On the other hand, we have professional programmers
who need to cope with very non-standard requirements in very non-standard
environments (e.g. various places that deliver options, different options
that should write their results in different ways in different places). I
think we should try to be capable of handling as much as possible, without
losing the 'lower-end' users.

I consider my experiments as a way of gaining knowledge about the option
processing problem, so that we can weigh the pros and cons well, rather
than blindly adopting some standard because it just seems nice (or because
'everybody does it') without knowing the consequences and the alternatives
(e.g. what does option processing look like if we do want to be able to
handle 'cvs commit'-like command lines).
We can ask ourselves questions like what is nice, what is not, and why is
that?

Ok, on to my answer on the challenging question of Matthias:

On Mon, 4 Mar 2002, Matthias Urlichs wrote:
> so frankly I don't see the point of your code..?

It makes options and (command) more equal citizens. Except for the special
treatment of for example -spam (which may be magically interpreted as
'-s -p -a -m'), commands and options have equal status.

It is true that you can parse cvs-like command lines with multiple
instances of parser, but it is a work-around rather than a proper
solution. I mean, you fix the problem by writing a solution around the
limitations of the option processing package.

The main reason for pursuing a `real' solution is that I have learned that
code that relies on work-arounds tends to have some basic assumption that
isn't true, at least not in all cases. A solution that can really cope
with the situation does not have that assumption, and is thus a more
generic solution to the problem.

Comparing the work-around with my more generic solution:

* With the work-around, command-line arguments near the end of the line
  are parsed/copied more than once. Not something you'd really want to
  have. We are lucky that Python shares (string) data, otherwise it would
  have been a potentially costly work-around.

* The work-around happens to work for cvs-like cases; it is not a general
  solution for a much larger set of command-lines that we may have to deal
  with. Note that this is common for work-arounds. This is not a property
  of the option processing package, it is just sheer luck that the case is
  not so much out-of-sync with the assumptions of the option processing
  package that it is still possible to program around it. (Below are a
  number of cases which are truly hopeless even with the work-around.)

* Everything that you can do now with the command line can still be done
  (i.e. I don't throw anything away).
* With the equal status of commands and options, I can have commands that
  act as options, like 'cvs verbose commit'. Maybe this is not normal now,
  but can anybody give me a good reason why 'verbose' is bad, and
  '--verbose' or '-v' is not? At least, 'verbose commit' looks more
  intuitive and less technical to me, which may be a + for non
  computer-experts (until now, I cannot give a good reason to such people
  why we need to write a '-' in front of options rather than my example).

* With the equal status of commands and options, I can have options that
  act as commands, like 'rpm -q' and 'rpm -i'. This even works with
  command lines like 'rpm -qp mypackage.rpm'. I'd like to see you handle
  such cases with the current option processing packages.

* With the equal status of commands and options, I can have optional
  dashes, like in 'tar xzf myfile.tgz'. Not pretty and not recommended,
  but it fits in my solution without major headaches.

Also, I learned a few things:

* '--' is not necessarily part of a parser, i.e. it can be factored out,
  and be treated as a command or option (whatever you like).

* With my generic solution, the only difference that remains between
  options and commands is the magic involved in decoding stuff like
  '-spam'. I find this strange. Why is there no such magic with commands?
  It seems that we make some assumption with options that for some reason
  does not exist in the context of commands. Interesting questions are
  thus
  - Can we factor the magic out of the parser, and treat it as a separate
    entity?
  - Is there a similar piece of magic that works for commands? (An answer
    to this question may give a new way of specifying commands more
    efficiently / more compactly.)

* I made the step from 'options and commands' to 'pieces of text'. Getting
  rid of this (in my current view) artificial separation simplifies and
  generalizes the problem. The step may seem insignificant, but for me it
  changed the way of thinking about what we aim to do. That change may
  give rise to finding new and better approaches that are not available if
  we are blocked by the assumption that options and commands are 2
  different things that need to be treated differently/separately.

* The current option processing packages fit nicely in this framework if
  fetching words from the command-line is separated from finding a
  matching option, and collecting non-options is separated as well. These
  are not major new requirements; we already established that separating
  fetching and recognizing is beneficial. I didn't check, but I suspect
  that collecting commands is already separated as well.

I consider it advantageous to have the more generic solution. I learned a
few things, and I have more power to do things like I want rather than
being forced by the option processing package.
Sooner or later, somebody will need that power.

Enough option processing for today. I should do some work on an experiment
environment or on hardware IO, rather than processing options :-)

I hope to have made clear that I haven't yet reached the point where I
consider everything 'understood', although the number of obscure points is
getting smaller. I think there is still progress in the understanding. I
thought that sharing the experiments was nice, but apparently not
everybody shares that opinion.

The discussion of what should and should not be part of the option
processing package is a separate discussion to me. I can imagine that my
generic approach looks very wild, and seems to be wildly outside what is
considered 'option processing'. On the other hand, there does seem to be a
need for something stronger than what e.g. Optik delivers by default. That
'something stronger' is currently in the form of a work-around, which
happens to function for some cases (like cvs). It does not handle all
cases, and neither is there any hope that it ever will in its current
form.
That may or may not be bad, depending on the aim of the option processing
that we envision for Python.

Albert
--
Constructing a computer program is like writing a painting

From rsc@plan9.bell-labs.com Fri Mar 8 12:24:45 2002
From: rsc@plan9.bell-labs.com (Russ Cox)
Date: Fri, 8 Mar 2002 07:24:45 -0500
Subject: [getopt-sig] More about commands on the command line
Message-ID:

> It is true that you can parse cvs-like command lines with multiple instances of
> parser, but it is a work-around rather than a proper solution. I mean, you fix
> the problem by writing a solution around the limitations of the option
> processing package.

This is where I think those who objected (myself included) differ with
you. In my mind, CVS-like command lines are bad program design. If you've
got multiple commands, give them multiple names. I greatly dislike the
idea that a keyword early on changes the set of valid options later. At
least in the case of CVS, the split between general options and command
options is clear, because you've got this thing that is definitely not an
option sitting there. Letting actual options behave this way is a terrible
idea.

> * With the work-around, command-line arguments near the end of the line are
>   parsed/copied more than once. Not something you'd really want to have.
>   We are lucky that Python shares (string) data, otherwise it would have been a
>   potentially costly work-around.

This is entirely untrue. They don't get parsed more than once -- the first
pass STOPS when it gets to the cvs command.

> * With the equal status of commands and options, I can have commands that act
>   as options, like 'cvs verbose commit'. Maybe this is not normal now, but can
>   anybody give me a good reason why 'verbose' is bad, and '--verbose' or '-v'
>   is not ? At least, 'verbose commit' looks more intuitive and less technical
>   to me, which may be a + for non computer-experts (until now, I cannot give
>   a good reason to such people why we need to write a '-' in front of options
>   rather than my example).

We are not debating how commands should work. We are trying to agree on an
interface for a standard option parser. If you want to build something
else entirely, go ahead, but you've gone away from options now. All you're
doing is pointing out how crummy cvs is, which I won't argue against. Try
doing this to ls and you'll see where it falls apart.

> * With the equal status of commands and options, I can have options that act as
>   commands, like 'rpm -q' and 'rpm -i'. This even works with command lines like
>   'rpm -qp mypackage.rpm'.
>   I'd like to see you handle such cases with the current option processing
>   packages.

rpm is awful too. I'd like NOT to be able to handle such cases in the
common case (I don't mind if there is a work-around) so that we don't
encourage such commands.

> * With the equal status of commands and options, I can have optional dashes,
>   like in 'tar xzf myfile.tgz'. Not pretty and not recommended, but it fits in
>   my solution without major headaches.

Again, tar is nonstandard and it is reasonable to require a nonstandard
solution. The last thing we need is everyone implementing interfaces like
tar.

> * With my generic solution, the only difference that remains between options
>   and commands is the magic involved in decoding stuff like '-spam'. I find
>   this strange. Why is there no such magic with commands ?

Because commands are not options. And too many programs have adopted this
weird long-options-with-single-dash syntax.

> I consider it advantageous to have the more generic solution. I learned a few
> things, and I have more power to do things like I want rather than being forced
> by the option processing package.
> Sooner or later, somebody will need that power.

Sooner or later, somebody will greatly abuse that power. You're arguing
for nonstandard things that only complicate stuff for the user. It doesn't
bother me at all that Optik (the current proposal on the table), by side
effect of its interface, makes these more or less impossible.

> The discussion of what should and should not be part of the option processing
> package is a separate discussion to me. I can imagine that my generic approach
> looks very wild, and seems to be wildly outside what is considered 'option
> processing'. On the other hand, there does seem to be a need for something
> stronger than what e.g. Optik delivers by default. That 'something stronger' is
> currently in the form of a work-around, which happens to function for some
> cases (like cvs). It does not handle all cases, and neither is there any hope
> that it ever will in its current form.

If you want to do arbitrary parsing, use the iterator that I posted,
perhaps invoking it multiple times.

Russ

From david@sleepydog.net Fri Mar 8 12:55:49 2002
From: david@sleepydog.net (David Boddie)
Date: Fri, 8 Mar 2002 12:55:49 +0000
Subject: [getopt-sig] More about commands on the command line
In-Reply-To:
References:
Message-ID: <20020308125810.910D02AA9B@wireless-084-136.tele2.co.uk>

[I've only quoted the text I wanted to reply to, so this may appear quite
disjointed in certain places.]

On Friday 08 Mar 2002 10:59 am, A.T. Hofkamp wrote:

> Looking at the wild variety of command-lines for all programs, I'd say
> there is not much fundamental understanding of what is good or bad, and
> why. I suspect that 99.9% of the programs choose something 'because
> that and that program also do it', or 'because that is what my option
> processing package assumes', not because they know it is the best approach.

I imagine that in many cases the syntax for the arguments passed to the
program is dictated both by the ease of parsing those arguments and the
type of functionality offered by the program.
Therefore, I suspect that we see something of the internal operation of
utilities such as "tar" and "rpm" in their syntax definitions.

> I consider my experiments as a way of gaining knowledge about the option
> processing problem, so that we can weigh the pros and cons well, rather
> than blindly adopting some standard because it just seems nice (or because
> 'everybody does it') without knowing the consequences and the alternatives
> (e.g. what does option processing look like if we do want to be able to
> handle 'cvs commit'-like command lines).

We give the parser the ability to parse different styles of command lines.

> It makes options and (command) more equal citizens. Except for the special
> treatment of for example -spam (which may be magically interpreted as '-s
> -p -a -m'), commands and options have equal status.

This would depend on the style of command line which you are asking the
parser to deal with. For example, -spam may be interpreted as

1. An argument, not an option.
2. A single option: -spam
3. A number of options: -s -p -a -m
4. A single option with a following argument: -s pam
5. Some other combination of options and arguments.

Although we can hack option libraries to deal with some of these in a
natural manner and cope with the others as special cases, I propose that
to remove ambiguity a given style of command line would not mix these
option styles. So, the command line

    -spam -viking -longship

could not be interpreted as

    -spam -v -i -k -i -n -g -l ongship

or any other confused input.

> It is true that you can parse cvs-like command lines with multiple
> instances of parser, but it is a work-around rather than a proper solution.
> I mean, you fix the problem by writing a solution around the limitations of
> the option processing package.

I agree. We are in danger of rewriting options packages to deal with many
special cases rather than addressing the more general problem.
Indeed, in the cvs-like case, the complexity of the command line syntax is
being "passed upwards" to the programmer, who then may have to perform
simple syntax checking on command lines.

> The main reason for pursuing a `real' solution is that I have learned that
> code that relies on work-arounds tends to have some basic assumption that
> isn't true, at least not in all cases. A solution that can really cope with
> the situation does not have that assumption, and is thus a more generic
> solution to the problem.

We need to specify our requirements for such a solution, but not make it
too general.

> * With the equal status of commands and options, I can have commands that
> act as options, like 'cvs verbose commit'. Maybe this is not normal now,
> but can anybody give me a good reason why 'verbose' is bad, and '--verbose'
> or '-v' is not ? At least, 'verbose commit' looks more intuitive and less
> technical to me, which may be a + for non computer-experts (until now, I
> cannot give a good reason to such people why we need to write a '-' in
> front of options rather than my example).

With command line styles you could allow "verbose", "-verbose" or
"--verbose", but a mixture of these might prove problematic. You could
equally well allow both "-v" and "+v" and have them mean the same thing,
or different things.

> * With the equal status of commands and options, I can have optional
> dashes, like in 'tar xzf myfile.tgz'. Not pretty and not recommended, but
> it fits in my solution without major headaches.

Without special characters to denote options, parsing would be slightly
more difficult in this case. I imagine that the position of the options is
important in the case of "tar", so it may be a special case command line
with positional options/commands. Certainly, in extreme cases of this sort
of command line, there is plenty of scope for ambiguity.

> I consider it advantageous to have the more generic solution. I learned a
> few things, and I have more power to do things like I want rather than
> being forced by the option processing package.
> Sooner or later, somebody will need that power.

I think that we should be clear on what an option processing package
should contain, and make it sufficiently modular to allow users to leave
out or replace features they don't want or need.

The package should:

1. Parse the command line, possibly using an appropriate style definition.

2. Check the input against a syntax definition to prevent invalid or
   ambiguous input. Optional extras include:

   a. Correcting the user's input using the syntax definition and
      confirming the corrections with the user.
   b. Providing a more specific error message for the error encountered.
   c. Clarifying what the user meant in cases of ambiguous syntax
      definitions or input.

3. Extract the values given by the user and make them available to the
   programmer in a useful form.

4. Convert values according to type declarations.

The first feature would resolve any debate over the preferred style of
command line to support. It would leave only a debate on what should be
the default style.

I haven't seen much enthusiasm for the second feature so far, although I
would find it quite useful. It would allow one-shot parsing which produces
either a collection of values or an exception, depending on whether a
successful match was found.

The last two features are already present in many existing packages, but I
imagine that there is some scope for allowing different ways of presenting
values to the programmer.

> I hope to have made clear that I haven't yet reached the point where I
> consider everything 'understood', although the number of obscure points is
> getting smaller. I think there is still progress in the understanding. I
> thought that sharing the experiments was nice, but apparently not everybody
> shares that opinion.

Although I don't have the time to compare lots of libraries, I appreciate
the discussion of ideas.
I feel that without discussion we could end up with a library which suits a particular way of thinking without solving some of the more fundamental problems involving command lines.

This wouldn't be too bad, but I'm sure that many people would then go back to writing their own parsers as a result.

> The discussion of what should and should not be part of the option
> processing package is a separate discussion to me. I can imagine that my
> generic approach looks very wild, and seems to be wildly outside what is
> considered 'option processing'. On the other hand, there does seem to be a
> need for something stronger than what e.g. Optik delivers by default. That
> 'something stronger' is currently in the form of a work-around, which
> happens to function for some cases (like cvs). It does not handle all
> cases, and neither is there any hope that it ever will in its current form.

I believe that we shouldn't build an option processing package on a case by case basis.

David

From smurf@noris.de Fri Mar 8 13:06:05 2002
From: smurf@noris.de (Matthias Urlichs)
Date: Fri, 8 Mar 2002 14:06:05 +0100
Subject: [getopt-sig] More about commands on the command line
In-Reply-To: ; from a.t.hofkamp@tue.nl on Fri, Mar 08, 2002 at 11:59:19AM +0100
References: <20020304152417.I14835@noris.de>
Message-ID: <20020308140605.I17477@noris.de>

Hi,

A.T. Hofkamp:
> Ok, on to my answer on the challenging question of Matthias:
>
Thank you. ;-)

> It makes options and (command) more equal citizens.

Ok, that's a valid point, though you also have to consider arguments (either to the options, or to the program).
I wonder if, with all the generality, people who try to use the package would wonder "how the heck do I do the _common_case_". Packages like Optik handle the common case well (and reasonably seamlessly) and force people to go through some number of hoops for the special stuff, which is something I consider a Good Thing.

> * With the equal status of commands and options, I can have optional dashes,
> like in 'tar xzf myfile.tgz'. Not pretty and not recommended, but it fits in
> my solution without major headaches.

Note that this is also a "magic" command, i.e. it rips itself apart just like your "-spam" example.

> * With my generic solution, the only difference that remains between options
> and commands is the magic involved in decoding stuff like '-spam'.

(but only if it's the first argument) and might be better handled by the sufficiently simple

    if not sys.argv[1].startswith('-'):
        sys.argv[1] = '-' + sys.argv[1]

(NOTE: This test doesn't handle corner cases like 'no arguments' or 'empty argument'.)

> this strange. Why is there no such magic with commands ?

Because people are lazy, and typing "-s -p -a -m" is three times as much work as typing "-spam". Note that the "magic" depends on neither s, p, nor a taking an argument.

Besides, traditionally Unix doesn't _have_ commands. It has different tools. I consider "rpm -q" a totally different tool from "rpm -U", which should have different access privileges and whatnot. Thus, what RPM 4 does internally is Not A Surprise:

    $ strace -e execve /bin/rpm -q foo
    execve("/bin/rpm", ["/bin/rpm", "-q", "foo"], [/* 55 vars */]) = 0
    execve("/usr/lib/rpm/rpmq", ["/usr/lib/rpm/rpmq", "-q", "--", "foo"], [/* 55 vars */]) = 0
    package foo is not installed
    $

They should have done that right in the first place.

> That may or may not be bad, depending on the aim of the option processing that
> we envision for Python.
>
Your code certainly helps with (eventually ;-) arriving at some sort of consensus as to what we want to accomplish.
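
As an illustration of the two tricks discussed in this message (prepending a dash to a tar-style first argument, and ripping "-spam" apart into "-s -p -a -m"), here is a minimal sketch in present-day Python. The function names and the takes_argument set are assumptions made for this example; they are not part of Optik, getopt, or any code posted to the list. The rule that the rest of a cluster becomes the argument of an option that wants one follows the usual getopt convention.

```python
def add_leading_dash(args):
    """Return a copy of args whose first word starts with '-'.

    Unlike the two-line test in the message, this tolerates an empty
    argument list and an empty first argument (both returned unchanged).
    """
    if args and args[0] and not args[0].startswith('-'):
        return ['-' + args[0]] + list(args[1:])
    return list(args)


def expand_cluster(word, takes_argument=frozenset()):
    """Split a bundled short-option word such as '-spam'.

    takes_argument is a hypothetical set of option letters that consume
    an argument. When such a letter is met, the remainder of the word is
    treated as its argument, so the "magic" only applies to the letters
    before it.
    """
    # Leave long options, single options, and plain words alone.
    if not word.startswith('-') or word.startswith('--') or len(word) < 3:
        return [word]
    result = []
    letters = word[1:]
    for i, c in enumerate(letters):
        result.append('-' + c)
        if c in takes_argument:
            rest = letters[i + 1:]
            if rest:
                result.append(rest)
            break
    return result


if __name__ == '__main__':
    print(add_leading_dash(['xzf', 'myfile.tgz']))
    print(expand_cluster('-spam'))
    print(expand_cluster('-vofile', takes_argument={'o'}))
```

Both helpers return new lists rather than modifying sys.argv in place, which keeps them easy to test and leaves the decision of when to apply them to the caller.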
--
Matthias Urlichs | noris network AG | http://smurf.noris.de/

From smurf@noris.de Fri Mar 8 13:16:07 2002
From: smurf@noris.de (Matthias Urlichs)
Date: Fri, 8 Mar 2002 14:16:07 +0100
Subject: [getopt-sig] More about commands on the command line
In-Reply-To: <20020308140605.I17477@noris.de>; from smurf@noris.de on Fri, Mar 08, 2002 at 02:06:05PM +0100
References: <20020304152417.I14835@noris.de> <20020308140605.I17477@noris.de>
Message-ID: <20020308141607.J17477@noris.de>

Hi,

I wrote:
> if not sys.argv[1].startswith('-'):
> sys.argv[1] = '-' + sys.argv[1]
>
> (NOTE: This test doesn't handle corner cases like 'no arguments' or
> 'empty argument'.)
>
... and completely forgot that this doesn't work for "tar" at all.

But, given an option package which allows introspection, writing a generic code snippet which rips this apart is no problem at all. Untested code:

    if not sys.argv[1].startswith('-'):
        rargs=sys.argv[2:]
        args=[]
        for c in sys.argv[1]:
            args.append("-"+c)
            # opts.option(c).wants_an_argument() stands for whatever
            # introspection the option package offers
            if opts.option(c).wants_an_argument():
                args.append(rargs[0])
                del rargs[0:1]
        # sys.argv[0] is a string, so wrap it in a list before joining
        sys.argv=[sys.argv[0]]+args+rargs

so I think that "tar" is a rather noninteresting example.

--
Matthias Urlichs | noris network AG | http://smurf.noris.de/

From gward@python.net Fri Mar 8 14:17:39 2002
From: gward@python.net (Greg Ward)
Date: Fri, 8 Mar 2002 09:17:39 -0500
Subject: [getopt-sig] More about commands on the command line
In-Reply-To: References: <20020304152417.I14835@noris.de>
Message-ID: <20020308141739.GA1224@gerg.ca>

On 08 March 2002, A.T. Hofkamp said:
> At least some people are not very happy with my experiment to provide a more
> generic approach to option processing. I don't really understand why,
> apparently there is a clash in goals or in the approach of the problem.

Wild experimentation is always a good thing, but it almost certainly doesn't belong in the Python standard library.
So feel free to perform your experiments and discuss them here, but keep in mind that the primary purpose of this SIG is to come up with something better than the existing getopt module. Experiments in command-line interface design, interesting as they may be, are second-class citizens in the discourse of this SIG. > Everybody that has read my reasoning and/or my code will have seen > that I tend to take orthogonality to the limit, and then try to take > another step. Also, I tend to split everything in as small pieces as > possible with a single well-defined function. There are a couple of > reasons for doing that. > - Having a number of orthogonal pieces that the user can compose in > any way he wants gives power to the user. He is able to use the > pieces in ways we cannot imagine. I think others have beaten most of this issue to death, but I'd like to add one point: there's such a thing as too much orthogonality, or too much object-orientation. My case in point is the Java I/O library. I only used Java for a little while, and it was two years ago, but the pain lingers on. There are about 47 different fiddly little classes (some of them not so little) involved in writing to a file with Java, and I spent waaaay more time paging through docs than actually writing code. And goodness knows how many method-lookup-and-calls are involved every time you do a write (or a read). Someone recently mentioned on python-dev a case of overgrown formatting classes in Java causing 7000 function calls every time some app wrote to its log file. Yow! I think what I'm getting at here can be summed up quite simply: premature generalization is the root of much evil. You should never generalize/orthogonalize a design simply for the sake of it; rather, you should implement, deploy, and *use* the simplest design you can, and then refactor as needed. 
That way, the divisions between your classes/modules will fall along the lines actually needed in real life, not along every possible line you could think of during the design. At any rate, that's the way I'm approaching Optik. Starting from two fundamental classes -- Option and OptionParser -- I have refactored many methods to make subclassing easier. Lately, I have been thinking of factoring a HelpFormatter class out of OptionParser. I have also been thinking of splitting the Option class up along "action" lines -- StoreOption, AppendOption, HelpOption, CountOption, etc. But those divisions were *not* obvious from the start; they have only become clear after several months and various attempts to add interesting-but-not- essential functionality by subclassing. Greg -- Greg Ward - programmer-at-big gward@python.net http://starship.python.net/~gward/ I love ROCK 'N ROLL! I memorized the all WORDS to "WIPE-OUT" in 1965!! From gward@python.net Fri Mar 8 14:20:46 2002 From: gward@python.net (Greg Ward) Date: Fri, 8 Mar 2002 09:20:46 -0500 Subject: [getopt-sig] More about commands on the command line In-Reply-To: References: Message-ID: <20020308142046.GB1224@gerg.ca> On 08 March 2002, Russ Cox said: > This is where I think those who objected (and myself) differ with you. > In my mind, CVS-like command lines are bad program design. If you've > got multiple commands, give them multiple names. I only half agree with you here. I definitely think tar and rpm suck for all the reasons stated. But CVS is not too bad; it really doesn't matter whether you spell it "cvs commit" or "cvs-commit", because it's crystal clear what the sub-command is. In the case of the Distutils, where we quite obviously stole CVS' UI, there's a very good reason for a "sub-command" style interface: there is only one setup.py with a given package. We didn't want to make people distribute build.py, install.py, etc. (although as I recall that idea never even came up). 
I think it works pretty well, again because it's quite clear what the sub-command is. Greg -- Greg Ward - Unix nerd gward@python.net http://starship.python.net/~gward/ Dyslexics of the world, untie! From a.t.hofkamp@tue.nl Tue Mar 12 16:32:03 2002 From: a.t.hofkamp@tue.nl (A.T. Hofkamp) Date: Tue, 12 Mar 2002 17:32:03 +0100 (CET) Subject: [getopt-sig] More about commands on the command line In-Reply-To: Message-ID: On Fri, 8 Mar 2002, Russ Cox wrote: > Subject: Re: [getopt-sig] More about commands on the command line Wow, so many responses in so little time. I tried to create some structure in the chaos last weekend, but it didn't really work. The issues are too much inter-connected with each other. So instead, I will simply answer to the replies when I consider it necessary. Others, feel free to add or comment. > > It is true that you can parse cvs-like command lines with multiple instances of > > parser, but it is a work-around rather than a proper solution. I mean, you fix > > the problem by writing a solution around the limitations of the option > > processing package. > > This is where I think those who objected (and myself) differ with you. > In my mind, CVS-like command lines are bad program design. If you've Most of us may think so (I am not sure what to think of it), but the fact is, these programs are out there. At least the programmer considered their interface the best feasible, and most people seem to be able to work with the programs, and are not so annoyed that they modify the code. The only conclusion that I can draw is that users can handle a lot of ambiguity and (to us) crummy interfaces. The next question is then obvious, should we ignore these programs and leave the programmers on their own (i.e. have them start from scratch), or should we try to structure the problem for them ? Every option processing package I have seen until now does the former. 
They implement the (easy) POSIX sub-set for what an option is, and totally ignore the fact that a programmer has a much bigger problem (in fact, in some cases, the command-line interface w.r.t. options is a very tiny problem). I think that ignoring the bigger picture is exactly the reason why every option processing package, except "getopt" fails to catch on. "getopt", being the only extremely flexible solution fits in just about any command line interface scheme, and is more or less the only option (excuse the pun) left if the demands of the programmer are bigger than what an option package delivers. To overcome this problem, an option processing package has to extremely open and modular, allowing it to be used partly, or in a way we will never think of. > got multiple commands, give them multiple names. I greatly dislike the > idea that keyword early on changes the set of valid options later. At least You may think so, but I don't think that we as library designer can tell a programmer what he should or should not do, especially in an open source environment. It is much the same as when I would tell you to use for example /bin/csh, because I think all the other shells are evil. (I don't think that by the way). The only thing that we can do here is give him the power to organize things in his own way, and add a warning that he may be heading for trouble, because of so and so. In that way, you have warned him that it may not be wise to proceed, yet you didn't limit him in doing it anyway. Some programmers may 'make' it, they will produce something useful, which just happens to be outside our norms. Others will fail, and get bitten by their own creations. I don't think that it is bad if programmers burn their hands when playing with something powerful. If one is given power, one should respect it, and use it properly. If you don't, you will get burned. However, people (programmers included) cannot learn that if they are protected from powerful stuff. 
They can only learn that by experience. > > * With the work-around, command-line arguments near the end of the line are > > parsed/copied more than once. Not something you'd really want to have. > > We are lucky that Python shares (string) data, otherwise it would have been a > > potentially costly work-around. > > This is entirely untrue. They don't get parsed more than once -- the first > pass STOPS when it gets to the cvs command. I phrased it sloppy I am afraid. What I meant to say was that, in order to parse the remaining arguments, you need to create a 'new' command line for the next parser instance. Thus, you have to copy all remaining elements in a new list (and maybe do something special for the first element). If you have a lot of commands, and thus a lot of sub-parse runs, you end up creating a lot of lists. Clearly, the processing package should deliver a one-shot parsing solution in these cases. > > * With the equal status of commands and options, I can have commands that act > > as options, like 'cvs verbose commit'. Maybe this is not normal now, but can > > anybody give me a good reason why 'verbose' is bad, and '--verbose' or '-v' > > is not ? At least, 'verbose commit' looks more intuitive and less technical > > to me, which may be a + for non computer-experts (until now, I cannot give > > a good reason to such people why we need to write a '-' in front of options > > rather than my example). > > We are not debating how commands should work. I don't entirely agree with you. The above is more or less excluded as possibility because of assumptions of e.g. option processing packages. By getting rid of these assumptions (or at least, making them explicit), people may get new ideas or see new opportunities for implementing a better command-interface. I think that it is important that the option processing package should allow as much styles as possible, even if they are currently not commonly used. > rpm is awful too. 
I'd like NOT to be able to handle such cases in the > common case (I don't mind if there is a work around) so that we don't > encourage such commands. This may be the source of a lot of problems. I _NEVER_ said that I consider stuff like cvs, tar or rpm THE COMMON CASE (though Greg does for cvs). I think that Optik more or less covers the common case. However, what we are going to deliver should be flexible enough to be useful in the non-common case as well. > Again, tar is nonstandard and is reasonable to require a nonstandard solution. > The last thing we need is everyone implementing interfaces like tar. I agree completely. > Sooner or later, somebody will greatly abuse that power. Yep, and we are not going to stop them. Any piece of useful software can be abused. For example, I can use Python to compute 1+1. Is that bad ? No, in my opinion. Maybe the abuse is justified (e.g. I suddenly have a extremely lack of confidence in the computing abilities of my computer). Sometimes it is not. In the latter case, others will fix the problems (if the abuser does not first). Like I said, you only learn to respect fire by getting burned (quite literally, in this case). > You're arguing for nonstandard things that only complicate > stuff for the user. It doesn't bother me at all that Optik (the Nope, I extend the power for the non-common case. There is a real important function for the documentation here. It should clearly state the 3 levels of complexity: 1) iterator-like processing 2) the common case 3) anything beyond the common case. Especially with 3, the manual should warn that the territory is potentionally hazardous both for the programmer and the user. I think we should try to reach agreement on a) what exactly is in 'the common case' ? b) what is in 3 ? (if you ask me, as much as possible, with the condition that as a programmer I should be able to use code from 1 and 2.) 
> current proposal on the table), by side effect of its interface, > makes these more or less impossible. If that is the goal, we are wasting our time here. We can simply pick any option processing package. I can also predict that it will fail, just like all the other option processing packages that exist. > If you want to do arbitrary parsing, use the iterator that I posted, perhaps > invoking it multiple times. I know, but that is too low level. If I decide (after much thinking) to write code beyond the common case, I have a very complex problem. Exactly in that case, I need all the help I can get. So why is it then not possible to re-use the Option-object framework, exactly at a moment I most need it ? (and yes, these programs exist. look at distustils, cvs, tar, rpm, and probably many others). Albert -- Constructing a computer program is like writing a painting From rsc@plan9.bell-labs.com Tue Mar 12 16:47:47 2002 From: rsc@plan9.bell-labs.com (Russ Cox) Date: Tue, 12 Mar 2002 11:47:47 -0500 Subject: [getopt-sig] More about commands on the command line Message-ID: Most of our disagreement is philosophical, so I don't really expect to reach any sort of consensus here. However, this is just false: > If you have a lot of commands, and thus a lot of sub-parse runs, you end up > creating a lot of lists. Clearly, the processing package should deliver a > one-shot parsing solution in these cases. Why? Is argument parsing really the bottleneck in your applications? It's not in mine. Further, assuming that the option parser gives you back the list of unparsed command arguments, you're already holding the list you need -- you don't even have to write lots of code. List slices are cheap. This is a non-issue. Russ From a.t.hofkamp@tue.nl Tue Mar 12 17:07:08 2002 From: a.t.hofkamp@tue.nl (A.T. 
Hofkamp) Date: Tue, 12 Mar 2002 18:07:08 +0100 (CET) Subject: [getopt-sig] More about commands on the command line In-Reply-To: <20020308125810.910D02AA9B@wireless-084-136.tele2.co.uk> Message-ID: On Fri, 8 Mar 2002, David Boddie wrote: > > that and that program also do it', or 'because that is what my option > > processing package assumes', not because they know it is the best approach. > > I imagine that in many cases the syntax for the arguments passed to the > program is dictated both by the ease of parsing those arguments and the > type of functionality offered by the program. Therefore, I suspect that > we see something of the internal operation of utilities such as "tar" > and "rpm" in their syntax definitions. That may be the case. Anyway, they considered it the best reachable solution (which may be different from _KNOWING_ the best approach). > We give the parser the ability to parse different styles of command lines. Good. I haven't yet seen that here (or maybe I wasn't looking). Can you give a better explanation of this interesting subject ? How can we specify this in a enough generic way ? > > It is true that you can parse cvs-like command lines with multiple > > instances of parser, but it is a work-around rather than a proper solution. > > I mean, you fix the problem by writing a solution around the limitations of > > the option processing package. > > I agree. We are in danger of rewriting options packages to deal with many > special cases rather than addressing the more general problem. I agree (obviously :-) ). > Indeed, in the cvs-like case, the complexity of the command line syntax > is being "passed upwards" to the programmer, who then may have to perform > simple syntax checking on command lines. I am not entirely sure whether that can be eliminated completely, but at least an attempt should be made. In some sense, it can be compared with the "far pointers" in C/Pascal/etc compilers on Windows systems in the 80's and 90's. 
The problem is that Intel processors had(/s?) 2 different pointers. Pointers in a segment (which are 16 bit), and pointers to anywhere (which are 32 bit). The C language states that the compiler should hide implementation details. What Microsoft and Borland did was to base their entire compiler on the use of segments, and introduce a 'far' keyword for 32bit pointers. Effectively, they dumped the entire segment mess of the Intel processors on their users. Amazingly, users accepted this "extension". I still consider it very good marketing to trick compiler users into handling messy processor details that should have been hidden by the compiler implementation. To prevent such mess, I think that the work-around is not solution. If we want to be able to handle cvs-like commands (like Greg says he wants to), we should get a proper solution. > We need to specify our requirements for such a solution, but not make it > too general. I agree there is a danger of becoming too general. However, I also think we haven't yet reached that point. My 'generic solution' clearly eliminates some limitations that Optik has. When we re-structure code without an improvement in usefulness, we should start worrying... :-) > With command line styles you could allow "verbose", "-verbose" or > "--verbose", but a mixture of these might prove problematic. You could > equally well allow both "-v" and "+v" and have them mean the same thing, > or different things. He, +v is new. Can Optik do that ? I think it is not our job to define what a programmer can or cannot do. Given the huge changes in computing, who knows what programmers can and want tomorrow ? > > * With the equal status of commands and options, I can have optional > > dashes, like in 'tar xzf myfile.tgz'. Not pretty and not recommanded, but > > it fits in my solution without major head aches. > > Without special characters to denote options, parsing would be slightly > more difficult in this case. 
I imagine that the position of the options > is important in the case of "tar", so it may be a special case command > line with positional options/commands. I don't really agree with the example. I only mention it here because it _can_ be handled without too much trouble, while the general opinion here recently was that that example would be impossible to handle. > Certainly, in extreme cases of this sort of command line, there is plenty > of scope for ambiguity. Yes, and users and programmers do not seem to be bothered by it. This is also where I consider documentation very important. For example, we can support optional arguments or optional dashes, but add a warning "this may be dangerous, because ....". The programmer can then decide whether he considers that convincing enough to search for a different solution. > > I consider it advantageous to have the more generic solution. I learned a > > few things, and I have more power to do things like I want rather than > > being forced by the option processing package. > > Sooner or later, somebody will need that power. > > I think that we should be clear on what an option processing package > should contain, and make it sufficiently modular to allow users to > leave out or replace features they don't want or need. I agree completely. Especially getting as specific as possible for "the common case" is important. > The first feature would resolve any debate over the preferred style of > command line to support. It would leave only a debate on what should be I currently don't know how to specify something like that. Estimating the impact is even more difficult. We should also consider platform-specific issues. For example, can and should we handle the Windows-style of options, and/or the Mac-style (if it has one) ? > I haven't seen much enthusiasm for the second feature so far, although I > would find it quite useful. 
It would allow one-shot parsing which > produces either a collection of values or an exception, depending on > whether a successful match was found. I don't know whether it would not be too complex. The most extreme solution would be to have a scanner and a parser, and consider the command line a sentence in a language. For option processing, I think that solution is somewhat wildly outside what we want. Can you see a cheaper solution ? > Although I don't have the time to compare lots of libraries, I appreciate > the discussion of ideas. I feel that without discussion we could end up > with a library which suits a particular way of thinking without solving > some of the more fundamental problems involving command lines. > > This wouldn't be too bad, but I'm sure that many people would then go > back to writing their own parsers as a result. That is what you can see happening. Why else is getopt still the standard ? (not only in Python, but also in e.g. C or C++). Unless we close the gap between what option packages deliver and what programmers need, our solution is "just another option processing package". > > happens to function for some cases (like cvs). It does not handle all > > cases, and neither is there any hope that it ever will in its current form. > > I believe that we shouldn't build an option processing package on a case > by case basis. I agree, we should look at the big picture, and try to capture the general case. However, cases are useful for examples. Albert -- Constructing a computer program is like writing a painting From a.t.hofkamp@tue.nl Tue Mar 12 17:13:58 2002 From: a.t.hofkamp@tue.nl (A.T. Hofkamp) Date: Tue, 12 Mar 2002 18:13:58 +0100 (CET) Subject: [getopt-sig] More about commands on the command line In-Reply-To: <20020308140605.I17477@noris.de> Message-ID: On Fri, 8 Mar 2002, Matthias Urlichs wrote: > > It makes options and (command) more equal citizens. 
> > Ok, that's a valid point, though you also have to consider arguments > (either to the options, or to the program). > > I wonder if, with all the generality, people who try to use the package > would wonder "how the heck do I do the _common_case_". Packages like This is where documentation comes in. Lots of examples, careful seperation between the normal case and the 'dangerous' stuff. > Optick handle the common case well (and reasonably seamlessly) and force > people to go through some number of hoops for the special stuff, which is > something i consider a Good Thing. Yes, but Optik's usability (or rather the usability of our resulting package) should not drop to 0 when the demands are bigger than what can be delivered. > > like in 'tar xzf myfile.tgz'. Not pretty and not recommanded, but it fits in > > my solution without major head aches. > > Note that this is also a "magic" command, i.e. it rips itself apart just > like you "-spam" example. Good point. > Besides, traditionally Unix doesn't _have_ commands. It has differrent > tools. I consider "rpm -q" a totally different tool from "rpm -U", which Unix also didn't use to have graphics and stuff like Gnome and KDE. While I hate it, many users cannot live without (for example, we recently had complaints that some of our systems were unusable, because the new KDE and Gnome versions were not installed). Our package should be ready for the future in the sense that it should not stop people from solving their problem in the way they want to. That it takes extra effort to do something non-standard is not a problem in my opinion. > Your code certainly helps with (eventually ;-) arriving at some sort of > consensus as to what we want to accomplish. 
Let's hope so :-) Albert -- Constructing a computer program is like writing a painting From rsc@plan9.bell-labs.com Tue Mar 12 17:20:12 2002 From: rsc@plan9.bell-labs.com (Russ Cox) Date: Tue, 12 Mar 2002 12:20:12 -0500 Subject: [getopt-sig] More about commands on the command line Message-ID: <08e6c47aac70c9a6c9e26635fe445da7@plan9.bell-labs.com> Let's get Optik right for the common case and in the Python standard library and _then_ see what people want for nonstandard stuff. If we try to plan for everything people will want we'll end up with a tool that no one will want because it will be impossible to use and always do almost but not quite the right thing. It's not like Optik becomes immutable if we put it into the standard library. Back to more mundane things, Greg, why do you think that there should be both the list interface and the .add_option interface? Why not make the parser a subclass of list instead? Having two interfaces suggests to me that something is not quite right. The other problem with having more than one way to do it is that everyone will do it different ways, not know about the other ways, and be confused when they run into people who do it differently from them. Russ From a.t.hofkamp@tue.nl Tue Mar 12 17:35:57 2002 From: a.t.hofkamp@tue.nl (A.T. Hofkamp) Date: Tue, 12 Mar 2002 18:35:57 +0100 (CET) Subject: [getopt-sig] More about commands on the command line In-Reply-To: <20020308141739.GA1224@gerg.ca> Message-ID: On Fri, 8 Mar 2002, Greg Ward wrote: > Wild experimentation is always a good thing, but it almost certainly > doesn't belong in the Python standard library. So feel free to perform > your experiments and discuss them here, but keep in mind that the > primary purpose of this SIG is to come up with something better than the > existing getopt module. Experiments in command-line interface design, > interesting as they may be, are second-class citizens in the discourse > of this SIG. 
On the other hand, if an experiment succeeds to deliver something useful, we should adopt it. Note that even my experiment is better than getopt, so theoretically, it is a candidate solution :-) > > - Having a number of orthogonal pieces that the user can compose in > > any way he wants gives power to the user. He is able to use the > > pieces in ways we cannot imagine. > > I think others have beaten most of this issue to death, but I'd like to I think not, given the gap between what packages deliver and what programmers need. > add one point: there's such a thing as too much orthogonality, or too > much object-orientation. My case in point is the Java I/O library. I Yes there is, My idea of Options storing the value is an example where non-object-oriented is better than object-oriented. On the other hand, as long as more object-orientation delivers more useful power to the programmer, it is a step in the right direction. > pain lingers on. There are about 47 different fiddly little classes > (some of them not so little) involved in writing to a file with Java, I do have some space before being there :-) > python-dev a case of overgrown formatting classes in Java causing 7000 > function calls every time some app wrote to its log file. Yow! Wow, no surprise that Java is not one of the fastest languages in the west. > I think what I'm getting at here can be summed up quite simply: > premature generalization is the root of much evil. You should never Maybe, but generalization is very powerful in finding assumptions that linger somewhere deep in the code. Refactoring code for some case makes it easy to dismiss these assumptions as "a special case", in which case they continue to linger, hindering progress to the next level. > generalize/orthogonalize a design simply for the sake of it; rather, you > should implement, deploy, and *use* the simplest design you can, and True. 
Option processing at first seems to be a very simple operation in essence, and thus should need very little code. Yet Optik (and other option processing packages as well) is large and complex. That is a strong indication that either something fundamental is wrong in the implementation, or option processing is not as simple as it looks. You are not going to find the cause by refactoring, because you are focused on details in the code rather than on the big picture.

> then refactor as needed. That way, the divisions between your
> classes/modules will fall along the lines actually needed in real life,
> not along every possible line you could think of during the design.

I don't do that afaik; at least I separate experiments from discussions of whether or not they should be included.

> At any rate, that's the way I'm approaching Optik. Starting from two
> fundamental classes -- Option and OptionParser -- I have refactored many
> methods to make subclassing easier. Lately, I have been thinking of
> factoring a HelpFormatter class out of OptionParser. I have also been
> thinking of splitting the Option class up along "action" lines --
> StoreOption, AppendOption, HelpOption, CountOption, etc. But those
> divisions were *not* obvious from the start; they have only become clear
> after several months and various attempts to add interesting-but-not-
> essential functionality by subclassing.

Yet the changes are not fundamental. For non-fundamental changes, refactoring is fine (just potentially a lot of work). I try to find the fundamental elements of option processing (and beyond, if possible). Refactoring will not work there, because it is too much work to drag all the code along. (That obviously does not exclude the possibility of making fundamental changes to a package by refactoring; it just takes more effort.)

In short, refactoring and making wild experiments are two different techniques aimed at finding out different things.
Both have a place in the universe.

Albert
--
Constructing a computer program is like writing a painting

From a.t.hofkamp@tue.nl Tue Mar 12 17:41:25 2002
From: a.t.hofkamp@tue.nl (A.T. Hofkamp)
Date: Tue, 12 Mar 2002 18:41:25 +0100 (CET)
Subject: [getopt-sig] More about commands on the command line
In-Reply-To: <08e6c47aac70c9a6c9e26635fe445da7@plan9.bell-labs.com>
Message-ID:

On Tue, 12 Mar 2002, Russ Cox wrote:
> Let's get Optik right for the common case and
> in the Python standard library and _then_ see
> what people want for nonstandard stuff.
>
> It's not like Optik becomes immutable if we put it
> into the standard library.

I know, but it is not a matter of modifying; it is more a matter of restarting from scratch. Also, once you have published a 'standard', it becomes extremely difficult to modify the interface, because that will break existing code. Therefore, making sure now that Optik is useful even for non-standard situations saves a lot of headaches later.

Albert
--
Constructing a computer program is like writing a painting

From david@sleepydog.net Tue Mar 12 18:11:17 2002
From: david@sleepydog.net (David Boddie)
Date: Tue, 12 Mar 2002 18:11:17 +0000
Subject: [getopt-sig] More about commands on the command line
In-Reply-To:
References:
Message-ID: <20020312181344.4F8762B0B8@wireless-084-136.tele2.co.uk>

On Tuesday 12 Mar 2002 5:07 pm, you wrote:
> On Fri, 8 Mar 2002, David Boddie wrote:
> > We give the parser the ability to parse different styles of command
> > lines.
>
> Good. I haven't yet seen that here (or maybe I wasn't looking). Can you
> give a better explanation of this interesting subject ?

We can treat the command line input as having a particular style. Options may begin with certain characters such as "-", the style may allow "--" as well, and "-abc" may be automatically expanded to "-a -b -c". The style may forbid "-spam" as a single option called "spam", or allow such options as in the style of the "xscreensaver-command" utility.
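The notion of a "style" described above can be made concrete with a small Python sketch. Everything in it is an assumption made for illustration: the Style class, its field names and the expand() function are hypothetical and are not part of Optik or of any package discussed in this thread. It only shows how one set of style parameters expands "-abc" to "-a -b -c", while another keeps "-spam" as a single word.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Style:
    """Hypothetical description of a command-line style (not an Optik API)."""
    short_prefix: str = "-"     # introduces single-letter options
    long_prefix: str = "--"     # introduces long options; "" disables them
    bundle_short: bool = True   # whether "-abc" means "-a -b -c"


def expand(words, style):
    """Return the words with bundled short options split apart."""
    out = []
    for word in words:
        is_long = bool(style.long_prefix) and word.startswith(style.long_prefix)
        is_short = (not is_long
                    and word.startswith(style.short_prefix)
                    and len(word) > len(style.short_prefix))
        if is_short and style.bundle_short:
            # Split "-abc" into one word per letter, each with the prefix.
            letters = word[len(style.short_prefix):]
            out.extend(style.short_prefix + letter for letter in letters)
        else:
            # Long options, positional arguments and unbundled words pass through.
            out.append(word)
    return out


posix_like = Style()
single_dash_words = Style(long_prefix="", bundle_short=False)

print(expand(["-abc", "--verbose", "file"], posix_like))
print(expand(["-spam", "file"], single_dash_words))
```

The sketch deliberately ignores option arguments: a word such as "-oFILE" would be split letter by letter, so a real implementation would have to know which options take a value before expanding a bundle.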
We can also specify a syntax definition in a number of styles, but that might prove to be distracting, although it might be useful for certain audiences. For example:

Imagine a platform in which options are specified using the "/" character rather than the "-" character. A syntax string might be written for the benefit of the user as:

  Syntax: myscript infile /o outfile

This is then interpreted in the same way as we might interpret

  Syntax: myscript infile -o outfile

> How can we specify this in a generic enough way ?

I think that a solution which is fairly specific to the command line might be reasonably achieved by identifying the types of object which are commonly found on command lines. Objects such as options, commands and positional arguments tend to be treated differently, although in some cases they may be equivalent.

> > I agree. We are in danger of rewriting options packages to deal with many
> > special cases rather than addressing the more general problem.
>
> I agree (obviously :-) ).

The agreement operator is commutative for this case. ;-)

> > Indeed, in the cvs-like case, the complexity of the command line syntax
> > is being "passed upwards" to the programmer, who then may have to perform
> > simple syntax checking on command lines.
>
> I am not entirely sure whether that can be eliminated completely, but at
> least an attempt should be made.

I think that the use of a number of complementary libraries might be useful in dealing with complexity. While one library may be unsuitable for some tasks, another may be very good at exactly those kinds of tasks.

> I think it is not our job to define what a programmer can or cannot do.
> Given the huge changes in computing, who knows what programmers can and
> want tomorrow ?

The styles of command lines in use now may not change much in the future, but it would be a brave person who suggests that they will never change.
More interesting developments might come from the use of command line syntax definitions to construct simple graphical user interfaces. If this occurred automatically as a side effect of using the standard library, then non-command line users would also benefit from the development of command line applications. I'm not suggesting that this module should include this, but it is interesting to consider the possibility.

> We should also consider platform-specific issues. For example, can and
> should we handle the Windows-style of options, and/or the Mac-style (if it
> has one) ?

The "os" module might be the correct place for such information. It could be useful to consider such a feature for the sake of native users on any given platform.

> > I haven't seen much enthusiasm for the second feature so far, although I
> > would find it quite useful. It would allow one-shot parsing which
> > produces either a collection of values or an exception, depending on
> > whether a successful match was found.
>
> I don't know whether it would not be too complex. The most extreme solution
> would be to have a scanner and a parser, and consider the command line a
> sentence in a language. For option processing, I think that solution is
> somewhat wildly outside what we want.

For certain types of syntax, it would make life easier to have a library deal with the structure of the syntax and extract values from the user input. For the test case at

  http://www.python.org/sigs/getopt-sig/compare.html

a solution which just takes the options it finds is clearly going to be less work. In fact, my original solution scaled very badly with increasing numbers of consecutive optional arguments, so 14 consecutive options was quite a challenge for it.

> Can you see a cheaper solution ?

The library might determine the location of positional arguments automatically and treat the groups of options between them as independent entities.
This is more or less along the lines of earlier suggestions except that the application programmer wouldn't have to perform the necessary operations manually.

> > This wouldn't be too bad, but I'm sure that many people would then go
> > back to writing their own parsers as a result.
>
> That is what you can see happening. Why else is getopt still the standard ?
> (not only in Python, but also in e.g. C or C++).
>
> Unless we close the gap between what option packages deliver and what
> programmers need, our solution is "just another option processing package".

I agree completely.

> > I believe that we shouldn't build an option processing package on a case
> > by case basis.
>
> I agree, we should look at the big picture, and try to capture the general
> case. However, cases are useful for examples.

We have to start somewhere, yes.

David

From gward@python.net Tue Mar 12 23:29:37 2002
From: gward@python.net (Greg Ward)
Date: Tue, 12 Mar 2002 18:29:37 -0500
Subject: [getopt-sig] More about commands on the command line
In-Reply-To: <08e6c47aac70c9a6c9e26635fe445da7@plan9.bell-labs.com>
References: <08e6c47aac70c9a6c9e26635fe445da7@plan9.bell-labs.com>
Message-ID: <20020312232937.GA21831@gerg.ca>

On 12 March 2002, Russ Cox said:
> Back to more mundane things, Greg, why do you
> think that there should be both the list interface
> and the .add_option interface?

Silly historical reasons, probably dating back to the Getopt::Tabular Perl module I wrote about five years ago. add_option() was added when I realized how easy and elegant it was, which is why the documentation favours it.
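For readers who have not seen the two interfaces side by side, here is a minimal sketch. It uses optparse, the standard-library module that Optik later became, and assumes that the Optik of this thread offered essentially the same option_list, make_option and add_option names; the "-v" and "-o" options are made up for the example.

```python
from optparse import OptionParser, make_option

# Interface 1: hand the parser a ready-made list of options.
option_list = [
    make_option("-v", "--verbose", action="store_true",
                dest="verbose", default=False),
    make_option("-o", "--output", dest="output", metavar="FILE"),
]
list_parser = OptionParser(option_list=option_list)

# Interface 2: build the same parser one add_option() call at a time.
call_parser = OptionParser()
call_parser.add_option("-v", "--verbose", action="store_true",
                       dest="verbose", default=False)
call_parser.add_option("-o", "--output", dest="output", metavar="FILE")

# Both parsers should treat the same command line identically.
argv = ["-v", "--output=result.txt", "input.txt"]
for parser in (list_parser, call_parser):
    options, args = parser.parse_args(argv)
    print(options.verbose, options.output, args)
```

Both routes end up with the same set of options, which is the redundancy Russ is asking about.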
> Why not make the
> parser a subclass of list instead?

Nah, I'd rather say add_option() than append(). It's clearer that way. I rarely see a burning need to treat an OptionParser as a list of options. Yes, order matters so that the help looks sensible, but that's about it.

Greg
--
Greg Ward - nerd gward@python.net
http://starship.python.net/~gward/
"Question authority!" "Oh yeah? Says who?"