need some guidance on Python syntax smart editor for use with speech recognition
Eric S. Johansson
esj at harvee.org
Mon Jan 5 02:43:11 EST 2015
Some of you will recognize me as someone who pops up occasionally asking
questions as I grope my way to a usable speech driven programming
environment. My last set of experiments with a technique called
togglename and speech driven template notation hit a pretty nasty wall
of usability because of a fundamental incompatibility between GUIs and
speech recognition and the lack of support Nuance gives to disabled
users in general.
Before anybody suggests it, yes, I know about that guy who gave a talk at
a Python convention and uses what we call the burp, belch, and fart
school of speech recognition engine abuse. Yes, that is actually an
affectionate description. :-) What he did is impressive, but it's not
where I'm going.
I think the techniques I was experimenting with are good ones because
they do make it easier to speak code. The problem comes about because
the transformation is irreversible, which leaves editing code as
difficult as it was before.
A little background. Today, Python is an amazingly speech recognition
friendly programming language (especially if you ignore PEP 8). Using
simple macros, you can pretty much noodle along and write code
relatively easily. A few more specialized pieces and it's almost easy to
rip, shred, and tear code into new shapes as you realize you went down
the wrong path but still have lots of good idioms.
However, as easy as it is to noodle along creating code, I find myself
somewhere around 0.8 times as effective as I was with my hands, and in
editing code I'm around 0.5 or less. My goal is to make speech driven
programming at least on a par with someone who has useful hands and
hopefully 3 to 5 times faster.
A few years ago, a disabled friend of mine pointed out that the hard
problem was not the creation of code but the editing of code. I took his
observations to heart and have been working on trying to create a speech
friendly environment that can transform from the speech notation to
the code notation and back again and still remain functionally
identical. I have some ideas but I need some outside perspective from
people who know Python better than I do.
The core of the idea is an editor which can present code in two forms.
The first form is what you guys all know and love but is horrible to
speak. The second form is something that is easy to speak and, as I said
above, functionally identical to the code form. An ideal solution would
give me the ability to toggle back and forth between these two
representations. An experiment to play with would be displaying both
representations at the same time so you can see what you speak in near
real-time.
The speech environment lends itself to speaking the broad intent and
then answering questions to fill in the detail to create something
concrete. For example, in one of my prototypes (shown below), I state
that I want a class. Then I fill a detail like an initialization
function, inheriting from a parent, copying in all the arguments etc.
and I end up with a full class definition much more quickly than I could
even type it with good hands. This is what I meant above by 3 to 5 times
faster than hand generated code.
But with every experimental success, there is usually more than one
problem. In this case, the problem is that I lose all the
meta-information when I create the instance of the intent plus detail.
I can't go back to that abstract form.
The obvious answer is saving that meta-information in conjunction with
the code, but when working in a team environment, that information is
going to drive you handies up the wall because it's going to visually
overwhelm the actual code. Saving the meta-information separately will
mean it's even harder to recover a speech friendly version of the code
after it's been touched.
Another thought experiment has been with always generating syntactically
correct code and basing various code generation and navigation
constructs around that.
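One way to picture the "always syntactically correct" idea is a guard that
only lets an edit into the buffer if the result still parses. The sketch
below is an illustration under stated assumptions, not part of any existing
prototype: the function names are hypothetical, and it relies only on the
standard library ast module.

```python
import ast


def is_valid_python(candidate):
    """Return True if the text parses as Python source."""
    try:
        ast.parse(candidate)
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False
    return True


def accept_edit(current, candidate):
    """Keep the buffer parseable: take the candidate only if it parses."""
    return candidate if is_valid_python(candidate) else current


# A class header alone does not parse; a placeholder body makes it valid.
incomplete = "class simple_class(dict):\n"
complete = "class simple_class(dict):\n    pass\n"
buffer_text = accept_edit("", incomplete)          # rejected, buffer unchanged
buffer_text = accept_edit(buffer_text, complete)   # accepted
```

A generator built this way would have to emit placeholders such as pass
for the parts that have not been spoken yet.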
So the questions I have right now are:
What's a good open editor (preferably multiplatform) that actually
decomposes Python code into fundamental components such as class,
expression, etc., and lets you operate on those components? This is in
contrast to editors such as Emacs, which give you some fundamental pieces
you can operate on but are really character oriented, and all of the
syntax smartness is not really available for coupling to a speech
recognition environment. It would be great if it was in Python so I
don't have to learn yet another fricking language.
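To make "decomposes Python code into fundamental components" concrete, here
is a minimal sketch that uses the standard library ast module to list
classes, functions, and assignments together with the line range each one
covers. It is not an editor, only the kind of structure an editor could
expose to speech commands. The helper name list_components and the sample
source are hypothetical, and end_lineno and ast.unparse need Python 3.9 or
later.

```python
import ast

SOURCE = '''
class simple_class(dict):
    """A small class used only for this illustration."""
    def __init__(self, magic_dictionary, long_string):
        super().__init__()
        self.total = len(magic_dictionary) + len(long_string)
'''


def list_components(source):
    """Return (kind, name, first_line, last_line) for selected node types."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            kind, name = "class", node.name
        elif isinstance(node, ast.FunctionDef):
            kind, name = "function", node.name
        elif isinstance(node, ast.Assign):
            kind, name = "assignment", ast.unparse(node.targets[0])
        else:
            continue  # other node types are ignored in this sketch
        found.append((kind, name, node.lineno, node.end_lineno))
    return found


for component in list_components(SOURCE):
    print(component)
```

With line ranges attached to each component, a command like "select
function init" could be mapped onto a text span instead of a run of
characters.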
What would be the best way to store meta-information necessary to
re-create the speech friendly presentation of code? I don't know if this
is possible but I would like to be able to let handy programmers make
changes that will be propagated automatically into the speech friendly
code presentation without forcing them to learn the new notation.
An example of this is the definition of the class. In my world, a class
definition looks like this:
uses name:sta
uses init:yes
uses parent:dict
uses arg_list:magic dictonary, long sting, nuance sucks
uses super_arg_list:$arg_list
template class
Note: yes, I speak every single character or type it, but with a smart
editor there's a bunch of optimizations one can use in data entry given
the context. Also, since I wrote this example, I realized that the uses
statement is superfluous and I could just use template: <name> as the
trigger for creating the instance of the template.
Going from speech notation to code notation, I generate this:
class simple_class (super_nasty_class):
    def __init__(self, magic dictonary, long sting, nuance sucks):
        super(simple_class,self).__init__(magic dictonary, long sting,
                                          nuance sucks)
Note: there is a mix of, what I call, codenames and string names in
these examples. The togglename process would transform all string names
into codenames at some later point in the user experience.
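The expansion from speech notation to code notation can be sketched as
below. This is an illustration only, not the prototype described in this
message. It assumes that a codename is simply the spoken words joined by
underscores, that a value starting with $ refers to another field, and that
the name and parent fields are used as given (the real togglename mapping
is not reproduced here). All function names are hypothetical.

```python
SPEECH = """\
uses name:sta
uses init:yes
uses parent:dict
uses arg_list:magic dictonary, long sting, nuance sucks
uses super_arg_list:$arg_list
template class
"""


def parse_speech_form(text):
    """Collect 'uses key:value' lines and the template name."""
    fields, template = {}, None
    for line in text.strip().splitlines():
        line = line.strip()
        if line.startswith("uses "):
            key, _, value = line[len("uses "):].partition(":")
            fields[key.strip()] = value.strip()
        elif line.startswith("template "):
            template = line[len("template "):].strip()
    return template, fields


def to_codename(string_name):
    # Assumption: a codename is the spoken words joined by underscores.
    return "_".join(string_name.split())


def expand_args(fields, key):
    """Resolve a '$other_field' reference, then convert names to codenames."""
    value = fields.get(key, "")
    if value.startswith("$"):
        value = fields.get(value[1:], "")
    return [to_codename(a) for a in value.split(",") if a.strip()]


def render_class(fields):
    """Render the class template from the parsed fields."""
    name = fields["name"]
    lines = ["class {}({}):".format(name, fields.get("parent", "object"))]
    if fields.get("init") == "yes":
        args = ", ".join(["self"] + expand_args(fields, "arg_list"))
        super_args = ", ".join(expand_args(fields, "super_arg_list"))
        lines.append("    def __init__({}):".format(args))
        lines.append("        super({}, self).__init__({})".format(name, super_args))
    else:
        lines.append("    pass")
    return "\n".join(lines) + "\n"


template, fields = parse_speech_form(SPEECH)
code = render_class(fields)
print(code)
```

Because the fields dictionary is discarded once the code is rendered, this
direction alone shows the irreversibility problem: nothing in the output
records which template or which field values produced it.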
To elaborate on an earlier question, if someone put a doc string into
the class definition I would need to be able to recognize it and put it
back into the speech friendly form. Something like this:
class simple_class (super_nasty_class):
    """this is a real simple class to identify problems in the
    speech user interface
    """
    def __init__(self, magic dictonary, long sting, nuance sucks):
        super(simple_class,self).__init__(magic dictonary, long sting,
                                          nuance sucks)
When transformed back into speech friendly form, it should look like:
uses name:sta
uses init:yes
uses parent:dict
uses arg_list:magic dictonary, long sting, nuance sucks
uses doc_string:
this is a real simple class to identify problems in the speech user
interface
uses super_arg_list:$arg_list
template class
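One possible way to get back to the speech friendly form without storing
any extra meta-information is to derive it from the code itself. The sketch
below does that with the standard library ast module, so a doc string added
by a handy programmer shows up in the speech form automatically. It is an
illustration under assumptions: the code uses codenames throughout,
underscores map back to spaces, and the function names are hypothetical.
ast.unparse needs Python 3.9 or later.

```python
import ast

CODE = '''
class simple_class(super_nasty_class):
    """this is a real simple class to identify problems in the
    speech user interface
    """
    def __init__(self, magic_dictonary, long_sting, nuance_sucks):
        super(simple_class, self).__init__(magic_dictonary, long_sting,
                                           nuance_sucks)
'''


def super_call_args(init):
    """Return the argument names passed to the first ...__init__(...) call."""
    for node in ast.walk(init):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "__init__"):
            return [ast.unparse(arg) for arg in node.args]
    return None


def class_to_speech_form(source):
    """Rebuild the 'uses ...' notation for the first class in the source."""
    tree = ast.parse(source)
    cls = next(n for n in tree.body if isinstance(n, ast.ClassDef))
    init = next((n for n in cls.body
                 if isinstance(n, ast.FunctionDef) and n.name == "__init__"),
                None)
    lines = ["uses name:" + cls.name,
             "uses init:" + ("yes" if init else "no")]
    if cls.bases:
        lines.append("uses parent:" + ast.unparse(cls.bases[0]))
    code_args = [a.arg for a in init.args.args[1:]] if init else []  # skip self
    if init:
        spoken = [name.replace("_", " ") for name in code_args]
        lines.append("uses arg_list:" + ", ".join(spoken))
    doc = ast.get_docstring(cls)
    if doc:
        lines.extend(["uses doc_string:", doc])
    passed = super_call_args(init) if init else None
    if passed is not None:
        if passed == code_args:
            lines.append("uses super_arg_list:$arg_list")
        else:
            spoken = [name.replace("_", " ") for name in passed]
            lines.append("uses super_arg_list:" + ", ".join(spoken))
    lines.append("template class")
    return "\n".join(lines)


speech = class_to_speech_form(CODE)
print(speech)
```

This only recovers what is visible in the code. Any intent that does not
leave a trace in the code would still need meta-information stored
somewhere.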
Speech driven programming is a hard problem, so thoughts and ideas would
be welcome. Don't worry about giving me old ideas that have been looked
at and rejected, because you may have a take on it that I haven't seen
considered and it's worth trying.
Thank you for reading this far. I know it's a long message and on an
unfamiliar topic so I appreciate your attention.
--- eric