Help Rewriting/Refactoring/Rethinking Parsing Algorithm
Mike C. Fletcher
mcfletch at home.com
Sun Mar 18 23:32:02 EST 2001
Not precisely what you're trying to do, but I do most of my programming with
voice dictation (Dragon NS 5), and I find that a few command phrases are
fairly useful and let me program in just about any editor:
py-equal (pie-equal) -> ==
py-init -> __init__
py-main -> __main__
py-name -> __name__ # would be redefined for you...
py-def -> def
dublex (dooblex) -> wx\Caps Next Word\No-Space After
triple-quote -> \No-Space ''' \No-Space
py-len -> len
Looking at your list, adding:
py-caps -> \No-Space-On\Caps-On
py-norm -> \No-Space-Off\Caps-Off
py-name -> <word> \No-Space-On\Caps-On [not sure if you can do this without a macro...]
Would give you:
py-name number customers py-norm
py-caps Abstract Class py-norm
if error type py-equal 5 then
dublex-python equalsign 6
py-name whiskey type py-norm equalsign 'peachy'
print "what's wrong with %s?" % py-name first name py-norm
for the indicated phrases. I find that the py-phrases give a decent
recognition rate. With vocedit (for Dragon) you can set up the options for
the words easily (e.g. for dublex-, and making _ put no spaces around
itself).
I like the idea of adding "py-name" modes to the mix (it would save lots of
mucking around with \Cap \No-space). Will give it a try. The dublex- thing
saves lots of headaches for me; I had a similar one for Fox when I was using
that library.
As for travelling the parsing-path:
I think Aycock's parsing framework (SPARK) would handle this kind of work
fairly easily. It has an extremely flexible algorithm (which is apparently
fairly slow, but should hold up for interactive work, I'd think).
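Whatever parser ends up on top, the first stage is usually a scanner that
labels each piece of the dictated line. The sketch below is only an
illustration using the standard re module, not SPARK's API. The names TOKEN
and tokenize, and the three token kinds, are assumptions made for this
example. It also assumes quoted strings contain no escaped quotes.

```python
import re

# Hypothetical token pattern: the group names are the token kinds.
TOKEN = re.compile(r"""
    (?P<string>"[^"]*"|'[^']*')    # quoted string, no escape handling
  | (?P<word>[A-Za-z][\w\-]*)      # dictated word, e.g. whiskey, x-ray
  | (?P<other>\S+)                 # anything else: ==, %, 5
""", re.VERBOSE)


def tokenize(line):
    """Return a list of (kind, text) pairs for one dictated line."""
    return [(m.lastgroup, m.group()) for m in TOKEN.finditer(line)]
```

Because a quoted string comes back as a single 'string' token, later stages
can pass it through untouched and apply the word-joining rules only to runs
of 'word' tokens.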
Good luck, will be interested to see what you finally build,
Mike
-----Original Message-----
From: Boopy Bootles [mailto:aschneid at mindspring.com]
Sent: Sunday, March 18, 2001 8:57 PM
To: python-list at python.org
Subject: Help Rewriting/Refactoring/Rethinking Parsing Algorythm
I'm trying to write a simple piece of code to make programming by voice
recognition software easier to do. I wrote a very simple function that
would, for example, convert "number customers equals 5" to
"numberCustomers = 5". But once I started using it, I quickly
discovered several more cases I had to handle. So far, this is the
list:
number customers -> numberCustomers
Abstract Class -> AbstractClass
if error type == 5 then -> if errorType == 5:
whiskey x-ray python equals 6 -> wxPython = 6
normal whiskey type = 'peachy' -> whiskeyType = 'peachy'
print "what's wrong with %s?" % first name -> print "what's wrong with %s?" % firstName
(the whiskey x-ray stuff is the International Communications Alphabet,
which you use when spelling something out in Dragon NaturallySpeaking if
NaturallySpeaking is having trouble understanding you).
I've gotten all but the last case to work. But once I started trying to
incorporate the last case--not messing up quoted strings--my already
overly messy code turned into a hideous snarl. I know there's _got_ to
be a better way to parse this input, but I don't have a clue where to
start. I'd like to avoid building a full-blown language parser, which
seems like overkill.
I've included the code below, which correctly translates all but the
last case. Any thoughts would be greatly appreciated, esp. thoughts re:
a simple object-oriented approach (I'm pretty sure there is one, I just
don't have enough experience writing OO code to figure it out).
Thanks,
Anders Schneiderman
P.S. Once I've gotten this code in better shape, I'll post it
somewhere. Even at this primitive stage, it makes a _huge_ difference
in writing code by voice (I use NaturallySpeaking plus NatLink, Joel
Gould's terrific system for writing NaturallySpeaking macros using
Python).
---------------------------------------------------------------------------
""" voicecode.py: routines for translating and otherwise manipulating
voice input into code.
"""
SpecialWords = {'equals': ' = ', '=': ' = ', '==': ' == ', '%': ' % ',
                'if': 'if ', 'then': ':', 'elsif': 'elif', 'dot': '.', '.': '.',
                'open': ' = open(',
                'try': 'try:', 'init': '__init__ (self, ', 'define': 'def ',
                'except': 'except:',
                'finally': 'finally:', 'tab': ' ', 'blank': ' '}

ICA = {'alpha': 1, 'bravo': 1, 'charlie': 1, 'delta': 1, 'echo': 1,
       'foxtrot': 1, 'golf': 1, 'hotel': 1, 'india': 1, 'juliet': 1,
       'kilo': 1, 'lima': 1, 'mike': 1, 'november': 1, 'oscar': 1,
       'papa': 1, 'quebec': 1, 'romeo': 1, 'sierra': 1, 'tango': 1,
       'uniform': 1, 'whiskey': 1, 'x-ray': 1, 'xray': 1, 'yankee': 1,
       'zulu': 1}

quote = {"'": 1, '"': 1}


def translate(words):
    """Given an array of words, translate into code.

    The rules are:
    * In general, convert lists of words into wordWordWord
    * For certain words/symbols, aka "special words", convert 'em
    * For ICA words--alpha, bravo, etc.--convert to a single letter
    * If the word is "normal", treat the next word as an ordinary word
      (no special-word or ICA lookup)

    NOTE: Right now, this will NOT work on lines that are quoted:
    print 'this is a test' will translate to 'thisIsATest'.
    I need to find a cleaner way to solve this problem.
    """
    line = ''
    firstWord = 1
    normalWord = 0
    for word in words:
        if normalWord:
            # Previous word was 'normal': skip the special-word and ICA lookups
            if firstWord:
                # Don't capitalize the first word of a variable
                line = line + word
                firstWord = 0
            else:
                line = line + word.capitalize()
            normalWord = 0
        elif word.lower() == 'normal':
            normalWord = 1  # next word bypasses the lookups below
        elif word in SpecialWords:
            line = line + SpecialWords[word]
            firstWord = 1  # a special word ends the current variable name
        elif word.lower() in ICA:
            # International Alphabet -- convert 'alpha' to 'a', etc.
            line = line + word[0]
            firstWord = 0
        elif firstWord:
            # Don't capitalize the first word of a variable
            line = line + word
            firstWord = 0
        else:
            line = line + word.capitalize()
    return line
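
One possible way to handle the quoted-string case without a full language
parser is to split the line on quoted strings first, and run the word rules
only on the text between them. The sketch below is one reading of that idea,
not a tested replacement for the code above. The class name Translator, its
two constructor tables, and the extra 'print' entry in the usage note are
assumptions made for the example. It takes the whole line as a string, and it
assumes quoted strings contain no escaped quotes.

```python
import re

# A single- or double-quoted string; escapes are not handled (assumption).
QUOTED = re.compile(r"""("[^"]*"|'[^']*')""")


class Translator:
    """Hypothetical sketch: join dictated words into camelCase names."""

    def __init__(self, special_words, spelling_words):
        self.special_words = special_words    # e.g. {'equals': ' = '}
        self.spelling_words = spelling_words  # e.g. {'whiskey', 'x-ray'}

    def translate(self, line):
        """Translate one line, copying quoted strings through verbatim."""
        out = []
        # re.split with a capturing group puts the quoted strings at odd indexes.
        for i, part in enumerate(QUOTED.split(line)):
            if i % 2 == 1:
                out.append(part)
            else:
                out.append(self._translate_words(part.split()))
        return ''.join(out)

    def _translate_words(self, words):
        pieces = []
        first_word = True   # the next plain word starts a new name
        literal = False     # the previous word was 'normal'
        for word in words:
            if literal:
                literal = False
                pieces.append(word if first_word else word.capitalize())
                first_word = False
            elif word.lower() == 'normal':
                literal = True
            elif word in self.special_words:
                pieces.append(self.special_words[word])
                first_word = True
            elif word.lower() in self.spelling_words:
                pieces.append(word[0])
                first_word = False
            elif first_word:
                pieces.append(word)
                first_word = False
            else:
                pieces.append(word.capitalize())
        return ''.join(pieces)
```

With a table such as {'print': 'print ', '%': ' % ', '=': ' = '} and the
spelling words {'whiskey', 'x-ray'}, calling translate on
print "what's wrong with %s?" % first name gives
print "what's wrong with %s?" % firstName. An apostrophe outside a quoted
string would still confuse the split, so that case needs separate handling.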
--
http://mail.python.org/mailman/listinfo/python-list