[slightly off-topic] programmable speech recognition software

Tim Peters tim_one at email.msn.com
Fri Oct 22 01:51:21 EDT 1999


[Stefan Franke]
> I'm looking for the appropriate SR software to build a part of a room
> installation

What is "a room installation"?

>. The task differs somewhat from the usual "dictate into word"
> applications:

Bad sign <0.1 wink>.

> In the recorded audio stream should be recognized as many words as
> possible - speaker independent and in German! (I can hear you shudder)

Speaker-independent is hard.  From a recording is hard.  If people aren't
speaking into a high-quality microphone, it's very hard.  If it's not
intentional speech (i.e., if people don't know they're talking to a
computer -- they're just chatting), it's very hard.  The good news is that
German is easy (compared to the rest of this).

> In unclear cases - there will probably be lots of them - the closest
> interpretations would be OK.

I doubt you can do this yourself and get acceptable results with any
available SR software.  It's an extremely difficult problem.  Taking
software designed for close-talking microphone, speaker-dependent, "dictate
into word" applications, and trying to apply it to a radically harder task,
is like hoping (say) Perl can be used to get real work done <wink>.  The
closest Dragon Systems gets is described at:

    http://www.dragonsys.com/products/audiomining/index.html

Note that there's no product mentioned there -- Dragon has appropriate
technology, and that's what the page talks about.  This is so bleeding edge
there are no pre-packaged applications for sale.  Dragon would be happy to
build one for you, though, in return for an insignificant percentage of
Germany's gross national product <wink>.

> Therefore completeness and correctness don't have the highest priority
> in this case.
>
> The application is supposed to run on a Win 9x machine. Direct
> accessibility from Python would be super, otherwise possible via
> COM/SAPI from M$, admittedly I don't know anything about its
> capableness.
> I read the postings about NatLink in c.l.p,

Dragon's NaturallySpeaking includes a SAPI server accessed via COM, and a
Python binding to that is the core of what Joel Gould's (NatSpeak's original
architect, BTW!) NatLink offers.  AFAIK, it is the only Python interface
available to any SR software:

    http://www.synapseadaptive.com/joel/natlink.htm

> but in absence of a demo version of NatSpeak I cannot tell anything about
> its suitability, so I would be pleased to hear some opinions of this
> group's numerous SR experts!

Sorry not to be more encouraging, Stefan -- you may be able to tell that I
don't work for the marketing dept <wink>.  Seriously, NatSpeak works very
well for its intended purposes, but your purposes are far from those in
several crucial respects.  If you want to play with SR on MS systems to "get
a feel for it", visit the little-known:

    http://www.microsoft.com/IIT/

You can download a free SR engine and development kit (note that this is for
hard-core Windows developers; there's nothing of use to end users there).

if-machines-could-understand-speech-god-wouldn't-have-felt-compelled-
    to-create-humans-ly y'rs  - tim


PS:  I don't speak for Dragon Systems.  If they thought you thought I did,
they'd probably die laughing.





