Voice: It's The New UI


Processing Voice User Commands

To process user voice commands, use the speech-recognition services in the System.Speech.dll assembly. Importing the System.Speech.Recognition namespace gives you access to the classes, types, and enumerations offered by the speech-recognition wrapper:


using System.Speech.Recognition;

Speech-recognition engines are complex and accept dozens of parameters. This example keeps things simple by recognizing a limited list of user commands. First, create a SpeechRecognitionEngine instance as a private field, so that the methods and event handlers that interact with it can all reach it:


private SpeechRecognitionEngine _recognitionEngine = new SpeechRecognitionEngine();

You can then define a list of alternative items to make up an element in a grammar. The list is a Choices instance, which you pass to a new GrammarBuilder; the builder is then used to create a Grammar that is loaded into the engine. The following code defines five possible voice commands:


string[] voiceCommands = new string[]
{
    "Favorite news",
    "Favorite movies",
    "Weather forecast",
    "New blog entry", 
    "New word document" 
};
var comChoices = new Choices(voiceCommands);
var comGrammarBuilder = new GrammarBuilder(comChoices);
var comGrammar = new Grammar(comGrammarBuilder);
_recognitionEngine.LoadGrammar(comGrammar);

There are many other ways to create grammars that can accept more complex commands. In fact, you can also work with XML elements defined under the Speech Recognition Grammar Specification (SRGS).
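
As one illustration of a slightly more complex grammar, the sketch below appends two Choices instances to a single GrammarBuilder, so the engine accepts a verb followed by a target (for example, "Open news"). The class name, the method name, the grammar name, and the phrases are invented for this example; only the System.Speech types come from the assembly.

```csharp
using System.Speech.Recognition;

// Hypothetical helper, not part of the original sample.
public static class CommandGrammarSketch
{
    public static Grammar BuildVerbTargetGrammar()
    {
        // First element: the action the user wants to perform.
        var verbs = new Choices("Open", "Close");

        // Second element: what the action applies to.
        var targets = new Choices("news", "movies", "weather");

        // Elements appended in order must be spoken in that order,
        // so this grammar accepts phrases such as "Open news".
        var builder = new GrammarBuilder();
        builder.Append(verbs);
        builder.Append(targets);

        // The name is optional; it only helps identify the grammar later.
        return new Grammar(builder) { Name = "VerbTarget" };
    }
}
```

The resulting grammar can be loaded the same way as the simple one, with _recognitionEngine.LoadGrammar(CommandGrammarSketch.BuildVerbTargetGrammar()).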

Next, attach handlers to the following events, which the recognition engine fires:

  • SpeechDetected. The user started talking.
  • SpeechRecognized. The recognition engine recognized one of the voice commands. You can inspect the result in a handler attached to this event.
  • SpeechRecognitionRejected. The user began talking, but the recognition engine could not match the speech to a voice command.
  • RecognizeCompleted. The recognition engine finished its asynchronous operation. A handler attached to this event runs after the handler for either SpeechRecognized or SpeechRecognitionRejected.


_recognitionEngine.SpeechDetected += new EventHandler<SpeechDetectedEventArgs>(recognitionEngine_SpeechDetected);
_recognitionEngine.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(recognitionEngine_SpeechRecognized);
_recognitionEngine.RecognizeCompleted += new EventHandler<RecognizeCompletedEventArgs>(recognitionEngine_RecognizeCompleted);
_recognitionEngine.SpeechRecognitionRejected += new EventHandler<SpeechRecognitionRejectedEventArgs>(recognitionEngine_SpeechRecognitionRejected);
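
None of these events fires until the engine has an audio source and a recognition request is pending. The sketch below shows one way to do that, assuming the machine has a working default microphone. The VoiceListener class and its Start method are hypothetical names used only for illustration.

```csharp
using System.Speech.Recognition;

// Hypothetical wrapper class; the names here are illustrative only.
public sealed class VoiceListener
{
    private readonly SpeechRecognitionEngine _recognitionEngine =
        new SpeechRecognitionEngine();

    public void Start(Grammar grammar)
    {
        // At least one grammar must be loaded before recognition starts.
        _recognitionEngine.LoadGrammar(grammar);

        // Assumption: a default audio input device is available.
        _recognitionEngine.SetInputToDefaultAudioDevice();

        // Request a single asynchronous recognition operation.
        // RecognizeCompleted is raised when it ends.
        _recognitionEngine.RecognizeAsync(RecognizeMode.Single);
    }
}
```

Passing RecognizeMode.Multiple instead keeps the engine listening after each result, which is an alternative to restarting recognition from the RecognizeCompleted handler.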

The following sample code defines the four event handlers. In the SpeechRecognized handler, the e.Result.Text property contains the recognized phrase, which matches one of the commands loaded earlier, so you can branch on it to execute the corresponding action.


private void recognitionEngine_RecognizeCompleted(object sender, RecognizeCompletedEventArgs e)
{
    // Start another recognition
    _recognitionEngine.RecognizeAsync();
}

private void recognitionEngine_SpeechRecognitionRejected(object sender, SpeechRecognitionRejectedEventArgs e)
{
    // Tell the user that the command was not understood
}

private void recognitionEngine_SpeechDetected(object sender, SpeechDetectedEventArgs e)
{
    // Do something when the user's speech is detected
}

private void recognitionEngine_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
    if (e.Result.Text == "Favorite news")
    {
        // Start the application or go to the Web Site with the favorite news
    }
    // Check the other possible commands
}
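
As the command list grows, a chain of if statements becomes hard to maintain. One possible alternative, sketched below, maps each phrase to an action in a dictionary and ignores results whose confidence score is low. The CommandDispatcher class, the 0.6 threshold, and the console actions are assumptions made for this example, not part of the original sample.

```csharp
using System;
using System.Collections.Generic;
using System.Speech.Recognition;

// Hypothetical dispatcher; the threshold and the actions are placeholders.
public sealed class CommandDispatcher
{
    // Assumed cut-off; tune it for your microphone and environment.
    private const float MinimumConfidence = 0.6f;

    // Maps each recognized phrase to the action it should trigger.
    private readonly Dictionary<string, Action> _actions =
        new Dictionary<string, Action>(StringComparer.OrdinalIgnoreCase)
        {
            { "Favorite news", () => Console.WriteLine("Opening favorite news") },
            { "Weather forecast", () => Console.WriteLine("Opening weather forecast") }
        };

    public void OnSpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        // Skip results the engine itself is unsure about.
        if (e.Result.Confidence < MinimumConfidence)
        {
            return;
        }

        // Run the action registered for the phrase, if there is one.
        Action action;
        if (_actions.TryGetValue(e.Result.Text, out action))
        {
            action();
        }
    }
}
```

To use it, create an instance and subscribe its method to the engine: _recognitionEngine.SpeechRecognized += dispatcher.OnSpeechRecognized;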

It is easy to add simple speech-recognition capabilities to an existing C# application. The recognition engine offers dozens of additional options for customizing it and improving its efficiency, but a simple example is the right place to start.

Conclusion

Speech may be a natural evolution from keyboards and touch screens, but the APIs required to work with speech-related services are complex. Windows 7 gives developers the ability to create speech-aware apps through performance and accuracy improvements to the speech-recognition engine. Once you begin working with speech-aware apps, you'll find great opportunities to take advantage of this natural user interface.

