Vocal Joystick Uses Voice Input


Hannah Hickey is an engineering writer for the University of Washington.


The Internet offers wide appeal to people with disabilities. But many of those same people find it frustrating or impossible to use a handheld mouse. Software developed at the University of Washington provides an alternative using the oldest and most versatile mode of communication: the human voice.

"There are many people who have perfect use of their voice who don't have use of their hands and arms," said Jeffrey Bilmes, an associate professor of electrical engineering. "I think there are several reasons why Vocal Joystick might be a better approach, or at least a viable alternative, to brain-computer interfaces."

Vocal Joystick samples sound 100 times a second and instantaneously turns that sound into movement on the screen. Vowel sounds dictate the direction: "ah," "ee," "aw," "oo" and other vowels each move the cursor in one of eight directions. Users can transition smoothly from one vowel to another, and louder sounds make the cursor move faster. The sounds "k" and "ch" simulate clicking and releasing the mouse buttons.
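The mapping from sound to motion can be illustrated with a short Python sketch. It is not the University of Washington code. The article does not say which vowel points in which direction, so the vowel-to-angle table, the function name cursor_step, the 0-to-1 loudness scale and the max_speed parameter are all assumptions made for illustration. Only the 100-frames-per-second rate and the "louder is faster" rule come from the description above, and only the four vowels the article names are included.

```python
import math

FRAMES_PER_SECOND = 100  # from the article: sound is sampled 100 times a second

# ASSUMPTION: hypothetical vowel-to-angle table (degrees, counterclockwise
# from "right"). The article names these vowels but not their directions.
VOWEL_ANGLE_DEG = {
    "ah": 0.0,
    "aw": 90.0,
    "ee": 180.0,
    "oo": 270.0,
}


def cursor_step(vowel, loudness, max_speed=400.0):
    """Return the (dx, dy) cursor movement in pixels for one audio frame.

    vowel     -- label of the detected vowel, e.g. "ah"
    loudness  -- normalized loudness, clamped to the range 0..1 (assumed scale)
    max_speed -- cursor speed in pixels per second at full loudness (assumed)
    """
    if vowel not in VOWEL_ANGLE_DEG:
        return (0.0, 0.0)  # unrecognized sound: no movement
    loudness = min(max(loudness, 0.0), 1.0)
    # Louder sounds move the cursor faster; divide by the frame rate to get
    # the distance covered in a single frame.
    step = max_speed * loudness / FRAMES_PER_SECOND
    angle = math.radians(VOWEL_ANGLE_DEG[vowel])
    # Screen y grows downward, so "up" is a negative dy.
    return (step * math.cos(angle), -step * math.sin(angle))
```

Calling cursor_step once per frame and adding the result to the cursor position gives continuous motion. A discrete command such as "move right" would instead have to be recognized as a whole word before anything moved.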

Versions of Vocal Joystick exist for browsing the Web, drawing on a screen, controlling a cursor and playing a video game. A version also exists for operating a robotic arm, and Bilmes believes the technology could be used to control an electronic wheelchair.

Existing substitutes for the handheld mouse include eye trackers, sip-and-puff devices and head-tracking systems. Each technology has drawbacks. Eye-tracking devices are expensive and require that the eye simultaneously take in information and control the cursor, which can cause confusion. Sip-and-puff joysticks held in the mouth must be spit out if the user wants to speak, and can be tiring. Head-tracking devices require neck movement and expensive hardware.

Vocal Joystick requires only a microphone, a computer with a standard sound card and a user who can produce vocal sounds.

"A lot of people ask: 'Why don't you just use speech recognition?'" Bilmes said. "It would be very slow to move a cursor using discrete commands like 'move right' or 'go faster.' The voice, however, is able to do continuous commands quickly and easily." Early tests suggest that an experienced user of Vocal Joystick would have as much control as someone using a handheld device.

In the laboratory, University of Washington doctoral student Jonathan Malkin, who helped develop the tool, uses Vocal Joystick to play a game called Fish Tale. It takes two minutes to train the program for Malkin's voice. He then moves the fish character easily around the screen, raising his voice slightly to speed up and avoid being eaten by a predator fish.

The newest development uses Vocal Joystick to control a robotic arm. The pitch of the tone moves the arm up and down; other commands are unchanged. This is the first time that vocal commands have been used to control a three-dimensional object, Bilmes said.
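One way to picture the added pitch axis is the sketch below. The article says only that pitch moves the arm up and down. Everything else here is an assumption for illustration: that pitch controls vertical velocity rather than position, that it is measured in semitones from a neutral pitch, and that a small dead zone holds the arm still. The function name and all parameter values are hypothetical.

```python
import math


def arm_vertical_velocity(pitch_hz, neutral_hz=150.0, gain=0.01,
                          dead_zone_semitones=1.0):
    """Map a voice pitch to a vertical arm velocity (arbitrary units/second).

    ASSUMPTIONS (not from the article): pitch above neutral_hz raises the
    arm, pitch below it lowers the arm, and pitches within the dead zone
    leave the arm where it is.
    """
    if pitch_hz <= 0.0:
        return 0.0  # no voiced sound detected
    # Distance from the neutral pitch in semitones (12 per octave).
    semitones = 12.0 * math.log2(pitch_hz / neutral_hz)
    if abs(semitones) < dead_zone_semitones:
        return 0.0
    return gain * semitones
```

Because the other commands are unchanged, a value like this would simply be a third component alongside the two the vowels already control.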

One initial concern, he said, was whether people would feel self-conscious using the tool.

"But once you try it you immediately forget what you're saying," Bilmes said. "I usually go to the New York Times' Web site to test the system and then I get distracted and start reading the news. I forget that I'm using it."

To test the device, the group has been working with about eight spinal-cord injury patients at the UW Medical Center since March.

"It's a really exciting idea. I think it has tremendous potential," said Kurt Johnson, a professor of rehabilitation medicine who is helping with the tests.

Bilmes said he hopes people will become more adept at using the system over time. Future research will incorporate more advanced controls that use more aspects of the human voice, such as repeated vocalizations, vibrato, degree of nasality and trills.

"While people use their voices to communicate with just words and phrases," Bilmes said, "the human voice is an incredibly flexible instrument, and can do so much more."

