New Gestural Computing System


On December 19 at Siggraph Asia, researchers from the Massachusetts Institute of Technology will present a technique for turning new touch-screen displays into giant lensless cameras. They've also built a prototype that demonstrates one application of such displays: letting users manipulate on-screen images with hand gestures.

Many other researchers have been working on such gestural interfaces, which would, for example, allow computer users to drag windows around a screen simply by pointing at them and moving their fingers, or to rotate a virtual object through three dimensions with a flick of the wrist. But existing systems "usually involve having a roomful of expensive cameras or wearing tracking tags on your fingers," says Matthew Hirsch, a PhD candidate at the MIT Media Lab who, along with Media Lab professors Ramesh Raskar and Henry Holtzman and visiting researcher Douglas Lanman, developed the new display. Some experimental systems instead use small cameras embedded in a display to capture gestural information. But because the cameras are offset from the center of the screen, they don't work well at short distances, and they can't provide a seamless transition from gestural to touch-screen interactions. Cameras set far enough behind the screen can provide that transition, as they do in Microsoft's SecondLight, but they add to the display's thickness and require costly hardware to render the screen alternately transparent and opaque. "The goal with this is to be able to incorporate the gestural display into a thin LCD device" — like a cell phone — "and to be able to do it without wearing gloves or anything like that," Hirsch says.

The Media Lab system requires an array of liquid crystals, as in an ordinary LCD, with an array of optical sensors right behind it. The liquid crystals serve, in a sense, as a lens, displaying a black-and-white pattern that lets light through to the sensors. But that pattern alternates so rapidly with whatever the LCD is otherwise displaying — the list of apps on a smart phone, for instance, or the virtual world of a video game — that the viewer never notices it.
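A minimal sketch of that time-multiplexing idea, in Python, appears below. The frame loop, ratio, and function names (show_frame, read_sensors) are placeholders assumed for illustration, not an actual display-driver API.

    import numpy as np

    UI_FRAMES_PER_CAPTURE = 3   # assumed ratio of ordinary display frames to mask frames

    def show_frame(pixels):
        # Placeholder for pushing a frame to the LCD panel.
        pass

    def read_sensors():
        # Placeholder for reading the optical-sensor array behind the LCD.
        return np.zeros((480, 640))

    def run(ui_frames, mask_pattern):
        captures = []
        for i, ui in enumerate(ui_frames):
            show_frame(ui)                         # normal display content
            if i % UI_FRAMES_PER_CAPTURE == 0:
                show_frame(mask_pattern)           # briefly swap in the imaging mask
                captures.append(read_sensors())    # sample the sensors during that frame
        return captures

Because the mask frames are interleaved well above the flicker-fusion rate, the viewer perceives only the ordinary display content.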

The simplest way to explain how the system works, Lanman says, is to imagine that, instead of an LCD, an array of pinholes is placed in front of the sensors. Light passing through each pinhole will strike a small block of sensors. Since each pinhole image is taken from a slightly different position, all the images together provide a good deal of depth information about whatever lies before the screen.
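To make that geometry concrete, here is a minimal sketch of how two neighboring pinhole sub-images would yield a distance estimate. The spacing, sensor gap, and pixel pitch are assumed values, and the bare-bones cross-correlation disparity search is an illustration, not the researchers' algorithm.

    import numpy as np

    PINHOLE_SPACING_MM = 5.0   # assumed distance between adjacent pinholes
    GAP_MM = 2.5               # assumed gap between pinhole plane and sensor plane

    def disparity(sub_a, sub_b):
        # Estimate the horizontal shift (in pixels) between two sub-images
        # from the peak of their circular cross-correlation.
        a = sub_a - sub_a.mean()
        b = sub_b - sub_b.mean()
        corr = np.fft.ifft2(np.fft.fft2(a) * np.conj(np.fft.fft2(b))).real
        shift = np.unravel_index(np.argmax(corr), corr.shape)[1]
        if shift > a.shape[1] // 2:   # undo FFT wrap-around
            shift -= a.shape[1]
        return abs(shift)

    def depth_mm(sub_a, sub_b, pixel_pitch_mm=0.1):
        # Triangulate distance: a nearby object shifts more between sub-images
        # than a distant one, exactly as in two-camera stereo.
        d = disparity(sub_a, sub_b) * pixel_pitch_mm
        return np.inf if d == 0 else PINHOLE_SPACING_MM * GAP_MM / d

The farther an object is from the screen, the smaller the shift between sub-images, which is what lets the array recover depth rather than just a flat silhouette.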

The problem with pinholes, Lanman explains, is that they allow very little light to reach the sensors, requiring exposure times too long to be practical. Instead, the LCD displays a pattern in which each 19-by-19 block of pixels is subdivided into a regular arrangement of black-and-white rectangles of different sizes. Since there is as much white area as black, each block passes far more light than a pinhole would.
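A quick back-of-the-envelope comparison shows the scale of the gain. The checkerboard below is only a stand-in for the actual mask design, used to illustrate the light-throughput argument.

    import numpy as np

    BLOCK = 19

    pinhole = np.zeros((BLOCK, BLOCK))
    pinhole[BLOCK // 2, BLOCK // 2] = 1                  # a single open pixel per block

    coded = np.indices((BLOCK, BLOCK)).sum(axis=0) % 2   # half-open checkerboard stand-in

    pinhole_open = pinhole.mean()                        # roughly 0.3% of the light
    coded_open = coded.mean()                            # roughly 50% of the light
    print(f"pinhole passes {pinhole_open:.1%} of the light")
    print(f"coded mask passes {coded_open:.1%}, about {coded_open / pinhole_open:.0f}x more")

With one open pixel in 361 versus about 180, the coded block collects on the order of 180 times more light, which is what makes short exposures practical.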

The 19-by-19 blocks are all adjacent to each other, so the images they pass to the sensors overlap in a confusing jumble. But the pattern of black-and-white squares allows the system to computationally disentangle the images, capturing the same depth information that a pinhole array would, but capturing it much more quickly.
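The disentangling step can be pictured as inverting a known blur: if the sensor image is modeled as the scene convolved with the mask pattern, dividing that pattern out in the Fourier domain recovers the scene. The sketch below shows a 2-D Wiener-style inversion under that simplified model; the real system recovers depth information as well, so this illustrates the principle rather than the actual decoding pipeline.

    import numpy as np

    def decode(sensor_image, mask, noise_reg=1e-3):
        # Recover an estimate of the scene from the jumbled sensor image,
        # assuming the sensor image is (approximately) scene * mask (convolution).
        S = np.fft.fft2(sensor_image)
        M = np.fft.fft2(mask, s=sensor_image.shape)       # the mask's optical footprint
        wiener = np.conj(M) / (np.abs(M) ** 2 + noise_reg)  # regularized inverse filter
        return np.fft.ifft2(S * wiener).real

The regularization term keeps the inversion stable where the mask's frequency response is weak, which is exactly why a broadband half-open pattern decodes more cleanly than a pinhole-like one.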

LCDs with built-in optical sensors are so new that the Media Lab researchers haven't been able to procure any yet, but they mocked up a display in the lab to test their approach. Like some existing touch-screen systems, the mockup uses a camera some distance from the screen to record the images that pass through the blocks of black-and-white squares. But it provides a way to determine whether the algorithms that control the system would work in a real-world setting. In experiments, the researchers showed that they could manipulate on-screen objects using hand gestures and move seamlessly between gestural control and ordinary touch-screen interactions.
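As a rough illustration of how a single depth map could serve both input modes, the sketch below classifies pixels as touches or in-air gestures by thresholding their estimated distance. The thresholds and the synthetic data are assumptions for illustration, not values from the prototype.

    import numpy as np

    TOUCH_MM = 5.0       # assumed: anything closer than this counts as contact
    GESTURE_MM = 500.0   # assumed: hands farther away than this are ignored

    def classify(depth_map_mm):
        # Split a per-pixel depth map into touch points and hover (gesture) regions.
        touch = depth_map_mm < TOUCH_MM
        hover = (depth_map_mm >= TOUCH_MM) & (depth_map_mm < GESTURE_MM)
        return touch, hover

    # Example: a synthetic depth map with one fingertip on the glass and one hovering.
    depths = np.full((8, 8), 1000.0)
    depths[2, 2] = 2.0     # touching the screen
    depths[5, 5] = 120.0   # hovering above it
    touch_mask, hover_mask = classify(depths)
    print("touch pixels:", np.argwhere(touch_mask).tolist())
    print("hover pixels:", np.argwhere(hover_mask).tolist())

Because both kinds of input come from the same depth estimate, a finger can slide continuously from an in-air gesture down to a touch without the system switching sensors.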

