Gesture-Based Computing for the Desktop


With the heavy focus on mobile devices, who would have thought there was still opportunity for innovation in the desktop space? Intel believes there is, and recently released its Perceptual Computing SDK. When combined with consumer-friendly, infrared-equipped cameras by hardware partners, the SDK can be used to develop highly interactive applications. For example, by waving your hand or pinching your fingers in front of the camera, you can interact with visual objects rendered on the computer's display. In this article, I explore the SDK and the possibilities of using the technology in everyday computing.

Getting Started

Developers can start exploring the freely available SDK right away by visiting Intel's Perceptual Computing home page. But you won't be able to test the sample applications or code you write until you purchase the Creative Interactive Gesture Camera Development Kit for $149. At the moment, this is the only camera hardware available that works with these applications, though if perceptual computing on the PC takes off, there will likely be other manufacturers to choose from.

You’ll need to install either the 32-bit or 64-bit version of the SDK for Microsoft Windows 7 or higher, running on at least a Core-generation Intel processor. The SDK installation will add the camera drivers, which Windows will automatically detect and configure. For the camera to capture gestures properly, position it on top of a monitor and angle it so that it captures the same field of view a typical webcam would. Unlike the Microsoft Kinect, the Creative hardware captures distinct gestures best at roughly 6 to 12 inches from the camera. You can verify the distance by running the gesture_viewer.exe sample application that comes with the SDK.


Figure 1: Running the Gesture Viewer sample application.

Next, check out some of the free demos that Intel has posted for download on its Showcase Applications Web page to see some interesting uses of the technology in action. These samples range from games like SoftKinetic's Ballista and Fingertapp's Kung Pow Kevin to playful SoftKinetic graphic demos like Lightning and Solar System.


Figure 2: Create the Solar System with a show of hands.

These showcase applications give you an idea of what gesture-based computing can be used for, but they are just the tip of the iceberg. Much more awaits in the SDK samples, ranging from face tracking to text-to-speech and voice recognition, along with various gesture analysis examples.

Development

Before diving into the code itself, it's worthwhile to spend a few hours reading over the various SDK reference manuals and guides. These can be downloaded from Intel's website.

The primary Intel Perceptual Computing SDK manual is packed with more than 130 pages of documentation listing the primary interfaces and utility classes used to access the core framework. While the kit is predominantly aimed at Visual C++ developers, Intel has also included C# core framework access (via the libpxcclr.dll library wrapper) along with several examples showing C# developers how easy it is to connect to the hardware and interpret results.

The Perceptual Computing 2013 SDK Beta I used for this article still qualified as beta for a reason. I was reminded of this fact several times while developing my own perceptual computing-based applications. The first issue I encountered had to do with Visual Studio 2012. Due to the changes that Microsoft made in Visual C++, the SDK samples refused to compile. Fortunately, there was a simple fix. After opening the libpxcutils common library project in the SDK samples folder, and then recompiling the source code for that dependency, the samples were back up and ready to go. Intel provides the C++ sample code in Listing One to confirm that you're connected to the camera:

Listing One: Confirming the SDK is working.
#include "stdafx.h"
#include "pxcsession.h"
#include "pxcsmartptr.h"

int _tmain(int argc, _TCHAR* argv[]) {
    // Create the SDK session; stop early if that fails.
    PXCSmartPtr<PXCSession> session;
    if (PXCSession_Create(&session) < PXC_STATUS_NO_ERROR) return 1;

    // Enumerate installed modules until QueryImpl reports an error.
    for (int i = 0; ; i++) {
        PXCSession::ImplDesc desc;
        pxcStatus sts = session->QueryImpl(0, i, &desc);
        if (sts < PXC_STATUS_NO_ERROR) break;
        wprintf(L"Module: %s, iuid=0x%x\n", desc.friendlyName, desc.iuid);
    }
    return 0;
}

Running the application will also help to verify that the SDK and your library paths have been properly configured. It will also list all the Perceptual Computing SDK modules that have been installed and are available for SDK development.

A typical Perceptual Computing application involves a four-step process: session creation via PXCSession_Create, module creation via CreateImpl, then calling upon the module(s) created, and finally module resource release and termination of the program. Gesture recognition is based on loops that constantly poll the depth camera for changes to each frame that is captured.
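The four steps and the polling loop can be sketched without the camera attached. The block below is only an illustration of the pattern: MockSession, MockGestureModule, DepthFrame, and runLifecycle are hypothetical stand-ins invented for this sketch and are not SDK types, and the 150 to 300 mm window is an assumed approximation of the 6 to 12 inch range mentioned earlier. A real application would use PXCSession_Create and CreateImpl in place of the mock constructors.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one captured depth frame (not an SDK type).
struct DepthFrame {
    int nearestMm;  // assumed: distance of the nearest object, in millimeters
};

// Hypothetical stand-in for the session that owns the camera connection.
class MockSession {
public:
    explicit MockSession(std::vector<DepthFrame> frames)
        : frames_(std::move(frames)) {}

    // Hands out the next frame; returns false once the stream is exhausted.
    bool nextFrame(DepthFrame& out) {
        if (cursor_ >= frames_.size()) return false;
        out = frames_[cursor_++];
        return true;
    }

private:
    std::vector<DepthFrame> frames_;
    std::size_t cursor_ = 0;
};

// Hypothetical stand-in for a module created from the session.
class MockGestureModule {
public:
    explicit MockGestureModule(MockSession& session) : session_(session) {}

    // Polls every frame and records when a hand enters or leaves the window.
    std::vector<std::string> run(int nearMm, int farMm) {
        assert(nearMm <= farMm);
        std::vector<std::string> events;
        bool present = false;
        DepthFrame frame{};
        while (session_.nextFrame(frame)) {
            const bool now = frame.nearestMm >= nearMm && frame.nearestMm <= farMm;
            if (now != present) events.push_back(now ? "hand entered" : "hand left");
            present = now;
        }
        return events;
    }

private:
    MockSession& session_;
};

// Walks through the four steps in order.
std::vector<std::string> runLifecycle(const std::vector<DepthFrame>& frames) {
    auto session = std::make_unique<MockSession>(frames);         // 1. create session
    auto module = std::make_unique<MockGestureModule>(*session);  // 2. create module
    std::vector<std::string> events = module->run(150, 300);      // 3. call the module
    module.reset();                                               // 4. release the module,
    session.reset();                                              //    then the session
    return events;
}
```

Calling runLifecycle with a short list of DepthFrame values returns one event each time the nearest object crosses into or out of the window. The module is released before the session because it holds a reference to it.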


Figure 3: Running the UV Map helps to distinguish subject matter from background.

Abstract interfaces include PXCImage (and its 11 member functions) for images, PXCAudio (and its nine member functions) for audio capture, and PXCAccelerator for context to these storage types. You can also call upon the ReadStreamAsync and UtilPipeline interfaces to perform synchronized reads of incoming color and depth streams; UtilPipeline covers essential frame data, capture, face analysis, finger tracking, and voice-recognition functions.

Other modules include Face Detection, Analysis, and Recognition (PXCFaceAnalysis), Gesture Recognition (PXCGesture), Voice Recognition (PXCVoiceRecognition), and Voice Synthesis (PXCVoiceSynthesis) via a Dragon Speaking add-in. There is also support for third-party modules and game engines like Unity, Processing, and openFrameworks, as well as for creating your own Perceptual Computing modules.


Figure 4: Face recognition is another functional aspect available in the Perceptual Computing SDK.

