Gesture Recognition Moves Beyond Gaming
Test and Monitor | Posted May 22, 2011

Whether by hacking or SDK, new gesture-based hardware devices open up interface vistas for software developers. Here are some ways that you could incorporate gesture technology in your own applications. Or just fantasize about it, perhaps.

Gesture recognition holds the promise of new frontiers in user interface that will change the face of coding forever, in much the same way that touch technology did a few years ago. And while touch-based computing has been around for decades, it wasn’t until the advent of the smartphone and “smart devices” like tablets that the technology moved out of niche markets and into the mainstream.

Gesture is looking as though its moment in the sun has finally arrived, too. This natural way to interface (device-less) with our digitally connected world through gesture recognition systems offers the promise of delivering “sky’s the limit” interface functionality to programmers who have a creative touch. “Dance a jig—and I’ll crunch those polygons...” your next-gen computer may say.

While gesture-based technology, like touch, has been around for decades (anyone own a Gyration mouse?), it wasn’t until the advent of another groundbreaking device, the Microsoft (MS) Kinect, that mainstream consumer electronics took notice.

As with all must-have new devices, one of the first things Kinect did was start a hacker revolution. The input hardware was originally intended for Xbox only, and at first, the company’s stodgy management could see only this as the primary application platform. To Microsoft’s credit, the company quickly reversed its NIH (not invented here) thinking and published a (non-commercial) software development kit (SDK), eliminating the need to “hack” a solution (at least for Windows-based systems).

Mind you, there is still a healthy Kinect hacking community, with hundreds of applications already developed for multiple platforms, including one that supports rival Sony’s PS2 game console. This will continue until Microsoft’s commercial SDK ships later this year and offers support for other (non-Windows-based) platforms.

But the world has shifted for the once game-centric Kinect, and its Interactive Entertainment Business (IEB) home at Microsoft. The company is unleashing the power of its 800-member strong research wing known as MSR (short for Microsoft Research) to embrace application developers in academia and the like, with the goal, “…to foster creativity, experimentation, and new research directions,” according to the company’s chief researcher, Craig Mundie. He believes, “The resulting creativity and invention will open up a whole new world of possibilities for computing.”

Numbers also tell the story of gesture computing’s sudden rise, based on the Microsoft innovation that became an instant hit. So much so, that in the first 50 days, Microsoft’s Kinect (as an Xbox accessory, mind you) sold more than 8 million devices, and grabbed the coveted “fastest selling gadget” title, still warm from the freshly baked “iPad” oven over at Apple.

Jobs and company had just a few months to bask in the glory of finally toppling the DVD, which held the title of fastest selling gadget for a full 14 years. For some old-timers, losing the crown to arch-rival Microsoft made the sting that much more acute.

But numbers don’t lie, and recent Microsoft sales updates now put the Kinect north of 10 million devices sold into the channel as we move past Q1-11 for what some are calling the interface device innovation of the decade.

So What Can I Do With It, Already?

For software developers looking to move into gesture beyond gaming, the low-hanging fruit is in smartphone and tablet applications. Software developers are creating gesture-driven handheld controllers out of Android phones, using sensor hooks in their software (for gyroscopes, accelerometers, and magnetometers) to translate motion into remote commands. The Android Gingerbread API, for example, supports up to 9-axis motion sensing (three axes each from the accelerometer, gyroscope, and magnetometer), and the OS’s hardware abstraction layer (HAL) simplifies the process for developers.
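To make the idea of translating motion into remote commands concrete, here is a minimal sketch in plain Python. It does not use the Android sensor API. It assumes that gyroscope readings have already been collected as (x, y, z) angular rates in radians per second, and the command names, axis mapping, and threshold are all hypothetical choices made for illustration.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical gyroscope reading: angular rate around x, y, z in rad/s.
Sample = Tuple[float, float, float]

# Hypothetical mapping from (axis index, rotation sign) to a remote command.
COMMANDS = {
    (0, 1): "volume_up",
    (0, -1): "volume_down",
    (1, 1): "next_channel",
    (1, -1): "previous_channel",
}


def classify_flick(samples: Iterable[Sample], threshold: float = 3.0) -> Optional[str]:
    """Map the strongest rotation in a window of samples to a command.

    Returns None when no rotation reaches the threshold, or when the
    dominant axis and direction have no command assigned.
    """
    best_axis, best_value = 0, 0.0
    for sample in samples:
        for axis, value in enumerate(sample):
            if abs(value) > abs(best_value):
                best_axis, best_value = axis, value

    if abs(best_value) < threshold:
        return None  # too gentle to count as a deliberate flick

    sign = 1 if best_value > 0 else -1
    return COMMANDS.get((best_axis, sign))


if __name__ == "__main__":
    window = [(0.1, 0.2, 0.0), (0.3, 4.5, 0.1), (0.0, 1.0, 0.0)]
    print(classify_flick(window))
```

A real application would likely filter the signal and fuse several sensors before classifying, but the shape of the problem stays the same: reduce a stream of readings to a small set of discrete commands.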

[Figure: Ford UK sponsored an engaging gesture-driven digital kiosk]

Gesture recognition is making its way into other devices as well. At the 2011 Consumer Electronics Show, several natural user interface (NUI) systems were shown, including products from CE giants like Samsung, LG, and Panasonic. For example, at the Las Vegas confab in January, LG introduced its Magic Motion, a remote control used in the “Infinia” line of 3D/LCD TVs. The technology includes a MEMS gyroscope and GUI software (albeit with a Wii-like hand controller) enabling gestures, using technology developed by InvenSense. Chinese TV makers TCL and Hisense also offered gesture-based remote technology at the show.

But it’s the hands-free, device-free approach, based on advanced image sensor technology like that found in Kinect, that seems to be gaining the most traction these days. Advantages include no-touch (non-contact) sensing, operation in low light or even darkness (using infrared), gesture and shape recognition, and even facial recognition to help boost security and identity verification.

The Digital Out of Home (DOOH) market has a strong chance of becoming the “killer app” marketplace for gesture recognition. Think of it as science-fiction-like interaction (Minority Report style) empowered by tiny hidden 3D cameras mapping (non-touch) human gestures and sophisticated image processing. That data is translated into active commands for any public digital display, even one behind a shop window. For example, Weinheim, Germany-based Online Software AG uses Microsoft Kinect technology in a product called POSlife that enables customers to “communicate with their bodies.” Digital signage systems, controlled by its Kinect-enabled PRESTIGEenterprise, respond directly to customer body language.

“For the first time displays triggered by the Microsoft Kinect technology-based POSlife solution present information customers really want to see,” says the company website. For example, if a customer points to a specific product or a certain item, detailed information is displayed. According to the group, “The interactive sales counter finally becomes reality.”
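The point-at-a-product behavior described here can be sketched as a simple hit test with a dwell timer. This is not how POSlife itself works, which the source does not describe. It is an illustrative sketch that assumes some upstream tracker already reports where the customer is pointing as normalized (x, y) display coordinates, one pair per video frame. The `ProductRegion` type, the function names, and the 15-frame dwell time are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class ProductRegion:
    """Hypothetical rectangle on the display, in normalized 0..1 coordinates."""
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def product_at(regions: List[ProductRegion], x: float, y: float) -> Optional[str]:
    """Return the name of the first region containing the point, if any."""
    for region in regions:
        if region.contains(x, y):
            return region.name
    return None


def select_by_dwell(
    regions: List[ProductRegion],
    pointer_frames: Iterable[Tuple[float, float]],
    min_frames: int = 15,
) -> Optional[str]:
    """Select a product once the pointer rests on it for min_frames frames in a row."""
    current, streak = None, 0
    for x, y in pointer_frames:
        hit = product_at(regions, x, y)
        if hit is not None and hit == current:
            streak += 1
        else:
            current = hit
            streak = 1 if hit is not None else 0
        if current is not None and streak >= min_frames:
            return current
    return None
```

Requiring the pointer to rest on a region for several consecutive frames is one common way to keep a passing hand wave from triggering a product page by accident.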

Gesture Beyond Kinect

But Microsoft is finding it is not alone in this gesture world, particularly in gesture-driven DOOH marketing. In the UK, London-based Ogilvy & Mather partnered with a digital production company, Grand Visual, to create an ad-based kiosk for malls that features the seven-seat Ford Grand C-Max car.

This system uses hand and arm gestures to navigate around the info page, select car colors, experience features like the auto hatch, and then take the car for a virtual test drive. The technology even allows people to handle miniaturized 3-D virtual models of the cars in the palms of their hands via giant interactive screens placed in shopping malls. Mark Simpson of Ford said, “Using live interactive campaigns is a great way to really engage with the audience in a way that is not possible with static posters [and] enabled us to create a targeted and tactical campaign that is relevant and fun to use.”

Other features of the system include virtual buttons that let the user fully explore the car options: changing car colors, peeking inside open doors, checking out the roomy interior by folding the seats down flat, rotating the car, and watching demos of key features.

As much as this sounds like the Kinect, Andy Dibb, associate creative partner with Ogilvy, said the gesture-driven technology is based on a Panasonic D-Imager camera that accurately measures users’ spatial depth in real time, along with special augmented reality algorithms that merge “…this real-life footage with the 3-D photo-real Grand C-MAX on screen.”

Another gesture-based application from Germany takes “Interactive Window Shopping” to a whole new dimension. Developed by the Heinrich Hertz Institute (HHI), part of the Fraunhofer Institute, it was recently shown at CeBIT. It’s a new type of 3D camera system using four (tiny) cameras that identify and then record the 3D positions of the hands, faces, and eyes of target subjects. Two cameras record the face and eyes; the other two record hand motion.

The group, headed by Fraunhofer scientist Paul Chojecki, created algorithms that process the images, calculating the coordinates and transforming them into the corresponding inputs. Image processing recognizes gestures, such as turning the hand or pointing at an icon shown on the monitor. To lessen privacy concerns, “The system doesn’t store any personal data and only the coordinates of the body parts it recognizes are passed onto the visualization,” Chojecki said.
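One plausible way to turn tracked 3D coordinates into an on-screen input is to cast a ray from the eye through the hand and see where it hits the display. The sketch below illustrates that geometry only. It is not the Fraunhofer algorithm, which the source does not detail. It assumes positions in meters in a hypothetical coordinate frame where the screen lies in the plane z = 0, the origin is the screen's top-left corner, x runs right, y runs down, and the viewer stands at z > 0. The function and parameter names are invented for this example.

```python
from typing import Optional, Tuple

Point3D = Tuple[float, float, float]


def pointing_target(
    eye: Point3D,
    hand: Point3D,
    screen_width_m: float,
    screen_height_m: float,
    res_x: int,
    res_y: int,
) -> Optional[Tuple[int, int]]:
    """Intersect the eye-to-hand ray with the screen plane z = 0.

    Returns the pixel the viewer is pointing at, or None when the hand is
    not between the eye and the screen or the ray misses the display.
    """
    ex, ey, ez = eye
    hx, hy, hz = hand

    dz = hz - ez
    if dz >= 0:
        return None  # ray does not travel toward the screen

    # Solve ez + t * dz = 0 for the ray parameter t.
    t = -ez / dz
    x = ex + t * (hx - ex)
    y = ey + t * (hy - ey)

    if not (0.0 <= x <= screen_width_m and 0.0 <= y <= screen_height_m):
        return None  # pointing outside the display area

    # Convert meters on the screen surface to pixel indices.
    px = min(int(x / screen_width_m * res_x), res_x - 1)
    py = min(int(y / screen_height_m * res_y), res_y - 1)
    return px, py
```

Note that only coordinates flow through this function. No image data is needed once the body parts are located, which fits the privacy approach Chojecki describes.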

The interactive shop window is quite sophisticated:

    • It identifies how many people are in front of the shop window.

    • Based on the gathered data, it suggests products and information to the target group.

    • It offers customized greeting texts on the display to help generate a “bond” with the customer.

The 3D recording system is still in a prototype development phase at Fraunhofer, but the team is close: a complete online system was demonstrated at CeBIT.

So could it be that we are in a transition from forced, even contrived, human/machine input to a more natural way to interface with digital technology? This could help explain why the iPad was so hugely popular. Apple’s iPad stood on the shoulders of its App Store, launching after some 1 billion downloads and the release of over 300,000 apps for the iPhone and iPod touch. Devices that enabled a unique human/machine interface with embedded sensors took the mobile gaming space by storm. This ecosystem created an unprecedented foundation for the iPad. And the rest was device sales history, broken only by another UI-centric device, the Kinect.

Perhaps the fact that we saw the crown change not once, but twice in 2010 (the second time from Apple to Microsoft) underscores the point. As in 1996, that pivotal year for the DVD, when the world was undergoing the transition from analog to digital, we are in another wave of transition. This time we’re empowered by new applications of sensor technology and the processing muscle to make sense of true user intent, the whole purpose of these devices in the first place.

So all aboard. The world is beginning to look to a new gesture paradigm, empowered by a low-cost, innovative way to finally interact with the digitally connected world in a fun, intuitive, ubiquitous, and what only looks to be an insane way.

What’s this mean for the apps you design and build? If you were given the freedom to use gesture-based technology in the software you create, what would you do with it? Tell us about it in the comments.

