Thursday, August 30, 2007

Specifying Gestures By Example

by Dean Rubine


Summary

The paper proposes a system for creating a gesture recognizer automatically from example gestures. The system is incorporated into GRANDMA (Gesture Recognizers Automated in a Novel Direct Manipulation Architecture), which allows the recognized gestures to be used in a direct manipulation environment. The paper describes the interactions needed to control a drawing application, GDP. Of note is that GDP is not sketch recognition: the user makes a gesture to establish a shape's starting point and then finishes creating the shape by dragging (rubberbanding). The system is single-stroke only; this limitation avoids the segmentation problem, and a single stroke corresponds to a single tensing and relaxing of the muscles. The author notes that 15 examples per gesture class is adequate. The author also describes how a gesture handler is created for each gesture, showing off the object-oriented nature of the project.

Strokes are recorded as sequences of x and y positions along with timestamps, and a set of 13 features is calculated for each stroke. The features are chosen so that a small change in the gesture corresponds to a small change in the feature values. Each class's sample estimate is the average of the feature vectors of its training examples. Performance gains plateau at around 50 training examples, and the recognition rate drops as the number of possible classes increases, though not below 90%. Two extensions to the system are proposed: eager recognition, in which the system classifies a gesture while it is still being made, and multi-finger recognition.
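
To make the feature step concrete, here is a rough Python sketch of how a few of the thirteen features might be computed from a recorded stroke. The general feature ideas (initial angle, bounding-box diagonal, total length, total turning angle, duration) follow the paper, but the function name, the reduced feature set, and the exact smoothing choices here are my own illustration, not Rubine's code:

    import math

    def stroke_features(points):
        # points: list of (x, y, t) samples from a single stroke;
        # assumes at least three samples. Returns a short feature
        # vector; the real recognizer uses 13 features.
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        ts = [p[2] for p in points]

        # Cosine and sine of the stroke's initial angle.
        dx, dy = xs[2] - xs[0], ys[2] - ys[0]
        d = math.hypot(dx, dy) or 1.0
        f_cos, f_sin = dx / d, dy / d

        # Length of the bounding-box diagonal.
        f_diag = math.hypot(max(xs) - min(xs), max(ys) - min(ys))

        # Total stroke length: sum of the segment lengths.
        f_len = sum(math.hypot(xs[i + 1] - xs[i], ys[i + 1] - ys[i])
                    for i in range(len(points) - 1))

        # Total signed angle traversed along the stroke.
        f_turn = 0.0
        for i in range(1, len(points) - 1):
            ax, ay = xs[i] - xs[i - 1], ys[i] - ys[i - 1]
            bx, by = xs[i + 1] - xs[i], ys[i + 1] - ys[i]
            f_turn += math.atan2(ax * by - ay * bx, ax * bx + ay * by)

        # Duration of the stroke.
        f_time = ts[-1] - ts[0]

        return [f_cos, f_sin, f_diag, f_len, f_turn, f_time]

Note how each of these varies smoothly as the stroke is perturbed, which is exactly the property the paper asks of its features.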

Discussion

The paper makes mention of multi-finger recognition, which surprised me a little. Such devices have become popular recently, with Jeff Han's TED presentation and the popularity of the iPhone, but this paper was written in 1991. Sixteen years is quite a gap for an idea incubated in research to begin appearing in consumer products.

The algorithm is quite nice in that it doesn't limit what features can be added, and the feature values are weighted per class. As I read this paper I kept thinking about Palm's Graffiti language and the similarity in design. It did seem odd that the delete 'x' gesture uses the gesture's starting point, rather than whatever object the gesture overlaps, to determine what to delete.
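
On the weighting: the recognizer is a linear classifier. Each gesture class c gets a weight per feature, learned from the per-class feature means (and, in the paper, a common covariance matrix), and the class maximizing v_c = w_c0 + sum_i w_ci * f_i wins. A minimal sketch of just the evaluation step; the training that derives the weights is omitted, and the weights structure here is my own illustration:

    def classify(fvec, weights):
        # weights maps class name -> (w0, [w1, ..., wn]).
        # Returns the class whose linear evaluation v is largest.
        best_cls, best_v = None, float("-inf")
        for cls, (w0, ws) in weights.items():
            v = w0 + sum(w * f for w, f in zip(ws, fvec))
            if v > best_v:
                best_cls, best_v = cls, v
        return best_cls

    # e.g. classify(stroke_features(points), trained_weights)

Because evaluation is just a dot product per class, recognition is cheap no matter how many features are added, which fits the paper's emphasis on speed.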

Citation

Rubine, D. 1991. Specifying gestures by example. In Proceedings of the 18th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '91). ACM Press, New York, NY, 329-337. DOI= http://doi.acm.org/10.1145/122718.122753

3 comments:

Unknown said...

Hi, Brian.

Found your blog through Google - interesting stuff. It'd be useful if you included the actual journal references to the papers you talk about, especially for academics and others who happen across your blog and find the subject interesting and worth pursuing.

June

Paul Taele said...

Hey, Brian. Can I join your fanclub, too? :P

I found the multi-finger recognition one of the most interesting parts of the paper, but the time gap from the paper to its usage in recent devices didn't click for me at the time. Now I find that aspect of the paper even more interesting.

Thought of in isolation, sixteen years from idea to realization would seem like a long time. But I learned an interesting tidbit from one of my profs at my undergrad university: on average, ideas in computer science tend not to see fruition in practical use for several decades, and a few major ones we take for granted even took half a century. I guess one could say that converting an idea into a real-world application in sixteen years would be considered fast in the CS world. :D

Paul said...

Using the start point to identify the object did seem odd to me as well. The best explanation I can think of is that they wanted a method to delete a single object without needing much computation to identify which object the user meant. Much of the algorithm seems designed to minimize computation time, even at the cost of some accuracy.