Monday, September 3, 2007

Visual Similarity of Pen Gestures

by

A. Chris Long, Jr., James A. Landay, Lawrence A. Rowe, and Joseph Michiels

Summary

This paper is a precursor to the previously discussed paper. It goes into more detail about the experiments run to determine why users find gestures similar. The researchers hope to roll these findings into a tool (quill) capable of notifying designers of gestures that are likely to be perceived as similar by users, that will be difficult for users to learn and remember, and that may go unrecognized by the computer. Similarity matters because it can affect how easily users learn and recall gestures.

A brief overview of pen-based interfaces is given. The authors point out that the handwriting recognition on the Newton was initially criticized (it recognized Cyrillic well, by the way). Also noted is the power of gestures: they can identify the operator and the operand simultaneously. The paper continues with an overview of perceptual similarity, focusing on the psychology literature and on multidimensional scaling (MDS).
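As a rough illustration of the MDS idea (my own sketch with made-up numbers, not the authors' data or analysis): given a matrix of pairwise dissimilarity judgments, MDS recovers a low-dimensional layout whose inter-point distances approximate those judgments. With scikit-learn, that looks roughly like:

```python
import numpy as np
from sklearn.manifold import MDS

# Hypothetical pairwise dissimilarity judgments for four gestures
# (symmetric, zero diagonal), e.g., averaged over participants.
dissim = np.array([
    [0.0, 2.0, 5.0, 4.0],
    [2.0, 0.0, 6.0, 5.0],
    [5.0, 6.0, 0.0, 1.5],
    [4.0, 5.0, 1.5, 0.0],
])

# Embed the gestures in 2D so that distances between points
# approximate the judged dissimilarities.
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dissim)
print(coords)
```

The recovered dimensions are then inspected to see which geometric properties (length, angle, etc.) they correspond to.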

The researchers ran two separate experiments. The goals of the first experiment were to determine which measurable geometric properties influence similarity and to produce a model that could predict how similar two gestures would look to a user. The similarity of two gestures is modeled by the Euclidean distance between their feature vectors. Long et al. propose additional predictors beyond Rubine's features, of which they list only 11. The paper notes that the participants clumped into two separate groups, but gives no information about what distinguishes them. The second study focused on how varying different types of features affected perceived similarity.
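To make the distance measure concrete, here is a minimal sketch (the feature values are invented, not taken from the paper) of scoring two gestures by the Euclidean distance between their feature vectors; a smaller distance would predict higher perceived similarity:

```python
import math

def feature_distance(f1, f2):
    """Euclidean distance between two gesture feature vectors.

    Smaller distances predict that users will perceive the two
    gestures as more similar.
    """
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))

# Hypothetical Rubine-style feature vectors for two gestures.
gesture_a = [0.71, 0.71, 120.0, 0.79, 95.0]
gesture_b = [0.50, 0.87, 130.0, 0.61, 88.0]

print(feature_distance(gesture_a, gesture_b))
```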

Discussion

Why don't they list features 12 and 13 of Rubine? Those are the features that take time data into consideration, and none of their additional features consider time. Also, Long et al. note that users seem to fall into two separate groups, yet they give no additional information about those groups (i.e., the criteria for placement in said groups). I wonder how additional physical metrics of input (pressure of the pen, tilt of the pen) could be used to create additional Rubine-style features?
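For reference, here is a rough sketch (mine, not the paper's) of what those two omitted time-based features look like, assuming input points arrive as timestamped (x, y, t) tuples: f12 is the maximum squared speed over the stroke, and f13 is the total duration.

```python
def time_features(points):
    """Compute Rubine's two time-based features from a stroke.

    `points` is a list of (x, y, t) tuples, with t in seconds.
    Returns (f12, f13):
      f12: maximum squared speed between consecutive samples
      f13: total duration of the gesture
    """
    max_speed_sq = 0.0
    for (x0, y0, t0), (x1, y1, t1) in zip(points, points[1:]):
        dt = t1 - t0
        if dt > 0:
            speed_sq = ((x1 - x0) ** 2 + (y1 - y0) ** 2) / dt ** 2
            max_speed_sq = max(max_speed_sq, speed_sq)
    duration = points[-1][2] - points[0][2]
    return max_speed_sq, duration
```

Both clearly depend on how fast a particular user draws, which may be exactly why they were dropped from a model of purely visual similarity.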

Citation

Long, A. C., Landay, J. A., Rowe, L. A., and Michiels, J. 2000. Visual similarity of pen gestures. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (The Hague, The Netherlands, April 01 - 06, 2000). CHI '00. ACM Press, New York, NY, 360-367. DOI= http://doi.acm.org/10.1145/332040.332458

3 comments:

D said...

I think additional measures used to create features might be nice. However, I think the tilt/pressure of the pen, etc., are very user-dependent. For example, if I'm a lefty and you're a righty, our pens will be tilted differently. Maybe I'm a big strong dude that presses the pen hard, and someone else is a tiny girl writing a letter to grandma with Graffiti. I think this is part of the reason the timing data was removed from the Rubine features: how quickly you can jot a sketch should not affect its classification, since this data is completely user-dependent and not gesture-dependent.

Still, it would have been nice for them to give us a rationale for their feature choices.

Grandmaster Mash said...

The reason I'd remove time data is that drawing speeds can vary so much from person to person. If you're trying to look at gesture similarity, you'd want the features themselves to be relatively standard for each gesture. The users in their study also did not draw the gestures themselves (an animation was shown to the user instead). So maybe Long and Landay were just interested in visual similarity.

Paul Taele said...

Well, once again I'm late to such discussions on the matter, as Josh and Aaron already said what had to be said. But anyway...

Even though I wasn't too surprised by the reasoning that Long & friends had about the lesser significance of time data, I was a little curious why it was eliminated completely. As brought up by Brandon and Dr. Hammond, time data does have its uses in corner finding (I think). I figured that this metric would be used to some lesser extent instead of being done away with completely.

But it would seem tricky trying to somehow implement this metric to a lesser extent in the first place, or to at least exploit it in a different fashion. That would probably be more of an art than a science, though.