Tuesday, September 11, 2007

Sketch Based Interfaces: Early Processing for Sketch Understanding

by

Tevfik Metin Sezgin, Thomas Stahovich, Randall Davis

Summary

The approach taken in this paper is a shift from the past few we have read (Rubine, Long). Sezgin et al. believe that a sketch-based system should respond to how an object looks, not how it was drawn. The goal of these sketch systems is to allow people to work with computational tools in the same manner they work with people (messy, informal), but without losing the nice computational gains. The authors state that the first step toward this goal is being able to convert from pixels to geometric objects. This early processing must be able to find corners and fit lines and curves.

Their early-processing approach has three phases: approximation, beautification, and basic recognition. Stroke approximation begins with vertex detection, and their scheme uses both stroke direction and pen speed data to accomplish it. Extrema in each signal are compared against an average-based threshold to filter out noise. The speed component works on the principle that pen speed drops when going around a corner; on certain drawings, though, speed alone will miss corners, so the intersection of the vertex sets found by the two methods is used. The paper goes into further detail on how a hybrid fit is calculated and how curves are handled. Beautification, as described in the paper, consists of minor adjustments to the lines to ensure the drawing reflects the user's intent. An example of intent is parallelism: parallel lines are difficult to draw freehand, but the intent is not hard to recognize, and thus correct the strokes for. Recognition is accomplished with hand-tailored templates that examine geometric properties.

A user study was performed comparing this system to drawing with XFig. Participants were almost unanimous in selecting the sketching system as the preferred choice. Once again it is pointed out that this system places no limitations on how the user wishes to draw an object: it is for the computer to understand the user, not for the user to sketch in a way the computer understands.

Discussion

"Just drink the Kool-Aid already." - Brian David Eoff Approx. 17 Minutes Ago

Does this idea that interactions between a user and a computer could be "messy" seem bizarre? Interaction with a computer seems to be about removing ambiguities, and if there was a problem it was caused by the user, right? Maybe it doesn't have to be rigid. Maybe we can have the best of both worlds: all the good capabilities computation provides, coupled with natural human interaction in all its messy, imperfect, yet nuanced glory. Really think about this: the GUI was huge compared to the command line, but that was ~30 years ago. Aren't we due for something different? Even if it isn't always better (and for some things I still go to the command line), shouldn't there be something else? The next. If not, what have we done with all this computational power and connectivity we've been chasing?

Well now onto the paper.
The paper makes an interesting observation: if users draw slowly, they are most likely being precise, and the system should respect that precision when recognizing the drawing. The authors also make note of their handling of overtracing, a natural movement that happens when one draws, which gesture recognition could not possibly handle. This is a movement toward looking at what the user drew, and using how they drew it to clear up ambiguities, rather than depending solely on the physical act of drawing the object. Now that tablets can record pen tilt, could that be used along with speed and curvature to discover corners? The same with pressure. Were these values not available when this system was created, or were they deemed unnecessary?

Citation

Sezgin, T. M., Stahovich, T., and Davis, R. 2001. Sketch based interfaces: early processing for sketch understanding. In Proceedings of the 2001 Workshop on Perceptive User Interfaces (Orlando, Florida, November 15-16, 2001). PUI '01, vol. 15. ACM Press, New York, NY, 1-8. DOI= http://doi.acm.org/10.1145/971478.971487

2 comments:

rg said...

I really liked the work they mentioned about overtracing. I wish there was more to it.

Paul Taele said...

I really liked how the paper addressed that observation you noted in handling overtracing. I figured that case would be handled in image recognition instead of sketch recognition. Furthermore, I didn't think too much of the usefulness of handling overtracing cases at first, because I thought it was more trouble for the recognizer than it was worth. After thinking more about it, I realized I tended to overtrace when I want to make an image more defined (and thus less ambiguous) myself. It is indeed a useful situation for the system to handle. I do wonder, though, how the system would distinguish a user overtracing from a user intentionally drawing several similar shapes in that vicinity. I don't believe that was addressed in the paper.