Tuesday, October 23, 2007

Naturally Conveyed Explanations of Device Behavior

by

Michael Oltmans and Randall Davis

Summary

The paper combines speech recognition and sketch recognition in a system called ASSISTANCE. The system recognizes 2-D kinematic diagrams from the visual information in the sketch together with the context the user provides by speaking about what should happen in it. Sketch recognition is in constant pursuit of supplementing the process that designers use in their early stages: quick low-level sketches accompanied by spoken discussion. The designer should be able to explain the intended behavior by drawing additional context (arrows to indicate direction), pointing, and giving spoken feedback. All of this information is used to build a model of the device, which the user can then ask the system questions about. The paper tackles how to handle overlapping descriptions and infer their meanings. ASSISTANCE takes three inputs: the structural model of the sketch generated by ASSIST, the phrases recognized by the speech recognizer (ViaVoice), and the sketched arrows. ASSISTANCE must be able to unify multiple description instances, resolve deictic references, and understand causal links.

Discussion

I like this paper because it combines multiple natural methods of interaction to accomplish something that would often require a much more restrictive technique. It is a good starting point. I am unsure how the system distinguishes pointing from drawing. Also, does one simply draw, switch modes, and then begin to explain (using the pointer to make references)? Or can the user give verbal input throughout the whole experience? What happens if a mistake has been made? Can contradictory statements be made, with only the later statement used?

Citation

Oltmans, M. and Davis, R. 2001. Naturally conveyed explanations of device behavior. In Proceedings of the 2001 Workshop on Perceptive User Interfaces (Orlando, Florida, November 15 - 16, 2001). PUI '01, vol. 15. ACM Press, New York, NY, 1-8. DOI= http://doi.acm.org/10.1145/971478.971498

Wednesday, October 17, 2007

Ambiguous Intentions: a Paper-like Interface for Creative Design

by

Mark D. Gross and Ellen Yi-Luen Do

Summary

"While most computer interfaces are text-based, we wonder how far we can push pen based, freehand additions."

Gross and Do begin by stating that most designers, when working in the early design stages, prefer pen and paper to a computer. It makes sense; pen and paper provide more fluidity than menu-based computer applications. Computers are definitive and precise. The authors propose that a pen-based computer application could instead embrace ambiguity and imprecision, and be created in such a way as to encourage incremental formalization of ideas. I had often viewed ambiguity and imprecision as things to be avoided, but the authors point out that ambiguity allows alternatives to be considered and that imprecision allows decisions to be postponed.

Gross and Do developed the Electronic Cocktail Napkin (they used Macintosh Common Lisp; my heart is aflutter). The program was supposed to function like a normal drawing application, but also to allow retrieval, simulation, design critiquing, and collaborative work. ECN can recognize "configurations", which are user-defined patterns with beautification rules. Recognition of these configurations depends on the context in which the user is drawing. The system allows for alternative recognitions; it is capable of waiting for more information to clarify which one applies. Contexts in the system are chained: the system checks whether the glyph is recognized by the first context and then works its way down. Glyphs and configurations are recognized with the context in mind, and the system is able to identify which context the author is creating in.
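The chained-context lookup is easy to picture in code. Here is a rough sketch of the idea under my own assumptions; the function names, contexts, glyph signatures, and labels are invented for illustration and are not ECN's actual representation:

```python
# Illustrative sketch of chained-context recognition: try the most
# specific context first, fall back through the chain, and if nothing
# matches, defer judgment rather than force a label.

def recognize(glyph, context_chain):
    """context_chain: dicts mapping glyph signatures to labels,
    ordered from most specific context to most general."""
    for context in context_chain:
        if glyph in context:
            return context[glyph]
    return None  # ambiguous: wait for more information before committing

# The same zigzag stroke reads differently depending on the active context.
circuit = {"zigzag": "resistor"}
floorplan = {"zigzag": "stairs"}
general = {"box": "rectangle"}
```

The `None` fallback mirrors the system's willingness to postpone recognition until more information arrives.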

A variety of user studies were performed, focusing on how architecture students would use such a system. Gross and Do wanted ECN to be a general-purpose drawing program that end users could make more specific.

Discussion

The retrieval component seemed similar to Brandon's MARQS application. The whole "general purpose that users make more specific" idea is something that LADDER accomplishes with domain descriptions. Architecture is such a drawing-based field; I would find it more interesting if they pursued a discipline that didn't rely so heavily on drawing and showed how it could work more efficiently through pen-based input. All of these types of applications depend heavily on the concept of diagrams. They don't want people to actually sketch what they want; they want them to sketch simple shapes that have a metaphorical connection to what the author is visualizing.

Citation

Gross, M. D. and Do, E. Y. 1996. Ambiguous intentions: a paper-like interface for creative design. In Proceedings of the 9th Annual ACM Symposium on User Interface Software and Technology (Seattle, Washington, United States, November 06 - 08, 1996). UIST '96. ACM Press, New York, NY, 183-192. DOI= http://doi.acm.org/10.1145/237091.237119

Monday, October 15, 2007

Graphical input through machine recognition of sketches

by

Christopher F. Herot

Summary

1976 was the year of the United States bicentennial. It was also the year that Apple Computer was formed. So why are we reading a paper from 1976? Our readings so far have led us to believe - wrongly - that sketch recognition went from Sutherland's SketchPad straight to Rubine's gesture recognizer. This paper shatters that belief.

The paper opens by noting that the research was motivated by the "desire to involve the computer in the early stages of the design process, where the feedback generated by the machine can be most useful." If this sounds familiar, it is because every paper on sketch recognition uses a very similar motivation to frame its work. Herot states that the machine observes the subject's sketches as they are being created, and that this information could be used to make inferences about the user's attitude toward the sketch. He also asks whether a machine could make useful interpretations of sketches without knowing the domain in which they are made.

Herot also notes - which I wrongly assumed Sezgin discovered - that stroke speed drops at the corners. Herot even plots this on a graph, showing speed minima at the corners. He also notes that the system's recognition is influenced by the drawing styles of those who created it: during demonstrations the system would work well for some users, but not for others. The paper also discusses latching - the connecting of near-miss edges - and overtracing. It states that context should be used at the lowest levels of recognition, and that the program must be tuned to the user.
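The speed-minima observation is simple enough to sketch in code. This is a minimal version under my own assumptions - the `(x, y, t)` sample format, the threshold ratio, and the function names are mine, not Herot's:

```python
# Flag corners as local speed minima that dip well below the stroke's
# mean speed (a toy version of the speed-based corner observation).

def speeds(points):
    """Pen speed between consecutive (x, y, t) samples."""
    result = []
    for (x0, y0, t0), (x1, y1, t1) in zip(points, points[1:]):
        dt = (t1 - t0) or 1e-6  # guard against duplicate timestamps
        result.append(((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5 / dt)
    return result

def corner_indices(points, threshold_ratio=0.25):
    """Indices of samples where speed is a local minimum and falls
    below a fraction of the stroke's mean speed."""
    v = speeds(points)
    mean_v = sum(v) / len(v)
    return [i + 1 for i in range(1, len(v) - 1)
            if v[i] < v[i - 1] and v[i] <= v[i + 1]
            and v[i] < threshold_ratio * mean_v]
```

On an L-shaped stroke sampled quickly along the straight runs and slowly at the bend, the slow samples at the bend are the ones reported.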

Herot states that the user should not be removed from the recognition process, noting that a promising approach involves "the user to make decisions of which the machine is not capable, but still affording the unobtrusive input method of sketching."

Discussion

I didn't realize that Negroponte was involved in sketch recognition until I looked through the works cited section. It is strange to see the same language used to describe the problem in 1976 still in use in 2007. The "bentness" metric and the use of speed seem very similar to the metrics that Sezgin used. More work still needs to be done on latching and overtracing.

Citation

Herot, C. F. 1976. Graphical input through machine recognition of sketches. In Proceedings of the 3rd Annual Conference on Computer Graphics and Interactive Techniques (Philadelphia, Pennsylvania, July 14 - 16, 1976). SIGGRAPH '76. ACM Press, New York, NY, 97-102. DOI= http://doi.acm.org/10.1145/563274.563294

Sunday, October 7, 2007

Perceptually Based Learning of Shape Descriptions for Sketch Recognition

by

Olya Veselova and Randall Davis

Summary

A huge chunk of this paper can be summarized by one quote: "people pay unequal attention to different features". The authors' goal is a system that can learn shape descriptions from a single example, capturing only the relevant features that users would care about. The paper makes heavy use of Goldmeier's work on human perception. Goldmeier identified properties called singularities: features in which small variations make a qualitative difference. Goldmeier's singularities include verticality, symmetry, parallelism, horizontality, and straightness. The paper ranks the importance of these different constraints, which yields a score. The score can be adjusted by obstruction, tension lines, and grouping. For example, if two lines are near each other, their being parallel is an important constraint; if they are far apart with a number of primitives between them, that constraint matters much less.
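The ranked-and-adjusted scoring idea can be sketched in a few lines. The base weights, discount factors, and cutoff below are invented for illustration; they are not the paper's actual values:

```python
# Toy version of singularity-based constraint scoring: each constraint
# type gets a base importance, discounted when the primitives involved
# are far apart or obstructed by other primitives.

BASE_WEIGHT = {
    "vertical": 1.0,
    "horizontal": 1.0,
    "symmetric": 0.9,
    "parallel": 0.8,
    "straight": 0.7,
}

def constraint_score(kind, distance=0.0, obstructed=False):
    score = BASE_WEIGHT[kind]
    if distance > 1.0:   # primitives far apart: constraint matters less
        score *= 0.5
    if obstructed:       # other primitives lie between them
        score *= 0.5
    return score

def relevant_constraints(candidates, cutoff=0.5):
    """Keep only constraints whose adjusted score clears the cutoff."""
    return [c for c in candidates
            if constraint_score(c["kind"], c.get("distance", 0.0),
                                c.get("obstructed", False)) >= cutoff]
```

With this scheme, a parallelism constraint between two nearby lines survives the cutoff, while the same constraint between distant, obstructed lines is dropped - the behavior described above.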

To test their system, the authors created a study and measured how often their system agreed with people's perceptual judgments on near-perfect drawings.

Discussion

The paper reminds me of those standardized tests in which the student is given multiple shapes in a group and must recognize which shape does not belong. The work is interesting and has a direct connection to how we make our domain descriptions in LADDER. We should only focus on the constraints that the user believes to be relevant.

Citation

Veselova, O. and Davis, R. 2006. Perceptually based learning of shape descriptions for sketch recognition. In ACM SIGGRAPH 2006 Courses (Boston, Massachusetts, July 30 - August 03, 2006). SIGGRAPH '06. ACM Press, New York, NY, 28. DOI= http://doi.acm.org/10.1145/1185657.1185789

Saturday, October 6, 2007

Interactive Learning of Structural Shape Descriptions from Automatically Generated Near-miss Examples

by

Tracy Hammond and Randall Davis

Summary

The paper tackles the problem of creating shape descriptions for sketch recognition systems. It builds on LADDER, a language for describing how shapes are drawn, displayed, and edited. The authors point out that recognition is based on what the shapes look like, not how they were drawn. The problem is that descriptions can be under-constrained, recognizing shapes that the designer did not intend, or over-constrained, failing to recognize shapes that the designer did intend. The big problem is that it is much easier to draw a shape than to create a formal description of it.
The authors propose a visual debugger for shape descriptions. The system uses a hand-drawn positive example to build the initial structural description, then asks the user to label near-miss shapes as positive or negative. To check whether the description is over-constrained, a variant description is created with one constraint negated and a shape is generated from it; if the user judges this shape correct, the constraint is unnecessary. To test for under-constraint, a list of possible missing constraints is generated; each candidate's negation is added in turn and users are queried again, which determines whether the constraint should be included. The paper also describes the parameters the authors used when generating new shapes.
The end goal is to produce descriptions that contain no unnecessary constraints and are not missing needed ones.
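The over-constraint test can be sketched as a small loop. This is a toy illustration with invented names - not the paper's actual algorithm or LADDER's syntax: negate one constraint at a time, generate a shape from the modified description, and drop the constraint if the user still accepts the result.

```python
# Toy near-miss refinement loop: each constraint is negated in turn,
# a shape is generated from the modified description, and the user's
# judgment decides whether the constraint was necessary.

def refine_description(constraints, generate_shape, ask_user):
    """constraints:    list of constraint strings, e.g. "parallel l1 l2"
    generate_shape: builds an example shape satisfying a description
    ask_user:       returns True if the shown shape is a valid example"""
    kept = []
    for c in constraints:
        # Description with this one constraint negated.
        trial = [x for x in constraints if x != c] + ["not " + c]
        shape = generate_shape(trial)
        if ask_user(shape):
            # Still a positive example without the constraint, so it
            # was over-constraining: leave it out.
            continue
        kept.append(c)
    return kept
```

In a real system `generate_shape` and `ask_user` would be the shape generator and an interactive query; here they are stand-ins so the control flow is visible.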

Discussion

The section about users drawing similar examples was spot on. In my user study, most of the participants would draw the same line over and over again, so figure 1 gets a special place in my heart. I chose this paper for my user study report and also for my final project, in which I need users to draw shapes that will be game pieces. Looking back at my Plinko description, I softened constraints because the system seemed unable to recognize my drawing as what I intended. A system like this would be better: I could draw the shape I was interested in, and then perfect the shape description through an iterative process. I could see this being a way to get new users started in creating descriptions. Let them draw a shape, build a description, and then let them modify it. It gets around the blank page problem. When I am writing there is always a fear of the blank page; I usually write on top of an outline, and that gets me over the initial hump. I could see new users being intimidated by creating new shapes from scratch.

Citation

Hammond, T. and Davis, R. 2006. Interactive learning of structural shape descriptions from automatically generated near-miss examples. In Proceedings of the 11th International Conference on Intelligent User Interfaces (Sydney, Australia, January 29 - February 01, 2006). IUI '06. ACM Press, New York, NY, 210-217. DOI= http://doi.acm.org/10.1145/1111449.1111495

Wednesday, October 3, 2007

LADDER, a sketching language for user interface developers

by

Tracy Hammond and Randall Davis

Summary

There are people who are good at making sketch recognition systems; unfortunately, those people aren't usually domain experts. The opposite is also true: some people really know their domains, but it is difficult for them to transfer this knowledge into a sketch recognition system. LADDER allows a domain's shapes to be described at a simple level. The description consists of the primitive shapes and the constraints on those shapes, and should also include information about how each shape should be edited and how it should be displayed to the user.

In LADDER, shapes are built hierarchically, re-using low-level shapes. Both hard and soft constraints can be placed on shapes; thus the way a user draws a shape need not be identical to be recognized, but if a shape is often drawn in a common progression, that information can be used in recognition.

A user study was done with thirty users to gain an understanding of how people describe shapes and constraints in common language. These concepts were built into LADDER's description language.

LADDER is unable to describe abstract shapes; shapes must be composed of primitive shapes, and curves are inherently difficult for the system. The paper goes on to discuss shape definitions, how shapes can be grouped, and the predefined shapes, conditions, and display methods. Vectors are also discussed: they are used to recognize shapes with a variable number of sub-shapes, the example given being a dashed line. Shapes are recognized using a bottom-up approach.
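To make the hierarchical-description idea concrete, here is a rough Python-flavored mock-up of a dashed-arrow description with a vector component for the variable number of dashes. LADDER's real syntax is different; every field name here is illustrative only.

```python
# Mock-up of a hierarchical shape description: a dashed arrow re-uses
# a lower-level dashedLine shape, which in turn uses a vector component
# to hold an arbitrary number of dash sub-shapes.

DASHED_LINE = {
    "name": "dashedLine",
    "components": {"dashes": {"type": "line", "vector": True}},
    "constraints": [("collinear", "dashes")],
}

DASHED_ARROW = {
    "name": "dashedArrow",
    "components": {
        "shaft": {"type": "dashedLine"},  # re-used low-level shape
        "head1": {"type": "line"},
        "head2": {"type": "line"},
    },
    "constraints": [
        ("coincident", "shaft.p2", "head1.p1"),
        ("coincident", "shaft.p2", "head2.p1"),
        ("equalLength", "head1", "head2"),
    ],
}
```

The point is the structure: higher-level shapes name their components and the geometric constraints between them, and a `vector` flag marks a component that may repeat.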

Discussion

The geometric recognition is a move away from Rubine's feature-based approach (which is more gesture recognition) toward a system that doesn't constrain the user. The way domain descriptions are built is nice in that it is a very reusable system. I can see that trying to create complex shapes could get very frustrating.

Citation

Hammond, T. and Davis, R. 2007. LADDER, a sketching language for user interface developers. In ACM SIGGRAPH 2007 Courses (San Diego, California, August 05 - 09, 2007). SIGGRAPH '07. ACM Press, New York, NY, 35. DOI= http://doi.acm.org/10.1145/1281500.1281546