Tuesday, October 23, 2007

Naturally Conveyed Explanations of Device Behavior

by

Michael Oltmans and Randall Davis

Summary

The paper combines speech recognition and sketch recognition into a system called ASSISTANCE. The system recognizes 2-D kinematic diagrams using both the visual information in the sketch and the context the user supplies by speaking about what should happen in it. Sketch recognition continually strives to support the process designers actually use in the early stages of design, which means quick, low-fidelity sketches as well as spoken discussion. The designer should be able to explain the intended behavior by drawing additional context (e.g., arrows to indicate direction), pointing, and speaking. All of this information is used to build a model of the device, and the user can then ask the system questions about the model. The paper tackles how to handle overlapping descriptions and infer their meaning. ASSISTANCE takes three inputs: the structural model of the sketch generated by ASSIST, the phrases recognized by the speech recognizer (ViaVoice), and the sketched arrows. ASSISTANCE must be able to unify multiple descriptions of the same behavior and resolve deictic references (e.g., "this wheel" spoken while pointing at it). The system must also understand causal links between events.
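The paper does not spell out the exact mechanics of deictic resolution, but a minimal sketch makes the idea concrete: pair each spoken deictic phrase with the pointing gesture closest in time, then bind the reference to the sketched component nearest that gesture. Everything below (the data types, the timing window, the distance heuristic) is a hypothetical illustration, not ASSISTANCE's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Component:
    name: str   # e.g. "wheel-1"
    x: float    # center of the component in sketch coordinates
    y: float

@dataclass
class PointingEvent:
    t: float    # timestamp in seconds
    x: float    # where the stylus pointed
    y: float

@dataclass
class Phrase:
    t: float    # timestamp of the deictic word ("this", "that")
    text: str

def resolve_deictic(phrase, pointing_events, components, max_skew=1.0):
    """Resolve a deictic phrase to a component: first find the pointing
    gesture closest in time to the utterance, then pick the component
    spatially nearest to where that gesture pointed."""
    # Only consider gestures within max_skew seconds of the phrase.
    candidates = [p for p in pointing_events if abs(p.t - phrase.t) <= max_skew]
    if not candidates:
        return None  # no gesture close enough in time to anchor the reference
    gesture = min(candidates, key=lambda p: abs(p.t - phrase.t))
    # Bind the reference to the spatially nearest component.
    return min(components,
               key=lambda c: (c.x - gesture.x) ** 2 + (c.y - gesture.y) ** 2)

# Example: "this wheel spins clockwise" spoken at t=2.1 while pointing near wheel-1.
components = [Component("wheel-1", 10, 10), Component("lever-1", 50, 40)]
pointing = [PointingEvent(2.0, 11, 9)]
phrase = Phrase(2.1, "this wheel spins clockwise")
print(resolve_deictic(phrase, pointing, components).name)  # -> wheel-1
```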

Discussion

I like this paper because it combines multiple natural methods of interaction to accomplish something that would otherwise require a much more restrictive technique. It is a good starting point. I am unsure how the system distinguishes pointing from drawing. Also, does one simply draw, switch modes, and then begin to explain (using the pointer to make references)? Or can the user give verbal input throughout the whole experience? What happens if a mistake has been made? Can contradictory statements be made, with only the later statement being used?

Citation

Oltmans, M. and Davis, R. 2001. Naturally conveyed explanations of device behavior. In Proceedings of the 2001 Workshop on Perceptive User Interfaces (Orlando, Florida, November 15 - 16, 2001). PUI '01, vol. 15. ACM Press, New York, NY, 1-8. DOI= http://doi.acm.org/10.1145/971478.971498

1 comment:

Grandmaster Mash said...

I would have liked to see more results from this paper, but I like the concept.

I personally shy away from speech interfaces because I find voice recognition flaky. Plus, I get frustrated when a computer doesn't understand my voice. When I talk to another person I assume that they understand me, and if they don't I try to explain my meaning in another way. If you try speaking clearly to a computer and it still doesn't understand, there's nothing else you can do to convey meaning.