Nowadays, machines are already capable of communicating with human beings in a "natural" fashion: they can understand natural language, recognise handwriting, and interpret gestures. They are also capable of extending human perception by augmenting situations with additional knowledge ("augmented reality"), i.e. the presentation of information is contextualised according to the situation as perceived by the machine. Examples include smartphones and tablets (voice control, face recognition, image recognition such as "Google Goggles", music recognition), vehicles (driver assistance systems), and video game consoles (movement interpretation), but also work-related contexts such as surgery and human-robot cooperation.
The lecture covers the foundations of voice and gesture recognition, the sensing and recognition of objects in the environment, and information presentation. Sample applications (e.g. using the Kinect sensor) allow students to gain a deeper understanding of the covered material; a minimal illustrative sketch follows the topic list below.
Topics include:
- system performance of perception-based interaction
- sensor systems for sensing the environment (sound, video, 3D, touch, acceleration, and rotation)
- recognition (object recognition in video and 3D, speech and behaviour recognition)
- interaction models (augmented reality, situation graphs)
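
To give a flavour of such a sample application, the sketch below shows gesture recognition at its simplest: classifying a window of 3-axis accelerometer samples as a shake gesture. It is illustrative only and not taken from the course materials; the function name detect_shake, the thresholds, and the synthetic sample window are all invented for this example.

```python
import math

GRAVITY = 9.81    # approximate gravitational acceleration in m/s^2
THRESHOLD = 6.0   # deviation from gravity (m/s^2) that counts as a jolt
MIN_JOLTS = 3     # jolts within one window needed to call it a shake

def detect_shake(samples):
    """Return True if a window of (ax, ay, az) samples contains a shake.

    A jolt is a sample whose acceleration magnitude deviates from
    gravity by more than THRESHOLD; several jolts within one window
    are interpreted as a shake gesture.
    """
    jolts = 0
    for ax, ay, az in samples:
        magnitude = math.sqrt(ax * ax + ay * ay + az * az)
        if abs(magnitude - GRAVITY) > THRESHOLD:
            jolts += 1
    return jolts >= MIN_JOLTS

# Synthetic window: device at rest for five samples, then shaken.
window = [(0.1, 0.2, 9.8)] * 5 + [(12.0, -3.0, 18.0),
                                  (-11.0, 14.0, 1.0),
                                  (9.0, -15.0, 20.0)]
print(detect_shake(window))  # True
```

Real recognisers of the kind treated in the lecture (e.g. for Kinect skeleton data or speech) replace such hand-tuned thresholds with learned statistical models, but the pipeline (sensing, feature extraction, classification) is the same.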