A multimodal interface for virtual character animation based on live performance and natural language processing

Abstract:

Virtual character animation is receiving ever-growing attention from researchers, who have already proposed many tools aimed at improving the effectiveness of the production process. In particular, significant efforts are devoted to creating animation systems suited also to non-skilled users, letting them benefit from a powerful communication instrument that can improve information sharing in many contexts, such as product design, education, and marketing. Apart from methods based on the traditional Windows-Icons-Menus-Pointer (WIMP) paradigm, solutions devised so far leverage approaches based on motion capture/retargeting (the so-called performance-based approaches), on non-conventional interfaces (voice inputs, sketches, tangible props, etc.), or on natural language processing (NLP) over text descriptions (e.g., to automatically trigger actions from a library). Each approach has its drawbacks, though. Performance-based methods are difficult to use for creating non-ordinary movements (flips, handstands, etc.); natural interfaces are often used for rough posing, but results need to be refined later; automatic techniques still produce animations of limited realism. To deal with the above limitations, we propose a multimodal animation system that combines performance- and NLP-based methods. The system recognizes natural commands (gestures, voice inputs) issued by the performer, extracts scene data from a text description, and creates live animations in which pre-recorded character actions can be blended with the performer's motion to increase naturalness.
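
As a rough illustration of the blending idea summarized above (not the paper's actual implementation), the sketch below cross-fades a performer's live pose into a pre-recorded clip once a natural command has been recognized. The pose representation, `CLIP_FPS`, `action_library`, and `blend_in` parameter are hypothetical simplifications introduced here for clarity.

```python
import numpy as np

CLIP_FPS = 30  # assumed playback rate of pre-recorded action clips


def blend_poses(live_pose, clip_pose, weight):
    """Linear blend of per-joint values; 0 = pure live capture, 1 = pure clip."""
    return (1.0 - weight) * live_pose + weight * clip_pose


def animate_frame(live_pose, command, action_library, t_since_command, blend_in=0.5):
    """Return the pose to render for the current frame.

    If the recognized command maps to a pre-recorded action, cross-fade from
    the performer's live pose into the clip over `blend_in` seconds; otherwise
    the performer's motion is passed through unchanged.
    """
    if command is None or command not in action_library:
        return live_pose

    clip = action_library[command]                    # sequence of clip poses
    frame_idx = min(int(t_since_command * CLIP_FPS), len(clip) - 1)
    weight = min(t_since_command / blend_in, 1.0)     # ramp from 0 to 1
    return blend_poses(live_pose, clip[frame_idx], weight)


# Usage: a recognized "backflip" voice command triggers the library clip while
# the performer keeps driving the character until the cross-fade completes.
if __name__ == "__main__":
    n_joints = 20
    library = {"backflip": [np.random.rand(n_joints) for _ in range(60)]}
    live = np.zeros(n_joints)                         # stand-in for mocap data
    pose = animate_frame(live, "backflip", library, t_since_command=0.25)
    print(pose.shape)
```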