Kare - A multilingual conversational agent

Kare is an animated conversational agent that integrates a talking head interface with a linguistically motivated human-machine dialogue system. The agent has a range of non-verbal behaviors, which involve a mixture of machine vision, computer animation and natural language processing techniques. The system's architecture couples the agent's non-verbal communicative processes very tightly to its model of verbal interaction. As a result, the system can use different non-verbal dialogue management signals when speaking different languages.

The system is bidirectional: sentence interpretation and generation are both controlled by a single declarative grammar, and dialogue act generation and interpretation are both controlled by a declarative specification of dialogue structure. The system is also bilingual: it can converse with the user in either English or Mäori.

Kare is pronounced as in French carré. Te Karetao is Mäori for puppet. The shortened Kare is also a term of endearment.

Culture-Specific Dialogue Conventions

The differences between Mäori and English are not just verbal: the two languages also have very different conversational conventions. When Kare converses, it uses the type of the dialogue act (yes/no answer, wh answer, acknowledgement, etc.) along with the language to determine the appropriate non-verbal action or actions.


Comparison between an English (left) and a Mäori (right) negative response. Polynesians tend to look down, possibly with a vocalization, for a negative response, while English speakers tend to shake their heads from side to side while also saying no.


Comparison between speaking English (left) and Mäori (right). English speakers use eye contact to control the flow of conversation. Mäori speakers avoid eye contact, as eye contact can put the participants in a state of conflict.


Comparison between an English (left) and a Mäori (right) positive response. English speakers nod their heads one or more times while saying yes. Mäori speakers tend to stare ahead to signal agreement, sometimes also vocalizing äna.
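The selection of non-verbal behavior from the dialogue act type and the language can be pictured as a simple table lookup. The following Python fragment is only a minimal sketch of that idea, not Kare's actual implementation: the names Language, DialogueAct, BEHAVIORS and select_actions, and the action labels such as "look_down", are hypothetical and chosen for illustration. The table entries restate the three comparisons described above.

```python
from enum import Enum


class Language(Enum):
    ENGLISH = "English"
    MAORI = "Mäori"


class DialogueAct(Enum):
    # Hypothetical act types, limited to the cases compared above.
    POSITIVE_RESPONSE = "positive response"
    NEGATIVE_RESPONSE = "negative response"
    SPEAKING = "speaking"


# Hypothetical action labels; each entry mirrors one of the comparisons above.
# Labels starting with "optional_" mark behaviors described as occasional.
BEHAVIORS = {
    (Language.ENGLISH, DialogueAct.NEGATIVE_RESPONSE): ["shake_head", "say_no"],
    (Language.MAORI, DialogueAct.NEGATIVE_RESPONSE): ["look_down", "optional_vocalization"],
    (Language.ENGLISH, DialogueAct.POSITIVE_RESPONSE): ["nod_head", "say_yes"],
    (Language.MAORI, DialogueAct.POSITIVE_RESPONSE): ["stare_ahead", "optional_say_äna"],
    (Language.ENGLISH, DialogueAct.SPEAKING): ["make_eye_contact"],
    (Language.MAORI, DialogueAct.SPEAKING): ["avoid_eye_contact"],
}


def select_actions(language, act):
    """Return the list of non-verbal actions for a language and dialogue act.

    Unknown combinations yield an empty list, meaning no special behavior.
    """
    return list(BEHAVIORS.get((language, act), []))
```

For example, select_actions(Language.MAORI, DialogueAct.SPEAKING) yields the single action "avoid_eye_contact", whereas the same dialogue act in English yields "make_eye_contact". Keeping the conventions in a declarative table rather than in branching code would make it straightforward to add a further language without touching the selection logic.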

Examples

A sample conversation in QuickTime (Sorenson compression, 6.4M) or AVI (12.3M).

Publications