Group 2

From CoIN

Revision as of 22:03, 21 January 2009 by Mehrbod

Maddie

Output

Project Report and Presentation

All Project Files

FaceDetection API and User Guide

PitchDetection API and User Guide - Note: some incompatibility with Windows XP was found, due to the SAPI version

Members

  • Anca Dragan
  • Geo Leontiev

Project Details

MADDIE is a virtual pet designed to interact directly with humans through its visual and audio capabilities. It is aware of human presence and can carry out conversations with its users. More than that, MADDIE appears to have a personality and, unlike other virtual (or physical) robots, will act against the user’s expectations. For example, it can insult the user, act differently according to the user’s gender, or react differently to the same situation, and this unpredictability contributes to the character’s appeal. More precisely, MADDIE will unexpectedly say things like “You’re fat”, or ignore a question even though it answered the very same question on a different occasion. All this makes MADDIE an interesting and unusual companion, aimed mainly at entertainment. However, more pragmatic applications do exist, as discussed below.

Features

  • Face Recognition: MADDIE recognizes the presence of a human by detecting their face. It also tracks the user with its eyes, so that when the user moves, MADDIE shows that it is aware of them. This makes the human-robot interaction more engaging and encourages the user to trust the robot’s capabilities and take it more seriously.
  • Speech Recognition: The robot is able to recognize what the user is saying. It makes use of a grammar to connect the audio input with its semantics, so that an appropriate response can be planned.
  • Speech Synthesis: MADDIE answers the audio input using a female voice. The resulting exchange of replies gives the user the impression that MADDIE can reason.
  • Pitch Detection: The robot uses this feature to distinguish male from female users, so that it can act differently and even change its appearance.
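
The pitch-detection feature can be illustrated with a minimal sketch. It is not the YAALP library's API and not necessarily the method the project used: it assumes a simple autocorrelation estimate of the fundamental frequency, and the 165 Hz male/female threshold, the function names and the synthetic test tone are all assumptions made for illustration.

```python
import math


def synth_tone(freq_hz, sample_rate=8000, n_samples=2048):
    """Generate a pure sine tone (a stand-in for recorded speech in this sketch)."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n_samples)]


def estimate_pitch_hz(samples, sample_rate, min_hz=60.0, max_hz=400.0):
    """Estimate the fundamental frequency by picking the autocorrelation peak.

    The autocorrelation is left unnormalized, which slightly favors shorter
    lags and so helps avoid reporting a multiple of the true period.
    Returns None when the signal carries no usable energy (e.g. silence).
    """
    min_lag = int(sample_rate / max_hz)
    max_lag = min(int(sample_rate / min_hz), len(samples) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(samples[i] * samples[i + lag] for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


def guess_speaker_gender(pitch_hz, threshold_hz=165.0):
    """Classify by pitch alone; the 165 Hz threshold is an assumed value."""
    if pitch_hz is None:
        return "unknown"
    return "female" if pitch_hz >= threshold_hz else "male"


if __name__ == "__main__":
    for freq in (125.0, 250.0):
        pitch = estimate_pitch_hz(synth_tone(freq), 8000)
        print(freq, pitch, guess_speaker_gender(pitch))
```

A single threshold is a crude classifier: real voices overlap in pitch, so a practical version would probably average the estimate over several voiced frames before deciding.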

Technologies used

  • FaceOnIt – a software library that finds faces in pictures and videos
  • Microsoft Speech API 5.3 – providing speech recognition and synthesis abilities
  • Windows Presentation Foundation (WPF, formerly known as Avalon) – a graphical subsystem in .NET Framework 3.0
  • YAALP library – pitch detection
  • VideoLAN VLC Media Player – converting audio format

Future Directions

A first approach, which was part of the original design for the project, would be to install MADDIE on a small touch-screen and place it on top of a robot (such as the iRobot Create, see http://store.irobot.com/shop/index.jsp?categoryId=3311368 for more details). This would give MADDIE mobility, so that it could follow the user with more than just its eyes, run away from the user when “upset”, or even move to the entrance to greet the user when he/she arrives home. Another improvement would be to make the conversations spontaneous rather than based on predefined grammars. This could be achieved through Eliza (refer to http://bar.speech.cs.cmu.edu/11754S08/Alice/). Even though this technology is not yet advanced enough to produce conversations that emulate human reasoning, a hybrid of our current grammar-based approach and an Eliza-style one would likely be a considerable improvement.
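
One way such a hybrid might look is sketched below: the predefined grammar is tried first, and an Eliza-style pattern rule is used only when no grammar phrase matches. The phrases, rules, replies and function names are all hypothetical; none of them come from the project's actual grammar or from the Speech API.

```python
import re

# Hypothetical predefined grammar: exact phrases mapped to canned replies.
GRAMMAR = {
    "hello": "Hi there!",
    "what is your name": "I'm MADDIE.",
}

# Hypothetical Eliza-style rules: a pattern and a reply template.
ELIZA_RULES = [
    (re.compile(r"\bi feel (.+)"), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)"), "How long have you been {0}?"),
]

# Pronoun swaps so the reply addresses the user.
REFLECTIONS = {"my": "your", "me": "you", "i": "you", "your": "my", "you": "me"}


def normalize(utterance):
    """Lower-case the utterance and drop punctuation."""
    return re.sub(r"[^a-z' ]", "", utterance.lower()).strip()


def reflect(fragment):
    """Swap first- and second-person words in the captured fragment."""
    return " ".join(REFLECTIONS.get(word, word) for word in fragment.split())


def respond(utterance):
    """Try the grammar first, then the Eliza-style rules, then a generic prompt."""
    text = normalize(utterance)
    if text in GRAMMAR:
        return GRAMMAR[text]
    for pattern, template in ELIZA_RULES:
        match = pattern.search(text)
        if match:
            return template.format(reflect(match.group(1)))
    return "Tell me more."


if __name__ == "__main__":
    for line in ("Hello!", "I feel ignored by my cat", "Nice weather"):
        print(line, "->", respond(line))
```

Keeping the grammar as the first stage preserves the planned, in-character replies, while the fallback keeps the conversation going when the user says something unexpected.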

On the more practical side, MADDIE could serve multiple purposes (hence the name “Multi-purpose auxiliary display”, again part of the original design): a personal home assistant for parents that reads emails and notes tasks in the user’s calendar, and a companion and teacher for children that lets them practice a foreign language or play educational games with their favorite pet. Integrated with the iRobot Create, MADDIE could also act as a mobile surveillance camera to improve home security.

On the personality side, MADDIE does not yet distinguish between specific people; adding this would make it more adaptable to its users. Together with recording data and reusing it in a learning process, this could make the virtual pet quite attractive in the real world. It could have a like/dislike system based almost entirely on the amount of interaction between the user and the unit (if you interact with it a lot, it likes you; if you ignore it, it dislikes you). Based on this preference, it would either compliment or make fun of a user. Also, a combination of face recognition and object tracking could be used to recognize when a user has their back turned to the unit; imagine, for instance, that the unit sticks its tongue out at a person it dislikes. MADDIE could also be louder when a person it likes is around.
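
The like/dislike idea can be sketched as a per-user score that rises with each interaction and falls each time the unit is ignored. This is only an illustration under stated assumptions: it presumes users can already be told apart (which MADDIE does not yet do), and the class name, gain, decay, thresholds and remarks are all hypothetical values.

```python
from dataclasses import dataclass, field


@dataclass
class Affinity:
    """Tracks how much the pet likes each user (all constants are assumed)."""
    gain: float = 1.0                # added for every interaction
    decay: float = 0.5               # subtracted every time the user ignores the unit
    like_threshold: float = 3.0
    dislike_threshold: float = -3.0
    scores: dict = field(default_factory=dict)

    def interacted(self, user):
        self.scores[user] = self.scores.get(user, 0.0) + self.gain

    def ignored(self, user):
        self.scores[user] = self.scores.get(user, 0.0) - self.decay

    def attitude(self, user):
        score = self.scores.get(user, 0.0)
        if score >= self.like_threshold:
            return "likes"
        if score <= self.dislike_threshold:
            return "dislikes"
        return "neutral"

    def remark(self, user):
        """Compliment liked users, tease disliked ones, stay neutral otherwise."""
        return {
            "likes": "You look great today!",
            "dislikes": "Oh, it's you again.",
            "neutral": "Hello.",
        }[self.attitude(user)]


if __name__ == "__main__":
    pet = Affinity()
    for _ in range(3):
        pet.interacted("alice")
    for _ in range(6):
        pet.ignored("bob")
    print(pet.attitude("alice"), pet.attitude("bob"))
```

The same `attitude` value could drive the other behaviors mentioned above, such as speaking louder for liked users.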

Overall, there are many opportunities for this project to develop. While what we have achieved only lays the groundwork, we think it demonstrates the utility (and entertainment) such a tool could bring.

Project Log

10/12 (Geo) Robin's system seems to be the perfect candidate for the type of facial recognition we are trying to do. It can tell different faces apart and works in WinXP. The only problem is that faces need to be manually registered with the system in order to be recognized, but this can be dealt with.

10/21 (Geo) The system we were trying out previously was having some serious issues, so I switched back to FaceOnIt. With Mehrbod's help I was able to compile the source in VS2005 on both XP and Vista machines.