Title: "Casual Communication between Robots and Humans based on Robot Technology Middleware and Multimedia Recognition"

Abstract: The Mascot Robot System has been developed by the author’s group as part of the “Development Project for a Common Basis of Next-Generation Robots” sponsored by NEDO (New Energy and Industrial Technology Development Organization). The main purpose of the Mascot Robot System is to perform casual communication between robots and humans, based mainly on speech recognition modules mounted on household robots. The system is implemented as a network of multiple robots freely connected by RT middleware (RTM). It consists of 5 robots: 4 fixed robots (placed on a TV, a darts game machine, an information terminal, and a mini-bar) and 1 mobile robot. Each robot includes an eye robot, a speech recognition module, and a notebook PC that controls the eye robot and the speech recognition module. The robots are connected to one another and to a server over the Internet by RTM, thus constituting the Mascot Robot System. The system is demonstrated in an ordinary living room, where casual communication between 5 robots and 4 human beings (1 host, 2 guests, and 1 walk-in) is conducted based on speech recognition and the mentality expression of the eye robots. The experimental results are shown as DVD files.


The NEDO project has been extended into an ongoing JSPS (Japan Society for the Promotion of Science) project, in which a gesture recognition method based on computational intelligence is proposed and embedded in the Mascot Robot System to realize casual communication between robots and humans. The method uses video image data from a web camera, voice information from a microphone, and motion data from wearable 3D acceleration sensors on human wrists to identify intentions and emotions, i.e., multimedia recognition technology. To demonstrate its validity, the proposed method is now being applied to a part of the Mascot Robot System, where a home party scenario is performed by five eye robots and four human participants. Ongoing results suggest that the proposed method may improve the interaction between humans and robotic systems.

Kaoru HIROTA received the Dr. E. degree from Tokyo Institute of Technology in 1979. After working at Sagami Institute of Technology and Hosei University, he joined Tokyo Institute of Technology. His research interests include fuzzy systems, intelligent robots, and image understanding. He has served as president-elect of IFSA (International Fuzzy Systems Association), of which he is a fellow, and as president of SOFT (Japan Society for Fuzzy Theory and Systems). He is a chief editor of J. of Advanced Computational Intelligence and Intelligent Informatics. He has been awarded the Banki Donat Medal, the Henri Coanda Medal, the Grigore MOISIL Award, the SOFT best paper award, and the Acoustical Society of Japan best paper award; honorary professorships from De La Salle University and Changchun Univ. of Science & Technology; and honoris causa degrees from Bulacan State University, Budapest Technical University, and Szechenyi Istvan University. He has organized more than 10 international conferences/symposiums as a founding/general/program chair. He has published about 250 journal papers, 50 books, and 450 conference papers.