What is Aria Valuspa?
The ARIA-VALUSPA (Artificial Retrieval of Information Assistants – Virtual Agents with Linguistic Understanding, Social skills, and Personalised Aspects) project will create a ground-breaking new framework that will allow easy creation of Artificial Retrieval of Information Assistants (ARIAs) that are capable of holding multi-modal social interactions in challenging and unexpected situations. The system can generate search queries and return the information requested by interacting with humans through virtual characters. These virtual humans will be able to sustain an interaction with a user for some time, and react appropriately to the user’s verbal and non-verbal behavior when presenting the requested information and refining search results. Using audio and video signals as input, both verbal and non-verbal components of human communication are captured. Together with a rich and realistic emotive personality model, a sophisticated dialogue management system decides how to respond to a user’s input, be it a spoken sentence, a head nod, or a smile. The ARIA uses special speech synthesizers to create emotionally colored speech and a fully expressive 3D face to create the chosen response. Backchannelling, indicating that the ARIA understood what the user meant, or returning a smile are but a few of the many ways in which it can employ emotionally colored social signals to improve communication.
Project abstract
Aria Alice Preview
To give more insight into the ARIA-VALUSPA system, we have uploaded Deliverables 2.1 (Implementation of cross-domain, context-sensitive speech analysis), 3.1 (Implementation of multi-lingual task-based dialogue system for the chosen scenario) and 4.1 (Implementation of overall dynamic audio-visual communicative behaviour generation).
On January 22-23 the project’s kickoff meeting took place at Nottingham University. A successful start for this new project!
People of the Project
Michel Valstar, University of Nottingham
Michel Valstar is an Assistant Professor at the University of Nottingham, School of Computer Science, and a researcher in Automatic Visual Understanding of Human Behaviour, a field that encompasses Machine Learning, Computer Vision, and a good idea of how people behave in this world. He is a member of both the Computer Vision Lab and the Mixed Reality Lab. He was recently a Visiting Researcher at the Affective Computing group at the Media Lab, MIT, and a research associate with the iBUG group, which is part of the Department of Computing at Imperial College London. Michel’s expertise is facial expression recognition, in particular the analysis of FACS Action Units. He recently proposed a new field of research called ‘Behaviomedics’, which applies affective computing and Social Signal Processing to the field of medicine to help diagnose, monitor, and treat medical conditions that alter expressive behaviour, such as depression.
Elisabeth André, Augsburg University
Prof. Dr. Elisabeth André is a full professor of Computer Science at Augsburg University, Germany, and Chair of the Laboratory for Multimedia Concepts and their Applications. Prior to that, she worked as a principal researcher at DFKI GmbH, where she led various academic and industrial projects in the area of intelligent user interfaces, one of which was honored with the European IT Prize. Her current research interests include affective computing, multimodal user interfaces and synthetic agents. Elisabeth André is on the editorial board of Artificial Intelligence Communications (AICOM), Cognitive Processing (International Quarterly of Cognitive Science), Universal Access to the Information Society (UAIS), Autonomous Agents and Multi-Agent Systems (JAAMAS) and the International Journal of Human Computer Studies (IJHCS).
Björn Schuller, Imperial College London
Björn Schuller received his diploma in 1999, his doctoral degree for his study on Automatic Speech and Emotion Recognition in 2006, and his habilitation and Adjunct Teaching Professorship in the subject area of Signal Processing and Machine Intelligence in 2012, all in electrical engineering and information technology from TUM in Munich, Germany. He is a Senior Lecturer in Machine Learning in the Department of Computing at Imperial College London (UK) and a tenured Full Professor heading the Chair of Complex Systems Engineering at the University of Passau, Germany. Dr. Schuller is president of the Association for the Advancement of Affective Computing (AAAC), an elected member of the IEEE Speech and Language Processing Technical Committee, Editor in Chief of the IEEE Transactions on Affective Computing, and a member of the ACM, IEEE and ISCA. He has (co-)authored 5 books and more than 400 publications in peer-reviewed books, journals, and conference proceedings, leading to more than 7000 citations.
Dirk Heylen, University of Twente
Dirk Heylen is Professor of Socially Intelligent Computing at the University of Twente. His research interests cover both the machine analysis of human (conversational) behaviour and the generation of human-like (conversational) behaviour by virtual agents and robots. He is especially interested in the nonverbal and paraverbal aspects of dialogue and what these signals reveal about the mental state (cognitive, affective, social). These topics are explored both from a computational perspective and as basic research in the humanities, reflecting his training as a computational linguist.
Catherine Pelachaud, Paris Telecom
Catherine Pelachaud is a Director of Research at CNRS in the laboratory LTCI, TELECOM ParisTech. Her research interests include embodied conversational agents, nonverbal communication (face, gaze, and gesture), expressive behaviors and socio-emotional agents. She has been, and still is, involved in several European projects related to believable embodied conversational agents (IST-MagiCster, ILHAIRE, VERVE, REVERIE), emotion (Humaine, CALLAS, SEMAINE, TARDIS) and social behaviors (SSPNet). She is an associate editor of several journals, including IEEE Transactions on Affective Computing, ACM Transactions on Interactive Intelligent Systems and Journal on Multimodal User Interfaces. She has co-edited several books on virtual agents and emotion-oriented systems.
Chloé Clavel, CNRS, Paris
Chloé Clavel is an associate professor in Affective Computing in the GRETA Team, which belongs to the MM (multimedia) group of the Signal and Image Processing Department of Telecom-ParisTech. Her research focuses on two issues: acoustic analysis of emotional speech and opinion mining through natural language processing. After her PhD, she worked in the laboratories of two large French companies, Thales Research and Technology and EDF R&D, where she developed her research around audio and text mining applications. At Telecom-ParisTech, she is currently working on interactions between humans and virtual agents, from the analysis of users’ socio-emotional behavior to socio-affective interaction strategies.
Eduardo Coutinho, Imperial College London
Eduardo Coutinho received his diploma in Computer Science and Electrical Engineering from the University of Porto (Portugal, 2003), where he specialised in multi-agent systems, and his doctoral degree in Affective and Computer Sciences from the University of Plymouth (UK, 2009). Since then, Coutinho has been researching in the interdisciplinary fields of Music Psychology and Affective Computing. Currently, Coutinho is a Research Associate in the Department of Computing at Imperial College London, a part-time Lecturer in Music Psychology at the University of Liverpool, and an Affiliate Researcher at the Swiss Center for Affective Sciences. He is a member of the INNS, ISRE and SMPC.
Johannes Wagner, Augsburg University
Johannes Wagner graduated as a Master of Science in Informatics and Multimedia from the University of Augsburg, Germany, in 2007. He is currently employed as a research assistant at the lab of Human Centered Multimedia (HCM) and has been working in several European projects including Humaine, Callas, Ilhaire and CEEDs. His main research focus is the integration of Social Signal Processing (SSP) in real-life applications. He is the founder of the Social Signal Interpretation (SSI) framework, a general framework for the integration of multiple sensors into multimedia applications.
Tobias Baur, Augsburg University
Tobias Baur received his M.Sc. in Computer Science and Multimedia in 2012 from Augsburg University, Germany. He then joined the Human Centered Multimedia Lab in Augsburg as a PhD candidate, where he contributed to the EU projects TARDIS and Ilhaire. His research focuses on Social Signal Processing, Automated Behavior Analysis, Affective Computing, Social Robotics and Human-Agent Interaction. He is involved in developing the SSI Framework and the NonVerbal Behavior Analyzer (NovA) tool, which aims to analyze social signals in an automated manner.
Angelo Cafaro, CNRS, Paris
Angelo Cafaro is a postdoctoral researcher at CNRS-LTCI, Telecom ParisTech in France. He does research in the area of embodied conversational agents and serious game environments, with emphasis on social interaction, group behavior and expression of social attitudes. He obtained his Ph.D. from Reykjavik University in 2014. His dissertation dealt with analyzing and modeling human nonverbal communicative behavior exhibited by a virtual agent in a first greeting encounter with the user. In his dissertation he also proposed a SAIBA-compliant computational model featuring a unified specification for the Function Markup Language (FML). More information is available on his personal webpage: www.angelocafaro.info.
Brais Martinez, University of Nottingham
Brais Martinez is a Research Fellow at the University of Nottingham. He has previously been a Research Associate in the intelligent Behaviour Understanding Group (iBUG) at Imperial College London. He received his PhD in computer science at the Autonomous University of Barcelona in 2010.
Currently he is working in the fields of computer vision and pattern recognition, where his main interest is automatic face analysis. He has predominantly worked on problems such as face detection, facial landmarking and facial expression analysis based on Facial Action Units, publishing his research on these topics in authoritative journals and conferences including TPAMI, TSMC-B, PR and CVPR. He is an IEEE member.
Dr. Blaise Potard is a Post-doctoral Researcher at CereProc Ltd in Edinburgh. In the past, his research has focused on various fields related to speech technology (Acoustic-to-Articulatory Inversion, Text-to-Speech Synthesis, Automatic Speech Recognition…). In the context of the ARIA-VALUSPA project, he will be mostly working on improving the expressiveness of Text-to-Speech voices, in particular their emotional aspects.
Mariët Theune, University of Twente
Mariët Theune is an assistant professor at the University of Twente. She has a background in computational linguistics. Her main research interests include interactive storytelling with intelligent agents, natural language generation, embodied conversational agents and dialogue systems. She has worked on these topics in the context of various national projects and has supervised several PhD, Master and Bachelor students in these areas.
Emeline Bantegnie received her engineering diploma in Computer Science from the Ensimag school (France, 2011). She then joined Cantoche as a 3D software developer. She specializes in the integration of animated and 3D avatars on devices. She also has expertise in 3D animation programming.
She has recently been involved in the European project ILHAIRE and the French project Avatar 1:1. In the context of the ARIA-VALUSPA project, she will provide solutions to embed an animated agent on targeted devices.
Amr Mousa, University of Passau
Amr Mousa received his M.Sc. degree in Computer Science from Ain Shams University (Cairo, Egypt) in 2006, and his doctoral degree from RWTH Aachen University (Aachen, Germany) in 2014, where his research focused on Automatic Speech Recognition (ASR). Currently, Dr. Mousa is a Research Associate at the Chair of Complex and Intelligent Systems, University of Passau (Passau, Germany). His main research interests are Large Vocabulary Continuous Speech Recognition, Language Modeling, Acoustic Modeling, Natural Language Processing and Deep Learning. In the context of the ARIA-VALUSPA project, he is mostly working on building online ASR systems for multiple languages in order to improve the understanding of human verbal behaviour during social interactions.