Multimodal & Embodied Research
Multimodal semantic integration
How does one integrate what one perceives with what others say and do (and vice versa)? Semantic integration of language, perception (vision, audition, olfaction, etc.) and motion is a fundamental cognitive mechanism that enables, among other things, multimodal interaction and learning. Understanding and mastering this mechanism makes possible, among other things, the computational modeling of multimodal human-machine/robot interaction. The latter is one of the ultimate objectives for Artificial General Intelligence agents, which are expected to perform multimodal semantic integration when interacting with humans.
Our ultimate objective in this research line is the development of a computational mechanism for the automatic analysis and synthesis of multimodal messages in Artificial General Intelligence agents…
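To make the idea concrete, here is a minimal sketch of one common way to integrate modalities computationally: project each modality into a shared semantic space and combine the projections there. The module names, dimensions and averaging scheme are illustrative assumptions, not the architecture developed in this research line.

```python
# Minimal late-fusion sketch: each modality is projected into a shared
# semantic space and the projections are combined there.
# All names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class LateFusionIntegrator(nn.Module):
    def __init__(self, dims: dict[str, int], shared_dim: int = 256):
        super().__init__()
        # One learned projection per modality (e.g. language, vision, motion).
        self.projections = nn.ModuleDict(
            {name: nn.Linear(d, shared_dim) for name, d in dims.items()}
        )

    def forward(self, inputs: dict[str, torch.Tensor]) -> torch.Tensor:
        # Project each available modality and average in the shared space;
        # averaging tolerates missing modalities at inference time.
        projected = [torch.tanh(self.projections[m](x)) for m, x in inputs.items()]
        return torch.stack(projected).mean(dim=0)

# Usage: fuse a language embedding with a visual feature vector.
model = LateFusionIntegrator({"language": 768, "vision": 512})
fused = model({"language": torch.randn(1, 768), "vision": torch.randn(1, 512)})
```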
Multimodal knowledge acquisition
What are the common uses of objects? Which perceptual attributes are characteristic of them? Which object(s) can one use to achieve certain goals? This kind of knowledge constitutes a vital part of our prior knowledge of the world, which we acquire developmentally, through multimodal exploration of the world and interaction with others. However, such knowledge is rarely expressed explicitly, which prevents machines from understanding human intentions, instructions and goals. The lack of explicit, verbal expression of this prior knowledge of the world is one of the biggest challenges for modern Generative AI systems.
Our ultimate objective in this research line is the development of large-scale multimodal knowledge acquisition algorithms for human-computer/robot interaction…
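The sketch below illustrates the kind of structured object knowledge such acquisition algorithms target: affordances (common uses) and perceptual attributes, queryable by goal. The schema and the example entry are assumptions for exposition, not an actual resource produced by this line.

```python
# Illustrative schema for acquired multimodal object knowledge:
# affordances answer "what is it used for?", attributes answer
# "what does it look/feel like?". Entries here are hypothetical.
from dataclasses import dataclass, field

@dataclass
class ObjectKnowledge:
    name: str
    affordances: set[str] = field(default_factory=set)   # common uses
    attributes: set[str] = field(default_factory=set)    # perceptual attributes

    def supports_goal(self, goal: str) -> bool:
        # Which object(s) can one use to achieve a certain goal?
        return goal in self.affordances

knife = ObjectKnowledge(
    "knife",
    affordances={"cut", "spread", "slice"},
    attributes={"sharp", "metallic", "elongated"},
)
assert knife.supports_goal("cut")
```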
Embodied knowledge representation & reasoning
The need for structured knowledge in intelligent systems has led to the development of knowledge bases, ontologies and semantic graphs. While neuroscience findings point to a multisensory, multimodal and distributed semantic memory in the human brain, these computational resources remain static storage spaces of mostly verbal concepts with ad hoc associations/relations. Generalisation and common-sense reasoning in intelligent systems still stumble upon the gap between high-level symbolic representations and low-level sensorimotor experiences.
Our ultimate objective in this research line is the development of a computational semantic memory structured in a way that enables generalisation and serves multimodality.
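A minimal sketch of what "grounded" rather than purely symbolic memory could mean: each concept node carries per-modality embeddings alongside its symbolic relations, so reasoning can fall back on sensorimotor similarity where no explicit link exists. Class names, relations and dimensions are illustrative assumptions.

```python
# Sketch of a grounded semantic memory node: symbolic relations plus
# per-modality embeddings. All names/dimensions are hypothetical.
import numpy as np

class GroundedConcept:
    def __init__(self, name: str):
        self.name = name
        self.relations: dict[str, set[str]] = {}      # e.g. "is-a" -> {"container"}
        self.groundings: dict[str, np.ndarray] = {}   # modality -> embedding

    def add_relation(self, rel: str, target: str) -> None:
        self.relations.setdefault(rel, set()).add(target)

    def similarity(self, other: "GroundedConcept", modality: str) -> float:
        # Sensorimotor (cosine) similarity in one modality, enabling
        # generalisation beyond explicitly stored symbolic links.
        a, b = self.groundings[modality], other.groundings[modality]
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

cup = GroundedConcept("cup")
cup.add_relation("is-a", "container")
cup.groundings["vision"] = np.random.rand(128)
```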
Embodied language processing
Natural language processing does not take place in a cognitive vacuum, isolated from perception and action. In contrast to traditional approaches to computational language analysis and generation, which operate in a ‘language-only’ space, we advocate a new theoretical and computational view of language as an active system in multimodal cognition applications.
Our ultimate objective in this research line is the development of such a new description of language and of a corresponding suite of embodied language processing tools, so as to bridge the gap between natural language and the sensorimotor space; this will allow intelligent systems to go beyond using language as an interface medium and take full advantage of its potential for behavior generalization, creativity and intention attribution…
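A toy sketch of what bridging that gap can look like in code: action verbs map to parameterised motor programs instead of remaining opaque symbols. The lexicon entries and primitive names are hypothetical.

```python
# Toy grounded lexicon: verbs resolve to motor programs rather than
# remaining unanalysed strings. Entries and primitives are hypothetical.
from typing import Callable

MotorPrimitive = Callable[[str], list[str]]

def grasp(obj: str) -> list[str]:
    return [f"locate({obj})", f"reach({obj})", f"close_gripper({obj})"]

def push(obj: str) -> list[str]:
    return [f"locate({obj})", f"contact({obj})", f"apply_force({obj})"]

GROUNDED_LEXICON: dict[str, MotorPrimitive] = {"grasp": grasp, "take": grasp, "push": push}

def ground_instruction(verb: str, obj: str) -> list[str]:
    # "Take the cup" -> a concrete motor program, not just a parsed string.
    return GROUNDED_LEXICON[verb](obj)

print(ground_instruction("take", "cup"))
```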
Goal-directed actions
Language and action have been found to share a common neural basis and, in particular, a common ‘syntax’: an analogous hierarchical and compositional organization. While grammatical formalisms and associated discriminative or generative computational models exist for language structure, such formalisms and models for the structure of action are still elusive. However, structuring action has important implications for action learning and generalization, leading to a clear definition of what constitutes a single action vs. an action sequence or a whole event, and dictating the principles by which single actions are combined into sequences and events.
Our ultimate objective in this research line is the development of action parsing and action-planning algorithms…
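In the spirit of the language analogy above, a toy ‘action grammar’ might look like the following: an event rewrites to a sequence of single actions, each of which rewrites to motor primitives, making the single-action vs. sequence vs. event distinction explicit. The rules are invented for illustration and are not a proposed formalism.

```python
# Toy action grammar: event -> action sequence -> motor primitives.
# The rules below are illustrative, not a proposed formalism.
EVENT_GRAMMAR = {
    "PrepareDrink": ["Fetch", "Pour", "Stir"],    # event -> action sequence
    "Fetch": ["reach", "grasp", "transport"],      # single action -> primitives
    "Pour": ["tilt", "hold", "untilt"],
    "Stir": ["insert", "rotate", "withdraw"],
}

def expand(symbol: str) -> list[str]:
    # Recursively expand an event or action into its primitive sequence.
    if symbol not in EVENT_GRAMMAR:
        return [symbol]  # terminal motor primitive
    return [p for child in EVENT_GRAMMAR[symbol] for p in expand(child)]

print(expand("PrepareDrink"))
# ['reach', 'grasp', 'transport', 'tilt', 'hold', 'untilt', 'insert', 'rotate', 'withdraw']
```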
Indicative Topics/Applications
Human-Robot Interaction:
- Robots@Home (e.g., Robotic Culinary Assistants)
- Robots@School (e.g., Robots for Cognitive Offloading in Education/Vocational Training)
- Robots@Health (e.g., Robotic Health Staff for assisting patients with neurodegenerative diseases)
- Frugal Robotic Architectures
Semantic Augmentation of Multimodal Data:
- Multimedia Literacy
- Humor Analysis/Synthesis
- CyberCartography (e.g., Multimodal & Multisensory Digital Maps for Cultural Heritage, Creative Industries and Tourism)
AI4Forensics (e.g., Deception Detection)
AI4Supporting Diversity, Equity and Inclusion (e.g., Misogyny Detection)
GenAI4Edu (e.g., Guidelines for employing Generative AI tools in Education)