The goal of the Spinoza Prize projects “Understanding Language by Machines” is to develop computer models that assign deeper meaning to language, approximating human understanding, and to use these models to automatically read text and understand language in relation to the world. Current approaches to natural language understanding treat language as a closed world of relations between words. Words and texts, however, are highly ambiguous and vague, while at the same time there is large variation in the way we express similar things. People do not notice this ambiguity and variation when using language within their social communicative context. This project tries to get a better understanding of the scope and complexity of ambiguity and variation, and to model the social communicative contexts in order to resolve them.
In this video, Piek Vossen explains the background, results and cohesion of the five Spinoza projects:
The video explains three key notions of our research on language understanding: identity, reference and perspective.
What are the things in the world, which words and expressions in our language can refer to them, and from what perspective do we choose to refer to them in a particular way? What makes this challenging is the massive ambiguity and variation in language, which not even big data models can handle.
Four different projects have addressed these problems in the last five years:
- Project 1 — Word Sense Disambiguation
- Project 2 — Perception & Description of Images
- Project 3 — Storylines & Perspectives
- Project 4 — Context & Background Knowledge
Recently, a fifth project started in which all aspects come together in the physical embodiment of a robot called Leolani. Understanding the world and being able to talk about it with humans is a real challenge for robots:
- Project 5 — Make Robots talk