Grounding Language in Descriptions of Scenes (2006)
The problem of how abstract symbols, such as those in natural language, can be grounded in perceptual information presents a significant challenge to several areas of research. This paper presents the GLIDES model, a neural network architecture that shows how this symbol-grounding problem can be solved through learned associations between simple visual scenes and their linguistic descriptions. Unlike previous models of symbol grounding, the model's learning is completely unsupervised, using the principles of self-organization and Hebbian learning, which allows direct visualization of how concepts form and how grounding occurs. Two sets of experiments evaluated the model. In the first, linguistic test stimuli were presented, and the scenes the model generated were evaluated as the grounding of the language. In the second, the model was presented with visual test samples, and its ability to generate language from the grounded representations was assessed. The results demonstrate that symbols can be grounded through associations between perceptual and linguistic representations, and that this grounding can be made transparent. The transparency yields unique insights into symbol grounding, including how many-to-many mappings between symbols and referents can be maintained and how concepts can be formed from co-occurrence relationships.
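The abstract describes the mechanism only at a high level. As a rough illustration of the general approach it names, two self-organizing maps coupled by Hebbian co-occurrence links, the following is a minimal Python sketch; it is not the paper's actual architecture, and all sizes, learning rates, toy data, and names such as train_step, describe, and depict are hypothetical.

    import numpy as np

    # Sketch (assumed, not from the paper): a visual SOM and a linguistic SOM,
    # trained without supervision, linked by Hebbian associative weights that
    # strengthen whenever a scene and a description co-occur.

    rng = np.random.default_rng(0)

    MAP_SIDE = 8                  # each SOM is an 8x8 grid of units (illustrative)
    VIS_DIM, LANG_DIM = 16, 12    # hypothetical feature-vector sizes
    N_UNITS = MAP_SIDE * MAP_SIDE

    vis_som = rng.random((N_UNITS, VIS_DIM))    # visual map weight vectors
    lang_som = rng.random((N_UNITS, LANG_DIM))  # linguistic map weight vectors
    assoc = np.zeros((N_UNITS, N_UNITS))        # Hebbian links: visual unit -> linguistic unit

    coords = np.array([(i, j) for i in range((MAP_SIDE)) for j in range(MAP_SIDE)], dtype=float)

    def winner(som, x):
        """Index of the best-matching unit for input x."""
        return int(np.argmin(np.linalg.norm(som - x, axis=1)))

    def som_update(som, x, lr=0.1, sigma=1.5):
        """Standard SOM update: pull units toward x, weighted by grid distance to the winner."""
        w = winner(som, x)
        d = np.linalg.norm(coords - coords[w], axis=1)
        h = np.exp(-(d ** 2) / (2 * sigma ** 2))   # neighborhood function
        som += lr * h[:, None] * (x - som)
        return w

    def train_step(vis_x, lang_x, hebb_lr=0.05):
        """Unsupervised co-training: adapt both maps, then strengthen the
        Hebbian link between the co-active winners (co-occurrence learning)."""
        wv = som_update(vis_som, vis_x)
        wl = som_update(lang_som, lang_x)
        assoc[wv, wl] += hebb_lr

    def describe(vis_x):
        """Scene -> language: the linguistic unit most strongly associated with the scene."""
        return int(np.argmax(assoc[winner(vis_som, vis_x)]))

    def depict(lang_x):
        """Language -> scene: the visual prototype grounded by the description."""
        wv = int(np.argmax(assoc[:, winner(lang_som, lang_x)]))
        return vis_som[wv]

    # Toy usage: pair each random "scene" with a partially correlated "description".
    for _ in range(500):
        scene = rng.random(VIS_DIM)
        desc = np.concatenate([scene[:LANG_DIM // 2], rng.random(LANG_DIM - LANG_DIM // 2)])
        train_step(scene, desc)

In a sketch like this, the two directions of the paper's experiments correspond to reading the association matrix by row (scene to language) or by column (language to scene), and inspecting assoc directly is one way the grounding can be made transparent, including any many-to-many mappings that develop.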
Citation:
Paul Williams and Risto Miikkulainen. Grounding Language in Descriptions of Scenes. In Proceedings of the 28th Annual Meeting of the Cognitive Science Society, 2006.
Bibtex:
@inproceedings{williams:cogsci06,
  author    = {Paul Williams and Risto Miikkulainen},
  title     = {Grounding Language in Descriptions of Scenes},
  booktitle = {Proceedings of the 28th Annual Meeting of the Cognitive Science Society},
  year      = {2006}
}
Risto Miikkulainen (Faculty), risto [at] cs utexas edu
Paul Williams (Undergraduate Alumni), pwilly [at] cs utexas edu