CUED Publications database

Morph-fitting: Fine-tuning word vector spaces with simple language-specific rules

Vulić, I and Mrkšić, N and Reichart, R and Ó Séaghdha, D and Young, S and Korhonen, A (2017) Morph-fitting: Fine-tuning word vector spaces with simple language-specific rules. In: 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017), 2017-7-30 to 2017-8-4, Vancouver, Canada, pp. 56-68.

Full text not available from this repository.


Morphologically rich languages accentuate two properties of distributional vector space models: 1) the difficulty of inducing accurate representations for low-frequency word forms; and 2) insensitivity to distinct lexical relations that have similar distributional signatures. These effects are detrimental for language understanding systems, which may infer that "inexpensive" is a rephrasing for "expensive" or may not associate "acquire" with "acquires". In this work, we propose a novel morph-fitting procedure which moves past the use of curated semantic lexicons for improving distributional vector spaces. Instead, our method injects morphological constraints generated using simple language-specific rules, pulling inflectional forms of the same word close together and pushing derivational antonyms far apart. In intrinsic evaluation over four languages, we show that our approach: 1) improves low-frequency word estimates; and 2) boosts the semantic quality of the entire word vector collection. Finally, we show that morph-fitted vectors yield large gains in the downstream task of dialogue state tracking, highlighting the importance of morphology for tackling long-tail phenomena in language understanding tasks.
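
The constraint-generation idea in the abstract can be illustrated with a toy sketch. The suffix and prefix lists, the function name generate_constraints, and the example vocabulary are all hypothetical and are not taken from the paper; the sketch only collects word pairs and does not show the fine-tuning step that actually moves the vectors.

```python
# Toy sketch only: these affix lists are illustrative assumptions,
# not the rule set used in the paper.
INFLECTION_SUFFIXES = ("s", "ed", "ing")  # hypothetical inflectional rules
ANTONYM_PREFIXES = ("in", "un", "dis")    # hypothetical derivational rules


def generate_constraints(vocab):
    """Return (attract, repel) sets of word pairs that both occur in vocab."""
    words = set(vocab)
    attract, repel = set(), set()
    for w in words:
        # Inflectional forms of the same word: candidates to pull together.
        for suffix in INFLECTION_SUFFIXES:
            if w + suffix in words:
                attract.add((w, w + suffix))
        # Derivational antonyms: candidates to push apart.
        for prefix in ANTONYM_PREFIXES:
            if prefix + w in words:
                repel.add((w, prefix + w))
    return attract, repel


vocab = ["acquire", "acquires", "expensive", "inexpensive", "table"]
attract, repel = generate_constraints(vocab)
```

With this vocabulary, the only attract pair is the one linking "acquire" and "acquires", and the only repel pair is the one linking "expensive" and "inexpensive", matching the two examples in the abstract. Naive string rules like these over-generate on real vocabularies, so any practical rule set would need per-language care.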

Item Type: Conference or Workshop Item (UNSPECIFIED)
Uncontrolled Keywords: Semantic specialisation, Morphologically complex languages, Vector space models, Dialogue state tracking, Word embeddings
Divisions: Div F > Machine Intelligence
Div F > Computational and Biological Learning
Depositing User: Cron Job
Date Deposited: 17 Jul 2017 19:58
Last Modified: 10 Apr 2021 22:21
DOI: 10.18653/v1/P17-1006