CUED Publications database

The infinite HMM for unsupervised PoS tagging

Van Gael, J and Vlachos, A and Ghahramani, Z (2009) The infinite HMM for unsupervised PoS tagging. EMNLP 2009 - Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: A Meeting of SIGDAT, a Special Interest Group of ACL, Held in Conjunction with ACL-IJCNLP 2009. pp. 678-687.

Full text not available from this repository.


We extend previous work on fully unsupervised part-of-speech tagging. Using a non-parametric version of the HMM, called the infinite HMM (iHMM), we address the problem of choosing the number of hidden states in unsupervised Markov models for PoS tagging. We experiment with two non-parametric priors, the Dirichlet and Pitman-Yor processes, on the Wall Street Journal dataset using a parallelized implementation of an iHMM inference algorithm. We evaluate the results with a variety of clustering evaluation metrics and achieve equivalent or better performances than previously reported. Building on this promising result we evaluate the output of the unsupervised PoS tagger as a direct replacement for the output of a fully supervised PoS tagger for the task of shallow parsing and compare the two evaluations. © 2009 ACL and AFNLP.

Item Type: Article
Divisions: Div F > Computational and Biological Learning
Depositing User: Cron Job
Date Deposited: 17 Jul 2017 19:07
Last Modified: 01 Aug 2017 03:04