CUED Publications database

Scholarly Document Information Extraction using Extensible Features for Efficient Higher Order Semi-CRFs

Cuong, NV and Chandrasekaran, MK and Kan, MY and Lee, WS (2015) Scholarly Document Information Extraction using Extensible Features for Efficient Higher Order Semi-CRFs. In: UNSPECIFIED pp. 61-64.

Full text not available from this repository.

Abstract

© 2015 ACM. We address the tasks of recovering bibliographic and document structure metadata from scholarly documents. We leverage higher order semi-Markov conditional random fields to model long-distance label sequences, improving upon the performance of the linear-chain conditional random field model. We introduce the notion of extensible features, which allows the expensive inference process to be simplified through memoization, resulting in lower computational complexity. Our method significantly improves on the state of the art on three related scholarly document extraction tasks.
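
The abstract does not give the inference procedure, so the following is only an illustrative sketch of the general idea of segment-level decoding with memoized segment scores. It is not the authors' algorithm: it implements first-order (not higher order) semi-Markov Viterbi decoding, and every name in it (`decode`, `toy_segment_score`, `toy_transition_score`, the label set, and the scoring rules) is a hypothetical choice made for this example. Segment scores are cached with `functools.lru_cache`, so each (start, end, label) span is scored once even though it is consulted once per previous label.

```python
from functools import lru_cache


def decode(tokens, labels, segment_score, transition_score, max_len):
    """First-order semi-Markov Viterbi decoding (illustrative sketch).

    Returns (best_score, segments), where each segment is
    (start, end, label) with `end` exclusive.
    """
    n = len(tokens)
    if n == 0:
        return 0.0, []

    @lru_cache(maxsize=None)
    def seg(start, end, label):
        # Memoized: each span/label pair is scored at most once.
        return segment_score(tokens, start, end, label)

    # best[end][label] = (score, start, prev_label) for the best segmentation
    # of tokens[:end] whose last segment is tokens[start:end] with `label`.
    best = [{} for _ in range(n + 1)]
    for end in range(1, n + 1):
        for label in labels:
            options = []
            for start in range(max(0, end - max_len), end):
                if start == 0:
                    options.append((seg(0, end, label), 0, None))
                    continue
                for prev in labels:
                    score = (best[start][prev][0]
                             + transition_score(prev, label)
                             + seg(start, end, label))  # cache hit after first prev
                    options.append((score, start, prev))
            best[end][label] = max(options, key=lambda o: o[0])

    # Follow back-pointers from the best final label.
    label = max(labels, key=lambda l: best[n][l][0])
    total = best[n][label][0]
    segments, end = [], n
    while end > 0:
        _, start, prev = best[end][label]
        segments.append((start, end, label))
        end, label = start, prev
    segments.reverse()
    return total, segments


def toy_segment_score(tokens, start, end, label):
    """Hypothetical hand-written scores; a real model would learn weights."""
    def token_score(tok):
        if label == "date":
            return 2.0 if tok.isdigit() else -2.0
        if label == "author":
            return 1.0 if tok.isalpha() and len(tok) <= 3 else -1.0
        return 1.0 if tok.isalpha() and len(tok) > 3 else -1.0  # "title"
    # The constant per-segment penalty favors fewer, longer segments.
    return sum(token_score(t) for t in tokens[start:end]) - 0.5


def toy_transition_score(prev, label):
    # Discourage two adjacent segments that carry the same label.
    return -1.0 if prev == label else 0.0


tokens = ["Kan", "MY", "2015", "Scholarly", "Document", "Extraction"]
score, segments = decode(
    tokens, ["author", "date", "title"],
    toy_segment_score, toy_transition_score, max_len=3,
)
print(score, segments)
```

In this first-order sketch the cache only removes the repeated lookups across previous labels. In a higher order model the same span would be reached from many more label histories, which is where caching segment-level computations would be expected to matter most.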

Item Type: Conference or Workshop Item (UNSPECIFIED)
Subjects: UNSPECIFIED
Divisions: Div F > Computational and Biological Learning
Depositing User: Cron Job
Date Deposited: 17 Jul 2017 19:37
Last Modified: 14 Sep 2017 01:27