Handwritten word-spotting using hidden Markov models and universal vocabularies

Research output: Article in indexed journal · Article · Peer-reviewed

148 Citations (Scopus)


Handwritten word-spotting is traditionally viewed as an image matching task between one or multiple query word-images and a set of candidate word-images in a database. This is a typical instance of the query-by-example paradigm. In this article, we introduce a statistical framework for the word-spotting problem which employs hidden Markov models (HMMs) to model keywords and a Gaussian mixture model (GMM) for score normalization. We explore the use of two types of HMMs for the word modeling part: continuous HMMs (C-HMMs) and semi-continuous HMMs (SC-HMMs), i.e. HMMs with a shared set of Gaussians. We show on a challenging multi-writer corpus that the proposed statistical framework is always superior to a traditional matching system which uses dynamic time warping (DTW) for word-image distance computation. A very important finding is that the SC-HMM is superior when labeled training data is scarce—as low as one sample per keyword—thanks to the prior information which can be incorporated in the shared set of Gaussians.
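The scoring idea in the abstract can be illustrated with a minimal sketch. It assumes diagonal-covariance Gaussians, a left-to-right semi-continuous HMM whose states share the Gaussians of a background GMM, and a length-normalized log-likelihood ratio as the normalized score. The abstract does not give the exact score used in the article, so the ratio form, all function names, and the toy parameters below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def log_gaussians(X, means, variances):
    """Log-density of each frame under each diagonal Gaussian: (T, D) -> (T, K)."""
    diff = X[:, None, :] - means[None, :, :]
    return -0.5 * np.sum(diff ** 2 / variances + np.log(2 * np.pi * variances), axis=2)

def gmm_loglik(X, weights, means, variances):
    """Log-likelihood of a frame sequence under the background GMM."""
    per_frame = np.logaddexp.reduce(np.log(weights) + log_gaussians(X, means, variances), axis=1)
    return np.sum(per_frame)

def sc_hmm_loglik(X, state_weights, trans, init, means, variances):
    """Forward algorithm for an HMM whose states mix the SAME shared Gaussians.

    state_weights has shape (S, K): one mixture-weight vector per state.
    """
    with np.errstate(divide="ignore"):  # log(0) = -inf is intended for forbidden moves
        log_c, log_A, log_pi = np.log(state_weights), np.log(trans), np.log(init)
    lg = log_gaussians(X, means, variances)                                    # (T, K)
    log_b = np.logaddexp.reduce(log_c[None, :, :] + lg[:, None, :], axis=2)   # (T, S)
    alpha = log_pi + log_b[0]
    for t in range(1, len(X)):
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_b[t]
    return np.logaddexp.reduce(alpha, axis=0)

def spotting_score(X, state_weights, trans, init, gmm_weights, means, variances):
    """Assumed normalized score: per-frame log-likelihood ratio, keyword HMM vs. GMM."""
    hmm = sc_hmm_loglik(X, state_weights, trans, init, means, variances)
    gmm = gmm_loglik(X, gmm_weights, means, variances)
    return (hmm - gmm) / len(X)

# Toy 1-D example (hypothetical values): two shared Gaussians at 0 and 5.
means = np.array([[0.0], [5.0]])
variances = np.ones((2, 1))
gmm_weights = np.array([0.5, 0.5])
state_weights = np.array([[1.0, 0.0], [0.0, 1.0]])   # state 0 -> Gaussian 0, state 1 -> Gaussian 1
trans = np.array([[0.5, 0.5], [0.0, 1.0]])           # left-to-right topology
init = np.array([1.0, 0.0])

matching = np.array([[0.0], [0.0], [5.0], [5.0]])    # follows the keyword's state order
reversed_seq = matching[::-1]                        # same frames, wrong order

score_match = spotting_score(matching, state_weights, trans, init, gmm_weights, means, variances)
score_rev = spotting_score(reversed_seq, state_weights, trans, init, gmm_weights, means, variances)
```

In this toy setup the background GMM assigns the same likelihood to both sequences because it ignores frame order, so the sign of the score reflects only how well the sequence fits the keyword model. Because the Gaussians are shared, a keyword model is specified by its per-state mixture weights and transitions alone, which is one plausible reading of why such a model can be estimated from very few labeled samples.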
Original language: English
Pages (from-to): 2106-2116
Journal: Pattern Recognition
Publication status: Published - 2012

